feat: avoid storing whole decapsulation key for mlkem #48
Thanks Marko, I'll give it a look/test when I get a chance. Out of curiosity, do you have timings for mlkem on the ssh-stamp target, and on which ESP?
I was experimenting with probe-rs to try to measure the execution time at a few locations in ssh-stamp. The boot time was about 2.3 seconds on my esp32c6. I also measured the time from the TCP connection being created, to the [...] I haven't benchmarked this very thoroughly yet though! I'll create some benchmarks that can be run and tested by others, and update here.
Thanks. No need to measure much, I was just curious. Will have a go on some boards here.
Related, and closes brainstorm/ssh-stamp#34
This PR updates mlkem to version 0.3.2 and refactors `KexMlkemX25519` to store a 64-byte seed, rather than storing the full decapsulation key.
Motivation
The primary benefit is that the `KexMlkemX25519` struct holds a much smaller amount of stack data, which is important for embedded targets. The change reduces the in-memory size of that struct from about 3.2KB to only 64 bytes.
I had originally intended to submit this with mlkem 0.3.0-rc2, but it looks like there is now a proper non-rc version available. The new mlkem version allows making this change, as it has functions that avoid having to hold the whole decapsulation key.
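For a sense of scale, here is a minimal sketch of the two layouts. The struct names `KexWithFullKey` and `KexWithSeed` are hypothetical, and the 3200-byte array only approximates the ~3.2KB figure quoted above. Neither is the real `KexMlkemX25519` definition.

```rust
use std::mem::size_of;

/// Approximate in-memory size of an expanded decapsulation key.
/// Assumption: rounded from the ~3.2KB figure in the PR description.
const EXPANDED_KEY_BYTES: usize = 3200;

/// Size of the seed that is kept instead.
const SEED_BYTES: usize = 64;

/// Hypothetical "before" layout: the whole expanded key lives in the struct.
pub struct KexWithFullKey {
    pub decap_key: [u8; EXPANDED_KEY_BYTES],
}

/// Hypothetical "after" layout: only the seed is retained.
pub struct KexWithSeed {
    pub seed: [u8; SEED_BYTES],
}

/// Bytes no longer held while the struct is alive.
pub fn bytes_saved() -> usize {
    size_of::<KexWithFullKey>() - size_of::<KexWithSeed>()
}
```

With these placeholder sizes, `bytes_saved()` comes to 3136 bytes per live key-exchange state, which is a meaningful share of the stack on a small target.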
Performance
The downside of this approach is that it has to re-compute the key every time it's requested.
I think this has a minimal impact on performance though, which is why I have also made the mlkem feature enabled by default (let me know if this is not what you'd like). When I tested this in a benchmark, the client-side path was about 1.5-2x slower. However, in the context of a session this cost is only paid once, and the scale is very small (nanoseconds). On the server side, I believe this code path isn't executed.
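The trade-off above can be sketched roughly as follows. This is a toy under stated assumptions: `expand_seed` is a hypothetical, non-cryptographic stand-in for the real seed-to-key expansion in mlkem, and `SeedOnlyKex` is an illustrative name, not the actual struct. The point is only the shape: keep the seed, rebuild the large key for the duration of one operation, then let it drop.

```rust
const SEED_LEN: usize = 64;
const EXPANDED_LEN: usize = 256; // placeholder size, far smaller than a real key

/// Toy deterministic expansion. NOT ML-KEM and NOT cryptographic;
/// it only stands in for "derive the big key from the small seed".
fn expand_seed(seed: &[u8; SEED_LEN]) -> [u8; EXPANDED_LEN] {
    let mut out = [0u8; EXPANDED_LEN];
    let mut state: u32 = 0x9E37_79B9;
    for (i, byte) in out.iter_mut().enumerate() {
        state = state
            .wrapping_mul(1_664_525)
            .wrapping_add(1_013_904_223)
            .wrapping_add(u32::from(seed[i % SEED_LEN]));
        *byte = (state >> 24) as u8;
    }
    out
}

/// Hypothetical key-exchange state that retains only the seed.
pub struct SeedOnlyKex {
    seed: [u8; SEED_LEN],
}

impl SeedOnlyKex {
    pub fn new(seed: [u8; SEED_LEN]) -> Self {
        Self { seed }
    }

    /// Re-derives the expanded key, lends it to `f`, and drops it on return.
    /// The derivation cost is paid on every call.
    pub fn with_expanded_key<R, F>(&self, f: F) -> R
    where
        F: FnOnce(&[u8; EXPANDED_LEN]) -> R,
    {
        let key = expand_seed(&self.seed);
        f(&key)
    }
}
```

Because the expansion is deterministic, every call yields the same key for the same seed, so nothing is lost by not caching it. The only cost is the repeated derivation.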
In general on the server-side, the difference between having mlkem enabled and not is negligible. The majority of the time spent in ssh-stamp is on the wifi layer. Having mlkem on was nearly identical to having it off, with less than a few ms variation.
Given the benefit of enabling it, I think it's worthwhile. I can also submit some benchmarks for this if there is interest in this repo? While the ssh-stamp benchmarks are specific to that application, I do have a few function-only benchmarks for the mlkem code path.
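As a rough idea of what a function-only benchmark could look like on a host machine, here is a minimal timing helper using only the standard library. The name `average_time` is hypothetical and this is not the benchmark code mentioned above. A proper benchmark harness would give more trustworthy numbers.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `op` `iterations` times and returns the mean wall-clock time per call.
/// `black_box` discourages the optimizer from discarding the result.
pub fn average_time<T, F>(iterations: u32, mut op: F) -> Duration
where
    F: FnMut() -> T,
{
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(op());
    }
    start.elapsed() / iterations
}
```

Timing the seed-based path and the full-key path with the same helper and iteration count would give a like-for-like ratio, which is the number that matters for the 1.5-2x comparison.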
Implementation
I've also replaced `&mut rand_core::OsRng` with `random::fill_random(&mut m)`. I think this is the way to go based on the rest of the repo.
Would love to hear your thoughts!
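For readers unfamiliar with the call shape of the `fill_random` change, a minimal sketch follows. Everything here is an assumption for illustration: the `random` module below is a toy stand-in with a guessed signature, its output is NOT random, and the 32-byte size of `m` is assumed. The real helper lives in the repo and is backed by a proper entropy source.

```rust
mod random {
    use std::sync::atomic::{AtomicU64, Ordering};

    static COUNTER: AtomicU64 = AtomicU64::new(1);

    /// Toy stand-in for the repo's helper. NOT random and NOT secure;
    /// it only mimics "fill this buffer with fresh bytes".
    pub fn fill_random(buf: &mut [u8]) -> Result<(), ()> {
        for chunk in buf.chunks_mut(8) {
            let n = COUNTER.fetch_add(1, Ordering::Relaxed);
            let mixed = n.wrapping_mul(0x9E37_79B9_7F4A_7C15).to_le_bytes();
            chunk.copy_from_slice(&mixed[..chunk.len()]);
        }
        Ok(())
    }
}

/// Fills a caller-owned buffer instead of threading an RNG handle
/// through the call. Hypothetical function name.
pub fn fresh_randomness() -> Result<[u8; 32], ()> {
    let mut m = [0u8; 32];
    random::fill_random(&mut m)?;
    Ok(m)
}
```

The design point is that the caller ends up with plain bytes, so the downstream key-exchange code no longer needs to be generic over, or hold a reference to, an RNG type.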
/cc @brainstorm @jubeormk1 @Autofix