feat: avoid storing whole decapsulation key for mlkem #48

Open
mmalenic wants to merge 1 commit into mkj:main from mmalenic:feat/mlkem-update

Conversation

@mmalenic
Contributor

Related, and closes brainstorm/ssh-stamp#34

This PR updates mlkem to version 0.3.2 and refactors KexMlkemX25519 to store a 64-byte seed rather than the full decapsulation key.

Motivation

The primary benefit is that the KexMlkemX25519 struct holds far less data on the stack, which is important for embedded targets. The change reduces the in-memory size of that struct's key material from about 3.2KB to only 64 bytes.

I had originally intended to submit this with mlkem 0.3.0-rc2, but it looks like there is now a proper non-rc version available. The new mlkem version allows making this change, as it has functions that avoid having to hold the whole decapsulation key.
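
To illustrate the trade-off, here is a minimal std-only sketch; it is not the code in this PR. `ExpandedKey`, `expand`, `KexStoredKey` and `KexStoredSeed` are hypothetical stand-ins, the 3200-byte length only mirrors the "about 3.2KB" figure above, and `expand` is a placeholder rather than real ML-KEM key derivation.

```rust
use core::mem::size_of;

/// Illustrative sizes only: 3200 approximates the "about 3.2KB" quoted above.
const EXPANDED_KEY_LEN: usize = 3200;
const SEED_LEN: usize = 64;

/// Hypothetical stand-in for an expanded decapsulation key.
pub struct ExpandedKey([u8; EXPANDED_KEY_LEN]);

impl ExpandedKey {
    pub fn as_bytes(&self) -> &[u8; EXPANDED_KEY_LEN] {
        &self.0
    }
}

/// Placeholder for key expansion. This is NOT cryptography; it only
/// models a deterministic seed -> key mapping.
fn expand(seed: &[u8; SEED_LEN]) -> ExpandedKey {
    let mut out = [0u8; EXPANDED_KEY_LEN];
    for (i, b) in out.iter_mut().enumerate() {
        *b = seed[i % SEED_LEN] ^ (i as u8);
    }
    ExpandedKey(out)
}

/// Old shape: the expanded key lives in the struct for the whole exchange.
pub struct KexStoredKey {
    key: ExpandedKey,
}

impl KexStoredKey {
    pub fn new(seed: &[u8; SEED_LEN]) -> Self {
        Self { key: expand(seed) }
    }

    pub fn decapsulation_key(&self) -> &ExpandedKey {
        &self.key
    }
}

/// New shape: only the seed is stored; the key is re-derived on request.
pub struct KexStoredSeed {
    seed: [u8; SEED_LEN],
}

impl KexStoredSeed {
    pub fn new(seed: [u8; SEED_LEN]) -> Self {
        Self { seed }
    }

    /// Re-computes the expanded key on every call, trading CPU time for memory.
    pub fn decapsulation_key(&self) -> ExpandedKey {
        expand(&self.seed)
    }
}
```

In this sketch `size_of::<KexStoredSeed>()` is 64 bytes while `size_of::<KexStoredKey>()` is 3200 bytes. The expanded key in the seed-only variant exists only as a temporary for the duration of the call that needs it.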

Performance

The downside of this approach is that it has to re-compute the key every time it is requested.

I think this has a minimal impact on performance though, which is why I have also enabled the mlkem feature by default (let me know if this is not what you'd like). When I tested this in a benchmark, the client-side cost was about 1.5-2x higher; however, in the context of a session this cost is paid only once, and the scale is very small (nanoseconds). I believe this code path isn't executed on the server side.

In general on the server-side, the difference between having mlkem enabled and not is negligible. The majority of the time spent in ssh-stamp is on the wifi layer. Having mlkem on was nearly identical to having it off, with less than a few ms variation.

Given the benefit of enabling it, I think it's worthwhile. I can also submit some benchmarks for this if there is interest in this repo? While the ssh-stamp benchmarks are specific to that application, I do have a few function-only benchmarks for the mlkem code path.

Implementation

I've also replaced &mut rand_core::OsRng with random::fill_random(&mut m). I think this is the way to go based on the rest of the repo.

Would love to hear your thoughts!

/cc @brainstorm @jubeormk1 @Autofix

@mmalenic mmalenic force-pushed the feat/mlkem-update branch from f92de28 to 68feb8b Compare May 11, 2026 11:29
@mkj
Owner

mkj commented May 11, 2026

Thanks Marko, I'll give it a look/test when I get a chance. Out of curiosity, do you have timings for mlkem on the ssh-stamp target, which ESP?

@mmalenic
Contributor Author

I was experimenting with probe-rs to try and measure the execution time in a few locations in ssh-stamp. The boot time was about 2.3 seconds on my esp32c6. I also measured the time from the TCP connection being created, to the ServEvent::FirstAuth section, which should include the mlkem code path. That took about 300ms on esp32c6, with the mlkem version of the code being about 40ms slower.

I haven't benchmarked this very thoroughly yet though! I'll create some benchmarks that can be run and tested by others, and update here.

@mkj
Owner

mkj commented May 12, 2026

Thanks. No need to measure much, I was just curious. Will have a go on some boards here.

Successfully merging this pull request may close these issues.

Add mlkem768x25519 (to sunset?) and integrate into SSH-Stamp
