# ML-DSA signing scheme for TUF metadata #195

* TAP: 21
* Title: ML-DSA signing scheme for TUF metadata
* Last-Modified: 2026-04-30
* Author: Fredrik Skogman
* Status: Draft
* Content-Type: text/markdown
* Created: 2026-04-29

# Abstract

This TAP proposes an application-level pre-hashing scheme to use with
ML-DSA that minimizes the data sent to the signing device. Instead of
passing the full canonicalized metadata, the application computes a
cryptographic hash over the metadata and constructs a pre-signing byte
string (including a domain separator and protocol version). Only the
pre-signing byte string is sent to the signing device. This approach
keeps the HSM interface simple and bounded while preserving the
security properties required by FIPS 204.

# Motivation

TUF metadata can be large, particularly targets metadata in
repositories with many artifacts. ML-DSA pure mode signing requires
the entire message to be available to the signing device. When private
keys are held in hardware security modules (HSMs), the HSM must
receive the full message to produce a pure mode signature.
Transmitting large metadata payloads to an HSM introduces practical
limitations on message size and potential interface constraints that
make pure mode ML-DSA unsuitable for some TUF deployments.

# Specification

Conventions used:

* `0x__`: a raw byte value, specified as a hexadecimal number
* `||`: byte concatenation

The raw TUF metadata is NEVER signed directly; instead, a pre-signing
byte string is created with the following format, offering domain
separation and versioning at the protocol level:

```
0x74 || 0x75 || 0x66 || <version> || H(MSG)
```

The domain separators are the ASCII codes for `tuf`.

The version must be a single byte. `0x01` is currently the only
defined version.

The hash function and the canonicalization scheme for the message are
specified by the version.

Pure ML-DSA MUST be used with an **empty context**.

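As an illustration, the version-1 pre-signing construction could be
sketched as follows (a minimal Python example using the standard
library's `hashlib`; the function name is illustrative and not part of
this TAP):

```python
import hashlib

DOMAIN = b"tuf"  # 0x74 || 0x75 || 0x66, the ASCII codes for "tuf"

def pre_signing_bytes(canonical_metadata: bytes, version: int = 1) -> bytes:
    """Build domain || version || H(MSG) for the given protocol version."""
    if version != 1:
        # 0x01 is currently the only defined version; reject anything else.
        raise ValueError(f"unsupported protocol version: {version}")
    digest = hashlib.sha512(canonical_metadata).digest()  # v1: SHA-512
    return DOMAIN + bytes([version]) + digest
```

For version 1 the result is always 68 bytes (3 domain bytes, 1 version
byte, 64 digest bytes), regardless of metadata size; this is the only
data that needs to reach the signing device.
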
## TUF metadata parameters

* `keytype`: `ml-dsa`
* `scheme`: `ml-dsa-<parameter set>/<version>`, where the version is
  encoded as a decimal number without leading zeros
  * `ml-dsa-44/<version>`
  * `ml-dsa-65/<version>`
  * `ml-dsa-87/<version>`
* `keyval.public`: PEM encoding of the DER-encoded `SubjectPublicKeyInfo`
  structure as defined for ML-DSA in RFC 9881
* `signature.sig`: hex-encoded signature byte string as per FIPS 204
  §7.2

> [!NOTE]
> As of this publication only version 1 (`0x01`) is specified. Any
> other version must be rejected during signing or verification.

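Parsing the `scheme` string into its two components, and rejecting
unknown versions as the note requires, might look like this (the
helper name is illustrative):

```python
VALID_PARAMETER_SETS = ("ml-dsa-44", "ml-dsa-65", "ml-dsa-87")

def parse_scheme(scheme: str) -> tuple[str, int]:
    """Split a scheme such as 'ml-dsa-65/1' into (parameter set, version)."""
    param_set, sep, ver = scheme.rpartition("/")
    if not sep or param_set not in VALID_PARAMETER_SETS:
        raise ValueError(f"unknown scheme: {scheme}")
    # The version is a decimal number without leading zeros, per this TAP.
    if not ver.isdigit() or (len(ver) > 1 and ver[0] == "0"):
        raise ValueError(f"malformed version: {ver}")
    return param_set, int(ver)
```
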
## Rationale

Why use the version, rather than the `scheme`, to specify the hash
algorithm? The version specifies the entire set of choices and
cryptographically binds those choices to the signature, which reduces
the risk of misuse. It also makes future updates easier, since the
versioning scheme is all-or-nothing. Splitting details between the
`scheme` and a separate versioning mechanism would open the door to
confusion.

Why not use HashML-DSA? With Ed25519, ecosystem support has been much
better for the pure version, and the same is likely to hold for
ML-DSA. An application-specific protocol is therefore better suited
for wide adoption. Pre-hash algorithms are not really needed either,
and they can add complexity; see [HashML-DSA considered
harmful](https://keymaterial.net/2024/11/05/hashml-dsa-considered-harmful/).

Certain implementations, such as [OpenSSL
4.0](https://openssl-library.org/post/2026-04-14-openssl-40-final-release/),
expose an API where μ is passed directly to the sign interface.
However, such APIs are not guaranteed to be available in every
ecosystem, nor can we rely on each cryptographic provider separating
the μ computation into a different cryptographic module so that large
payloads need not be transmitted to the signing device.

## Protocol versions

To allow future changes of the hash algorithm, for example to mitigate
a collision or preimage attack, the hash algorithm is selected via a
protocol version. This provides a layer of indirection where certain
details can change over time without encoding too much information
into the `scheme` parameter.

### v1

* Version byte: `0x01`
* Hash algorithm: SHA-512
* Implementations MUST support
  * `ML-DSA-44` (`scheme: ml-dsa-44/1`)
  * `ML-DSA-65` (`scheme: ml-dsa-65/1`)
  * `ML-DSA-87` (`scheme: ml-dsa-87/1`)
* Metadata canonicalization scheme: encoded as "canonical JSON" as
  described in the [TUF
  Specification](https://theupdateframework.github.io/specification/v1.0.34/index.html#metaformat).

## Signature generation

1. Load the public key from TUF metadata
2. Parse the version from the public key's `scheme` and prepare the
   hash function `H`
3. Compute the canonicalized metadata representation `MSG`
4. Create the pre-signing byte string:
   ```
   0x74 || 0x75 || 0x66 || version || H(MSG)
   ```
5. Sign the pre-signing byte string using an empty context

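The signing steps above can be sketched end to end. In this sketch,
`hsm_sign` is a hypothetical callable standing in for whatever wrapper
a deployment uses around the device's pure ML-DSA sign operation (with
an empty context); it is not an API defined by this TAP:

```python
import hashlib

def generate_signature(canonical_metadata: bytes, scheme: str, hsm_sign) -> bytes:
    """Steps 2-5: derive the version from `scheme`, hash the metadata,
    build the pre-signing byte string, and hand only that to the signer."""
    version = int(scheme.rsplit("/", 1)[1])  # step 2: version from scheme
    if version != 1:
        raise ValueError(f"unsupported protocol version: {version}")
    digest = hashlib.sha512(canonical_metadata).digest()  # hash MSG per v1
    pre_signing = b"tuf" + bytes([version]) + digest      # step 4
    # Step 5: only 68 bytes cross the device boundary, not the metadata.
    return hsm_sign(pre_signing)
```
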
## Verification steps

1. Compute the canonical metadata representation
2. Load the public key for verification
3. Parse `scheme` into parameter set and version
   * Reject if the protocol version is not supported
   * Implementations MUST NOT infer or select an ML-DSA parameter set
     or version from the signature bytes alone; the underlying crypto
     implementation should reject mismatched signature/public key
     combinations
4. The verifier must reconstruct the exact signed bytes itself
   * It must not blindly accept a caller-supplied digest or prefix
   * The `version` MUST be taken from the trusted TUF metadata's
     `scheme` parameter
   * The version determines the hash function to use
   * It should compute `digest = H(MSG)`
   * Then verify over `0x74 || 0x75 || 0x66 || version || digest`
     using an empty context
5. Reject unknown or mismatched versions
   * Do not fall back
   * Do not try multiple interpretations
   * Do not accept the same signature under HashML-DSA or another scheme

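A hedged sketch of the verification flow, under the same assumptions
as above (`mldsa_verify` is a placeholder for the underlying library's
pure ML-DSA verify with an empty context, not a real API):

```python
import hashlib

def verify_signature(canonical_metadata: bytes, scheme: str,
                     signature: bytes, public_key, mldsa_verify) -> bool:
    """Reconstruct the exact signed bytes from trusted metadata; never
    accept a caller-supplied digest or prefix."""
    param_set, _, ver = scheme.rpartition("/")
    if ver != "1":
        return False  # unknown version: reject, no fallback
    digest = hashlib.sha512(canonical_metadata).digest()
    signed_bytes = b"tuf" + b"\x01" + digest  # same construction as signing
    return mldsa_verify(public_key, signed_bytes, signature)
```
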
# Security considerations

1. SHA-512 length extension: not a concern. The signature is made over
   `domain || version || digest`; length extension is mainly a concern
   when a digest is used as a MAC
2. SHA-512 vs. ML-DSA-87 security margin: ML-DSA-87 requires a digest
   of at least 2λ = 512 bits, and SHA-512 provides exactly that, so
   the margin is zero but the choice is valid. Future versions can
   increase the margin if deemed necessary
3. Signature confusion/replay: with the domain separation, a valid
   signature over the raw digest from another domain would not be
   valid in the TUF metadata domain (collisions on the domain and
   version bytes are _very_ unlikely)

# Appendix

## Notes on application level hashing

From FIPS 204 on application level hashing (§5.4):

> In order to maintain the same level of security strength when the
> content is hashed at the application level or using HashML-DSA, the
> digest that is signed needs to be generated using an approved hash
> function or XOF (e.g., from FIPS 180 [8] or FIPS 202 [7]) that
> provides at least 𝜆 bits of classical security strength against both
> collision and second preimage attacks [7, Table 4]<sup>6</sup>.
>
> The verification of a signature that is created in this way will
> require the verify function to generate a digest from the message in
> the same way to be used as input for the verification function.
>
> 6. Obtaining at least 𝜆 bits of classical security strength against
> collision attacks requires that the digest to be signed be at least 2𝜆
> bits in length.

# References

* [FIPS-204](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.204.pdf)
* [TUF Specification](https://theupdateframework.github.io/specification/v1.0.34/index.html)
* [RFC 9881](https://datatracker.ietf.org/doc/html/rfc9881)