Add support for compute runtime quote policy#6474
Add support for compute runtime quote policy#6474martintomazic wants to merge 9 commits intomasterfrom
Conversation
✅ Deploy Preview for oasisprotocol-oasis-core canceled.
|
d0c991e to
859e7a8
Compare
| - .buildkite/scripts/test_e2e.sh --timeout 20m | ||
| --scenario e2e/runtime/runtime-encryption | ||
| --scenario e2e/runtime/compute-policy | ||
|
|
There was a problem hiding this comment.
e2e test with mocked TEE was instrumental to realizing that without compute_policy on the rust side of the protocol, RHP would fail due to the missing field.
To test that compute policy is applied e2e you can disable pcs in the quote policy (fixture) and run:
export OASIS_UNSAFE_ALLOW_DEBUG_ENCLAVES=1
export OASIS_UNSAFE_MOCK_TEE=1
export OASIS_TEE_HARDWARE=intel-sgx
make
.buildkite/scripts/test_e2e.sh --scenario e2e/runtime/compute-policy
The test would gets stuck (as expected).
Full SGX test would be desired here (easy). It would be also nice to add another one, where we e.g. set MinTCBEvaluationDataNumber and verify via the control status that the registration failed... Still not optimal as technically we would catch node local attestation to halt early and not consensus rejecting byzantine compute.
Testing that rofl nodes pass on the lower TCB but compute node is rejected would be even harder as we would need two different SGX environments to make it work.
What do you think?
There was a problem hiding this comment.
Yes, it would be great if we can test it in real SGX.
For the two different environments, let's open a separate issue to test that E2E.
There was a problem hiding this comment.
Yes, it would be great if we can test it in real SGX.
Done 9675e14 :).
I had to comment UNSAFE_SKIP_AVR_VERIFY which caused most of the policy verification to be ignored (see) and set a more relaxed verification because our runner has outdated TCB.
My suggestion is to add another env variable: OASIS_UNSAFE_SKIP_LOCAL_VERIFY, which skips local verification prior to submitting attestation via transaction. Then I can verify from the node control status consensus rejected attestation for the right reason. The test is currently stopped at the local verification which is not ideal given this is critical code path that we want to have tested.
{"caller":"service.go:122","err":"quote verification failed (fresh bundle): pcs/quote: failed to verify TCB bundle: pcs/tcb: failed to verify TCB info: pcs/tcb: invalid TCB info: pcs/tcb: FMSPC is not whitelisted","level":"warn","module":"common/sgx/pcs/cqs","msg":"error verifying downloaded TCB refresh","tcb_evaluation_data_number":21,"ts":"2026-03-30T14:04:50.6645918Z"}Optional:
- I have happy and unhappy path. If we whitelist
InvalidFMSCPunder the default policy and the one the current runner is using under compute policy, this will also verify FMSCP whitelist e2e. - We can also play with
MinTCBEvaluationDataNumber, possibly downgrade it to make current runner pass withoutLAX_VERIFY.
There was a problem hiding this comment.
OASIS_UNSAFE_SKIP_LOCAL_VERIFY: technically there is nothing unsafe about it as this is local sanity check prior to trying to register.
There was a problem hiding this comment.
My suggestion is to add another env variable: OASIS_UNSAFE_SKIP_LOCAL_VERIFY, which skips local verification prior to submitting attestation via transaction.
We also have RHP, which currently passes with empty default policy (used there) and relaxed TCB level. So this flag will not be that useful for other tests in the future, unless we rely on the default/compute policy.
In practice this variable should gate:
oasis-core/go/common/sgx/pcs/service.go
Line 118 in 59b8f32
oasis-core/runtime/src/identity.rs
Line 273 in 59b8f32
It cannot live inside those functions as it should only apply for local verification, but should NOT pass remote verification, unless specified by other environment variables. The easiest way would be to add another parameter to both verify functions specifying if verification should be ignored, in addition to internal global state flags (e.g. see) that currently define whether verification should be ignored/relaxed regardless whether this is local or remote/consensus verification.
There was a problem hiding this comment.
I don't like this additional variable as it seems it would introduce additional complexity around checks. I would skip it for now.
There was a problem hiding this comment.
Removed this for now as I agree this adds too much complexity.
One option would be to only use default policy during local validation, like is currently done for RHP, as consensus would catch wrong attestation anyways. This would simplify our code further (removing the need for ff833fc) and enable us to keep previous unhappy test. On the other hand this may cause honest nodes to post invalid registration transactions. Also not most robust/maintainable in the long run but it achieves the goal.
If anyone sees a clean way to test unhappy path is rejected, I am willing to implement it here.
859e7a8 to
b795a2e
Compare
| /// The compute runtime quote policy. | ||
| #[cbor(optional)] | ||
| compute_policy: sgx::QuotePolicy, |
There was a problem hiding this comment.
This field is ignored for now and only here to pass e2e test.
In practice we probably need new RHP method (similar to node identity), to signal whether compute policy should be respected or not. Technically not breaking?
The runtime should sign with its RAK whether the intention of this enclave is to be used as the one registering on the consensus (having access to keymanager) or is the intention to be used in the less constrained environments (default policy only). Similar solution to what is done for the node_id, i.e. make it part of the hash structure and sign with RAK and validate signature on the consensus side.
Do we really need this / what are possible attack vectors without it?
There was a problem hiding this comment.
I would avoid this extra verification for now as I don't see what attack this would prevent. It is other runtimes and the consensus layer that are enforcing this policy.
There was a problem hiding this comment.
Agree as long as we do 2 side verification on both enclave_rpc/consensus side (confirmed myself again) exploiting this is not possible. Moreover, currently the keymanager policy must be respected on both sides when interacting with it.
If however, at an point we change keymanager-RONL interaction, so that RONL must only satisfy its policy when querying keymanager for its secrets, we should be careful, not to accidentally use the default one, especially if we introduce remote RONLs. The problem there would be if the ROFL would want stricter policy then the default RONL, but again this would be mitigated by the enclave rpc using compute policy.
Having ROFL requiring stricter policy than the keymanager/compute nodes wouldn't make sense given it depends on them directly, so ignoring this corner case.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #6474 +/- ##
==========================================
- Coverage 64.65% 63.22% -1.44%
==========================================
Files 698 698
Lines 68233 68294 +61
==========================================
- Hits 44118 43179 -939
- Misses 19105 20141 +1036
+ Partials 5010 4974 -36 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| /// The compute runtime quote policy. | ||
| #[cbor(optional)] | ||
| compute_policy: sgx::QuotePolicy, |
There was a problem hiding this comment.
I would avoid this extra verification for now as I don't see what attack this would prevent. It is other runtimes and the consensus layer that are enforcing this policy.
d217ddf to
9675e14
Compare
| - .buildkite/scripts/test_e2e.sh --timeout 20m | ||
| --scenario e2e/runtime/runtime-encryption | ||
| --scenario e2e/runtime/compute-policy | ||
|
|
There was a problem hiding this comment.
I don't like this additional variable as it seems it would introduce additional complexity around checks. I would skip it for now.
b37e6f3 to
7466948
Compare
a1fa33f to
b2b7a01
Compare
7466948 to
6f3cdaf
Compare
6f3cdaf to
cd10429
Compare
Replaces #6471 closes #6387.