feat: SNP C NIF refactor and SSL certificate device #659

Open
PeterFarber wants to merge 66 commits into permaweb:edge from PeterFarber:feat/snp-c-nif-and-ssl-cert-device

@PeterFarber

1. SNP NIF refactor (main PR)

Converts the AMD SEV-SNP attestation stack from a Rust NIF to a C NIF and shifts most behavior into Erlang.

  • NIF: native/dev_snp_nif (Rust/Rustler) is removed; a new C NIF in native/snp_nif/snp_nif.c provides the minimal native layer (e.g. report generation). Rust NIF deps (e.g. Cargo.lock, dev_snp_nif crate) are dropped.
  • Erlang: Verification, certificate handling, launch digest, message handling, trust and policy checks, nonce handling, and related logic live in new/updated Erlang modules (snp_verification, snp_generate, snp_certificates, snp_launch_digest*, snp_message, snp_trust, snp_nonce, etc.). dev_snp becomes a thin wrapper that delegates to these modules.
  • Security & correctness: Includes hardening done during the refactor: constant-time nonce comparison, safe hex decoding, no cert caching (avoids ETS poisoning), vcpus range validation, no sensitive data in SNP events, proper KDS/NIF error handling, and use of hb_ao:get for report/message fields.
  • Build/config: OVMF comes from a single path under priv/ovmf and is copied at build time; optional CPU family for VCEK download; rebar and test updates for the new layout (e.g. test/OVMF-1.55.fd → repo root if applicable).

2. Merge: feat/ssl-cert-device (2da3df3)

This branch also includes changes merged from origin/feat/ssl-cert-device into feat/ssl_test (merge commit 2da3df3):

  • SSL certificate device: New dev_ssl_cert.erl module for automated Let's Encrypt certificate management using ACME v2 and DNS-01 challenges. Provides HTTP endpoints for requesting, managing, and renewing SSL certificates; supports staging and production; handles DNS challenge generation and validation (manual DNS TXT record setup).
  • Green zone & HTTP stack: Major updates to dev_green_zone.erl, hb_http_server.erl, and hb_http_client.erl to integrate the SSL cert device and related HTTP/TLS behavior.
  • Config: rebar.config / rebar.lock updated (e.g. ssl_cert dependency); hb_opts.erl and erlang_ls.config adjusted for the new layout.
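
For readers unfamiliar with the manual DNS-01 step mentioned above, here is a minimal sketch of how the TXT record is derived under ACME v2 (RFC 8555, section 8.4). It is not taken from dev_ssl_cert.erl: the module name, function names, and the assumption that the account-key thumbprint arrives already base64url-encoded are all hypothetical.

```erlang
-module(dns01_sketch).
-export([txt_record/3, base64url/1]).

%% Hypothetical helper: unpadded URL-safe base64, as ACME requires.
base64url(Bin) ->
    Enc = base64:encode(Bin),
    %% '=' only ever appears as trailing padding in base64 output.
    NoPad = binary:replace(Enc, <<"=">>, <<>>, [global]),
    << <<(swap(C))>> || <<C>> <= NoPad >>.

swap($+) -> $-;
swap($/) -> $_;
swap(C) -> C.

%% Domain, Token and Thumbprint are binaries. Thumbprint is assumed to be
%% the base64url-encoded JWK thumbprint of the ACME account key.
%% Returns {RecordName, RecordValue} for the TXT record to create.
txt_record(Domain, Token, Thumbprint) ->
    KeyAuth = <<Token/binary, ".", Thumbprint/binary>>,
    Digest = crypto:hash(sha256, KeyAuth),
    {<<"_acme-challenge.", Domain/binary>>, base64url(Digest)}.
```

The operator would publish the returned value as a TXT record under the returned name before asking the ACME server to validate the challenge.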

Files touched in the merge: erlang_ls.config, rebar.config, rebar.lock, src/dev_green_zone.erl, src/dev_ssl_cert.erl (new), src/hb_http_client.erl, src/hb_http_server.erl, src/hb_opts.erl.

PeterFarber and others added 30 commits September 8, 2025 12:57
Add complete SSL certificate management system for HyperBEAM:

* dev_ssl_cert device - HTTP API for certificate lifecycle management
* hb_acme_client - ACME v2 protocol implementation with Let's Encrypt
* hb_ssl_cert_tests - 24 comprehensive tests with structured logging
* DNS-01 challenge support for manual TXT record setup
* Enhanced error reporting with detailed ACME diagnostics
* Works with any DNS provider, staging/production environments
- Replace hb_ao parameter extraction with hb_opts configuration
- Update all API endpoints to use ssl_cert_request_id config
- Add enhanced error reporting and timeout configuration
- Update tests to match new configuration-driven approach
Major refactor improving code organization and maintainability:

SSL Certificate Device:
- Extract monolithic functions into focused helpers
- Leverage ssl_cert library functions for validation/operations
- Add comprehensive documentation and fix pattern matching warnings
- Organize with public API at top, internal helpers at bottom

HTTP Server:
- Reorganize functions by functionality with clear sections
- Add module constants for hardcoded values (ports, timeouts, paths)
- Eliminate duplicate code with shared utility functions
- Add type specifications and comprehensive documentation
- Standardize error handling and improve function naming

Key benefits:
- Better maintainability through focused, single-purpose functions
- Increased code reuse by leveraging existing libraries
- Production-ready code following Erlang best practices
- Remove complex redirect handling logic that was causing failures
- Simplify gun_req function to match old working version
- Remove MaxRedirects and redirects_left tracking
- Add parse_peer function for simpler peer URL parsing
- Use port-based transport detection instead of scheme-based
- Remove handle_redirect function and complex redirect following

This fixes scheduler test failures where redirects were not being
handled correctly.
- Add get_cert/3 and request_cert/3 endpoints to dev_ssl_cert for secure
  certificate sharing between green zone nodes using AES-256-GCM encryption
- Extract encryption/decryption logic into reusable helper functions in
  dev_green_zone (encrypt_data/2, decrypt_data/3)
- Refactor existing green zone code to use centralized crypto helpers
- Update hb_http_server to support configurable HTTPS ports and fix
  protocol field (https -> http2) for proper HTTP version semantics
- Improve certificate file handling with automatic directory creation
- Use modern Erlang 'maybe' expressions for cleaner error handling
- Add comprehensive API documentation and usage examples

Breaking changes:
- start_https_node/4 -> start_https_node/5 (added HttpsPort parameter)
- redirect_to_https/2 -> redirect_to_https/3 (added HttpsPort parameter)
- Certificate files now stored in configurable 'certs' directory
PeterFarber and others added 24 commits December 12, 2025 12:34
Combined test refactors, SNP work, and related changes from ed87363 through 807ed71.

Co-authored-by: Cursor <cursoragent@cursor.com>
- snp_message: stop merging Report into Msg; return 6-tuple with Report so
  verification uses only message (and NodeOpts) for trust/measurement and
  report for policy/measurement (removes trust bypass).
- snp_verification: use decoded Report for verify_debug_disabled and for
  actual measurement in mismatch events; read policy with maps:get; add
  policy_to_integer (int/float/binary); treat missing policy as debug
  enabled; log policy_raw and policy_int in events.
- chore: add snp_short events for verify pipeline (extract_ok, nonce,
  signature, debug_disabled, trusted_software, measurement, report_integrity,
  snp_verify_done) and for message normalization (msg keys, report_not_merged).
snp_message: ?event(snp_temp, {snp_message_normalized, ...}) after normalizing Msg.
snp_verification: ?event(snp_temp, {snp_verify_step, ...}) and snp_verify_done. Grep snp_temp to find; revert to snp_short when done debugging.
snp_verification: policy and measurement from report use hb_ao:get(Key, Map, Default, #{}) so Opts is never undefined (fixes {badmap, undefined} in verify_measurement).
snp_message: validate_report_field and validate_address_field use hb_ao:get(..., undefined, #{}) instead of maps:get for consistency.
…_GUEST_POLICY_DEBUG

snp_constants.hrl: add SNP_GUEST_POLICY_DEBUG mask (1 bsl 19); comment that policy.DEBUG is authoritative, not TCB/SVN, and report must be verified first.
snp_verification: use ?SNP_GUEST_POLICY_DEBUG in verify_debug_disabled and is_debug; doc that we use guest policy only and report is verified in same pipeline.
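
The debug-policy check described in the commits above can be sketched as follows. This is an illustration only, not the code in snp_verification: the module name, the <<"policy">> key, and the choice to treat an unparsable policy as debug-enabled are assumptions. The mask value (1 bsl 19), the int/float/binary coercion, and the rule that a missing policy counts as debug-enabled come from the commit messages.

```erlang
-module(snp_policy_sketch).
-export([policy_to_integer/1, is_debug/1]).

%% Mask from the commit message: DEBUG is bit 19 of the guest policy.
-define(SNP_GUEST_POLICY_DEBUG, (1 bsl 19)).

%% Coerce a policy value that may arrive as integer, float, or binary.
policy_to_integer(P) when is_integer(P) -> {ok, P};
policy_to_integer(P) when is_float(P) -> {ok, trunc(P)};
policy_to_integer(P) when is_binary(P) ->
    try {ok, binary_to_integer(P)}
    catch error:badarg -> error
    end;
policy_to_integer(_) -> error.

%% Fail closed: a missing policy is reported as debug-enabled.
%% Treating an unparsable policy the same way is an assumption.
is_debug(Report) when is_map(Report) ->
    case maps:find(<<"policy">>, Report) of
        error -> true;
        {ok, Raw} ->
            case policy_to_integer(Raw) of
                {ok, Int} -> (Int band ?SNP_GUEST_POLICY_DEBUG) =/= 0;
                error -> true
            end
    end.
```

As the commit notes, a check like this is only meaningful on a report whose signature has already been verified in the same pipeline.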
…loaded when NIF missing

snp_generate: remove get(mock_snp_nif_enabled) and generate_mock_report() fallback on nif_error; always call NIF; on {nif_error,_} return {error, nif_not_loaded}. Remove generate_mock_report/0.
dev_snp_test: generate tests that required mock now skip with {skip, "SNP NIF not loaded"} when generate returns {error, nif_not_loaded}; remove mock_snp_nif/unmock_snp_nif from config-only tests (missing wallet, missing trusted).
…c ETS cache poisoning)

snp_certificates: remove ETS cert chain and VCEK caches; remove clear_cache/0, clear_cert_chain_cache/0, clear_vcek_cache/0 and all cache helpers; fetch_cert_chain/1 and fetch_vcek/6 always perform network requests.
…ers check and fail

snp_util: hex_to_binary/1 returns {ok, binary()} | {error, invalid_hex}; no zero-filled return on invalid/odd-length hex.
snp_launch_digest_sev_hashes: construct_sev_hashes_page_erlang returns {ok, page} | {error, invalid_hex}; update_sev_hashes_table returns {ok, gctx} | {error, invalid_hex}; hash_to_binary/1 helper.
snp_launch_digest_ovmf: case update_sev_hashes_table/construct_sev_hashes_page_erlang and erlang:error(invalid_hex) on error.
snp_launch_digest: initialize_gctx_from_firmware cases on hex_to_binary, errors on invalid_hex.
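
A minimal sketch of the return contract described above for hex_to_binary/1, assuming binary input. It is not the code in snp_util; the module name and the nibble/1 helper are hypothetical. The point is that odd-length or non-hex input yields {error, invalid_hex} rather than a zero-filled binary.

```erlang
-module(hex_sketch).
-export([hex_to_binary/1]).

%% Decode one hex digit; anything else aborts the whole decode.
nibble(C) when C >= $0, C =< $9 -> C - $0;
nibble(C) when C >= $a, C =< $f -> C - $a + 10;
nibble(C) when C >= $A, C =< $F -> C - $A + 10;
nibble(_) -> throw(invalid_hex).

%% Returns {ok, binary()} | {error, invalid_hex}.
%% Odd-length input and non-binary input are rejected up front.
hex_to_binary(Hex) when is_binary(Hex), byte_size(Hex) rem 2 =:= 0 ->
    try
        {ok, << <<(nibble(H) * 16 + nibble(L))>> || <<H, L>> <= Hex >>}
    catch
        throw:invalid_hex -> {error, invalid_hex}
    end;
hex_to_binary(_) ->
    {error, invalid_hex}.
```

Callers then have to match on the tagged tuple, which is what forces the launch-digest modules to check and fail instead of hashing placeholder bytes.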
snp_nonce: report_data_matches/3 uses constant_time_eq/2 (XOR then fold OR) instead of ==; constant_time_eq/2 same size only, no short-circuit.
snp_constants.hrl: add MAX_VCPUS (512).
snp_launch_digest: validate vcpus after extract_launch_digest_params; error({invalid_vcpus, V}) if not integer or not in 1..?MAX_VCPUS.
snp_launch_digest_gctx: same check at start of update_with_vmsa_pages/4 (defense in depth).
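
The two checks in this commit can be sketched as below. This is an illustration under stated assumptions, not the code in snp_nonce or snp_launch_digest: the module name and function layout are hypothetical, while the XOR-then-OR fold, the same-size requirement, the MAX_VCPUS value of 512, and the {invalid_vcpus, V} error term come from the commit message.

```erlang
-module(snp_checks_sketch).
-export([constant_time_eq/2, validate_vcpus/1]).

-define(MAX_VCPUS, 512).

%% Compare two equal-sized binaries without stopping at the first
%% mismatch: XOR each byte pair and OR the results into an accumulator.
%% Binaries of different sizes are never equal.
constant_time_eq(A, B)
  when is_binary(A), is_binary(B), byte_size(A) =:= byte_size(B) ->
    ct_fold(A, B, 0) =:= 0;
constant_time_eq(_, _) ->
    false.

ct_fold(<<X, RestA/binary>>, <<Y, RestB/binary>>, Acc) ->
    ct_fold(RestA, RestB, Acc bor (X bxor Y));
ct_fold(<<>>, <<>>, Acc) ->
    Acc.

%% Accept only integers in 1..?MAX_VCPUS; raise otherwise.
validate_vcpus(V) when is_integer(V), V >= 1, V =< ?MAX_VCPUS ->
    ok;
validate_vcpus(V) ->
    erlang:error({invalid_vcpus, V}).
```

The fold always walks the full length of both inputs, so the amount of work does not depend on where the first differing byte is.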
…tch failure

snp_certificates: fetch_verification_certificates/6 cases on fetch_cert_chain and fetch_vcek; returns {ok, {CertChainPEM, VcekDER}} on success, {error, Reason} when either fetch fails; spec updated.
snp_verification: verify_report_integrity/2 cases on fetch_verification_certificates; on {error, Reason} returns {error, Reason} so verification returns a clean error instead of crashing.
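
The error-propagation shape described here might look like the sketch below. The real fetch_verification_certificates/6 takes different arguments; this hypothetical /2 version takes the two fetch steps as funs so the example stays self-contained, and the module name is also an assumption.

```erlang
-module(fetch_sketch).
-export([fetch_verification_certificates/2]).

%% FetchChain and FetchVcek are zero-arity funs returning
%% {ok, Value} | {error, Reason}. They stand in for the KDS requests.
%% The first failure is returned as-is; success returns both values.
fetch_verification_certificates(FetchChain, FetchVcek) ->
    case FetchChain() of
        {ok, CertChainPEM} ->
            case FetchVcek() of
                {ok, VcekDER} -> {ok, {CertChainPEM, VcekDER}};
                {error, Reason} -> {error, Reason}
            end;
        {error, Reason} ->
            {error, Reason}
    end.
```

A caller such as verify_report_integrity/2 can then match on {error, Reason} and return it, instead of crashing on an unexpected term.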
…t_hook)

rebar.config: post_hook compile copies OVMF-1.55.fd from project root to priv/ovmf/OVMF-1.55.fd.
snp_launch_digest_ovmf: use single path code:priv_dir(hb)/ovmf/OVMF-1.55.fd; file:read_file_info then parse_ovmf_and_update or fallback. No path list, no sanitization (path is build-time fixed).
snp_verification, snp_message, snp_generate, snp_trust: removed ?event calls that logged full messages, report, hashes, measurement hex, nonce, address, signers, local-hashes, trusted config, etc. Kept non-sensitive events (success/failure, sizes only where useful).
snp_util: hex_to_binary_invalid_input logs hex_size only, not raw input.
Align with snp_launch_digest_ovmf: read OVMF from
code:priv_dir(hb)/ovmf/OVMF-1.55.fd (build-time copy) with fallback
to repo root OVMF-1.55.fd for dev. Remove test/ and hardcoded
/root paths.

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

3 participants