4 changes: 3 additions & 1 deletion CLAUDE.md
@@ -179,7 +179,9 @@ The report at `reports/snowflake-platform-assessment/` is a set of linked static
→ [docs/analysis/firmware-landscape-2026/README.md](docs/analysis/firmware-landscape-2026/README.md) — Hydroph0bia, LogoFAIL successors, UEFI cert expiry
→ [docs/analysis/apple-mie-impact.md](docs/analysis/apple-mie-impact.md) — Apple Memory Integrity Enforcement
→ [docs/analysis/vishing-2026-market.md](docs/analysis/vishing-2026-market.md) — deepfake vishing economics + healthcare targeting
→ [docs/analysis/snowflake-platform-attack-surface-2026.md](docs/analysis/snowflake-platform-attack-surface-2026.md) — CVE inventory, UNC5537 analysis, Cortex AI/Native Apps/SPCS attack surface, chains A–M (incl. Polaris/Iceberg K, OAuth scope drift L, UDF EAI breakout M), Trail vs ACCOUNT_USAGE field mapping
→ [docs/analysis/snowflake-platform-attack-surface-2026.md](docs/analysis/snowflake-platform-attack-surface-2026.md) — CVE inventory, UNC5537 analysis, Cortex AI/Native Apps/SPCS attack surface, chains A–M (incl. Polaris/Iceberg K, OAuth scope drift L, UDF EAI breakout M), Trail vs ACCOUNT_USAGE field mapping; chains carry maturity badges (EMPIRICAL / MODELED / HYPOTHESIS)
→ [docs/analysis/chain-reference-table.md](docs/analysis/chain-reference-table.md) — Canonical cross-reference: chain ↔ tool ↔ Sigma rule ID ↔ CVE ↔ PHI impact ↔ maturity
→ [docs/analysis/snowflake-cve-applicability-matrix-2026.md](docs/analysis/snowflake-cve-applicability-matrix-2026.md) — Per-CVE applicability: affected versions, required log level, dependent detection rules
→ [docs/analysis/snowflake-healthcare-overlay-2026.md](docs/analysis/snowflake-healthcare-overlay-2026.md) — Per-chain PHI exposure map + HIPAA control mapping + BAA considerations + OCR retention sufficiency
→ [docs/analysis/databricks-vs-snowflake-platform-comparison.md](docs/analysis/databricks-vs-snowflake-platform-comparison.md) — Cross-platform primitive map + chain mapping; detection-reuse notes for defenders covering both platforms
→ [detection/snowflake/README.md](detection/snowflake/README.md) — Cross-chain Sigma/KQL/SPL index, streaming ingest pattern, connector-debug-log secret-leak detector
20 changes: 20 additions & 0 deletions detection/snowflake/README.md
@@ -8,6 +8,26 @@ Rules live next to the offensive PoCs they pair with (per the repo's
detection-pairing convention). This file is the cross-cutting view —
useful when building a SIEM rule set rather than evaluating one tool.

## Deployment readiness

Every Sigma rule in this pack carries a `maturity:` field naming what a
customer needs in place before the rule will fire. Honest accounting:

| Tag | Count | What it means for deployment |
|-----|------:|------------------------------|
| `production_ready` | 4 | Fires on raw audit / log surfaces a customer already ingests. No enrichment, correlation, or sidecar required. Drop in. |
| `requires_enrichment` | 19 | Fires only when a SIEM-side enrichment pipeline computes the derived fields listed under each rule's `enrichment.required`. See [`ENRICHMENT.md`](ENRICHMENT.md) for the full field contract; templates under [`enrichment-templates/`](enrichment-templates/). |
| `requires_correlation` | 4 | Fires only when an external audit stream — IdP sign-in events for `federated_login_anomaly` / `oauth_integration_scope_drift`, Cortex Code CLI session logs for `cortex_code_session_to_unknown_session` — is correlated with the Snowflake-side event. |
| `requires_cortex_sidecar` | 5 | Fires only when a Cortex Agents per-step trace is surfaced by a sidecar. Snowflake's first-party `ACCOUNT_USAGE` views do not surface the depth these rules require. |
| `requires_endpoint_telemetry` | 1 | Fires on host-side process / file telemetry, not Snowflake audit (Cortex Code CLI version-string detection). |

**Rule of thumb**: of the 33 Sigma rules in this pack, 4 work out of the
box. The remaining 29 land an alert only after the relevant enrichment,
correlation, or sidecar is operational. The `requires_enrichment` tier
is the biggest deployment lift; the [`enrichment-templates/`](enrichment-templates/)
directory has the SQL and SIEM lookup definitions to compute the derived
fields.
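
One way to keep the counts in the table above from drifting is to tally
the `maturity:` field directly from the rule files. The sketch below is
illustrative only: it assumes `maturity:` appears as a top-level key on
its own line, and the `RULES` dictionary is a hypothetical stand-in for
reading the real rule files from disk.

```python
import re
from collections import Counter

# Hypothetical stand-ins for rule files; the real rules are Sigma YAML files.
RULES = {
    "a.yml": "title: A\nmaturity: production_ready\n",
    "b.yml": "title: B\nmaturity: requires_enrichment\n",
    "c.yml": "title: C\nmaturity: requires_enrichment\n",
}

# Assumption: `maturity:` is a top-level key on its own line.
MATURITY_RE = re.compile(r"^maturity:\s*(\S+)\s*$", re.MULTILINE)


def tally_maturity(rules):
    """Count rules per maturity tag; rules without a tag count as 'missing'."""
    counts = Counter()
    for text in rules.values():
        match = MATURITY_RE.search(text)
        counts[match.group(1) if match else "missing"] += 1
    return counts


counts = tally_maturity(RULES)
for tag, count in sorted(counts.items()):
    print(f"{tag}: {count}")
```

Comparing this output with the table in CI would flag a rule whose tag
changed without the README being updated.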

## Per-chain mapping

Every chain has both an ACCOUNT_USAGE-shaped rule (for the audit-table
67 changes: 67 additions & 0 deletions detection/snowflake/enrichment-templates/README.md
@@ -0,0 +1,67 @@
# Enrichment Templates — Snowflake Detection Pack

Concrete, copy-pasteable templates that produce the derived fields the
Sigma rules in this pack depend on. Without these, the
`requires_enrichment` and `requires_correlation` rules silently do not
fire: the failure is a deployment gap, not a SIEM syntax error.

The templates cover the three highest-value rules:

| Template directory | Rule it enables | Maturity | Why it's load-bearing |
|--------------------|------------------|----------|------------------------|
| [`bulk-exfil-baseline/`](bulk-exfil-baseline/) | [`sigma/bulk_exfil_baseline.yml`](../sigma/bulk_exfil_baseline.yml) | `requires_enrichment` | Chain A — UNC5537 replay. The single most replayed Snowflake attack pattern in the wild. |
| [`federated-login-anomaly/`](federated-login-anomaly/) | [`sigma/federated_login_anomaly.yml`](../sigma/federated_login_anomaly.yml) | `requires_correlation` | Chain D — federated-IdP compromise. Captures Golden SAML / Silver SAML class attacks the Snowflake side cannot prevent. |
| [`connector-secret-leak/`](connector-secret-leak/) | [`sigma/connector_secret_leak_in_logs.yml`](../sigma/connector_secret_leak_in_logs.yml) | `production_ready` | CVE-2025-27496 / CVE-2025-46329 class. Includes ingest-time redaction so the SIEM does not become the new long-retention repository for leaked master keys. |

Each subdirectory contains:

- `snowflake-side.sql` — the SQL run inside Snowflake that produces the
baseline / lookup table the SIEM consumes.
- `sentinel/` — Microsoft Sentinel artifacts: Watchlist schemas, KQL
enrichment functions, Logic-App or Data-Collector-API definitions.
- `splunk/` — Splunk artifacts: `lookup_definition.conf`,
`savedsearches.conf`, optional `props.conf` / `transforms.conf` for
ingest-time enrichment.
- `README.md` — operational notes including refresh cadence,
storage cost, and `[REQUIRES_TENANT]` markers for any value the
template cannot pre-fill.

## Pipeline shape (canonical)

```
Snowflake ACCOUNT_USAGE views
│ (15-min poll OR Snowflake Trail event stream)
SIEM ingest pipeline ── joins ──▶ Watchlists / Lookups
│ (role baselines, partner
│ registries, IdP audit)
Enriched event with derived fields
Sigma rule evaluates and emits an alert
```

The Snowflake side maintains the input tables on a nightly rebuild
cadence; the SIEM side hydrates the lookups daily and joins on event
ingest. If either side lags behind, the affected rule's false-negative
rate climbs silently.
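
The join step in the diagram above can be sketched in a few lines. This
is a minimal illustration, not the pack's implementation: the event and
lookup shapes, the `enrich` helper, and the example role are all
hypothetical, and event times are assumed to be UTC. It computes two
derived fields, `volume_above_role_baseline` and
`outside_business_hours`, and shows why a missing lookup row yields a
null rather than an alert.

```python
from datetime import datetime, timezone


def in_business_hours(hour, start_hour, end_hour):
    """True if `hour` falls in [start_hour, end_hour), allowing overnight windows."""
    if start_hour <= end_hour:
        return start_hour <= hour < end_hour
    # Window wraps past midnight, e.g. 22 -> 6.
    return hour >= start_hour or hour < end_hour


def enrich(event, p90_by_role, hours_by_role):
    """Join one event against the lookups and attach derived fields."""
    role = event["role_name"]
    p90 = p90_by_role.get(role)
    hours = hours_by_role.get(role)
    enriched = dict(event)
    # No baseline for the role means the comparison cannot be true.
    enriched["volume_above_role_baseline"] = (
        p90 is not None and event["bytes_written"] > p90
    )
    # No business-hours row leaves the field null (None).
    enriched["outside_business_hours"] = (
        None
        if hours is None
        else not in_business_hours(event["event_time"].hour, *hours)
    )
    return enriched


event = {
    "role_name": "ETL_ROLE",  # hypothetical role
    "bytes_written": 500,
    "event_time": datetime(2026, 1, 5, 23, 0, tzinfo=timezone.utc),
}
print(enrich(event, {"ETL_ROLE": 100}, {"ETL_ROLE": (9, 17)}))
```

A stale or empty lookup does not raise an error here; it only changes
the derived values, which is how a lagging input turns into silent
false negatives.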

## Refresh cadence guidance

| Input | Recommended refresh | Why |
|-------|---------------------|-----|
| `COPY_BYTES_P90_BY_ROLE` | Nightly | Captures legitimate day-to-day variance; week-old baselines miss seasonal shifts (quarter close, EHR refresh windows). |
| `APPROVED_EXFIL_STAGES` | On commit (config-as-code) | Treat as policy — changes should land through the same PR review as application configs. |
| `BULK_EXPORTER_ROLES` | On commit | Same. |
| `ROLE_BUSINESS_HOURS` | On commit | Same. |
| Partner registry | On commit | Same — keeping this stale is the most common cause of partner-integration false negatives. |
| IdP correlation watermark | Continuous | The IdP side's ingestion lag is what makes the `lag_tolerant` flag necessary; track the watermark. |
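
A freshness check along these lines can make a lagging input visible
instead of silent. The sketch is an assumption-laden illustration: the
`MAX_AGE` limit of 36 hours is a hypothetical slack on top of the
nightly cadence in the table, and `last_refreshed` stands in for
whatever refresh timestamps the tenant actually records.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical limit: nightly cadence plus slack. Tune per tenant.
MAX_AGE = {"COPY_BYTES_P90_BY_ROLE": timedelta(hours=36)}


def stale_inputs(last_refreshed, now, max_age=MAX_AGE):
    """Return the names of inputs that are missing or older than their limit."""
    stale = []
    for name, limit in max_age.items():
        refreshed = last_refreshed.get(name)
        if refreshed is None or now - refreshed > limit:
            stale.append(name)
    return sorted(stale)


now = datetime(2026, 1, 10, 12, 0, tzinfo=timezone.utc)
print(stale_inputs({"COPY_BYTES_P90_BY_ROLE": now - timedelta(days=2)}, now))
```

Inputs refreshed on commit have no fixed age limit, so they are left
out of `MAX_AGE` here; a tenant could instead compare them with the
latest merged commit.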

## See also

- [`ENRICHMENT.md`](../ENRICHMENT.md) — full inventory of every derived
field the rules in this pack reference.
- [`README.md`](../README.md) — pack-level overview and per-chain rule
map.
@@ -0,0 +1,77 @@
# Bulk Exfil Baseline — Enrichment Template

Drop-in enrichment for [`sigma/bulk_exfil_baseline.yml`](../../sigma/bulk_exfil_baseline.yml).

## Files

| Path | Purpose |
|------|---------|
| [`snowflake-side.sql`](snowflake-side.sql) | Creates the four input tables in Snowflake (`OPS.SECURITY.APPROVED_EXFIL_STAGES`, `BULK_EXPORTER_ROLES`, `ROLE_BUSINESS_HOURS`, `COPY_BYTES_P90_BY_ROLE`). Run once to seed the three policy tables; rebuild only `COPY_BYTES_P90_BY_ROLE` nightly, under a security-ops role. |
| [`sentinel/enrichment_function.kql`](sentinel/enrichment_function.kql) | Sentinel function `bulk_exfil_enriched()` that hydrates the derived fields the Sigma rule reads. |
| [`splunk/enrichment.conf`](splunk/enrichment.conf) | Splunk `transforms.conf` / `savedsearches.conf` / DB-Connect refresh stanzas. |

## Deployment order

1. **Snowflake side**: run [`snowflake-side.sql`](snowflake-side.sql) once
manually to seed the policy tables (`APPROVED_EXFIL_STAGES`,
`BULK_EXPORTER_ROLES`, `ROLE_BUSINESS_HOURS`). Replace the example
rows with the tenant's actual approved stages/roles/hours. Treat
future edits as config-as-code (PR review).
2. Schedule the `COPY_BYTES_P90_BY_ROLE` rebuild as a nightly Snowflake
Task running the relevant block of the SQL file.
3. **SIEM side**:
- **Sentinel**: upload the three Watchlists from the policy tables;
wire a Logic App to push the p90 table into the
`SF_CopyBytesP90ByRole_CL` custom log every 90 minutes; save
[`enrichment_function.kql`](sentinel/enrichment_function.kql)
under "Functions" with alias `bulk_exfil_enriched`. Point the
analytic rule corresponding to `sigma/bulk_exfil_baseline.yml`
at the function output.
- **Splunk**: copy the stanzas from [`enrichment.conf`](splunk/enrichment.conf)
into a `snowflake_detection` app under `local/`; deploy the four
CSVs into `lookups/`; enable the `daily_baseline_refresh` saved
search; enable the `bulk_exfil_enriched` saved search on its
scheduled cadence.

## Acceptance criteria

The rule is correctly enriched when every event emitted by
`bulk_exfil_enriched()` (Sentinel) or the `bulk_exfil_enriched` saved
search (Splunk) carries non-null values for all four derived fields:

- `external_stage_in_watchlist` ∈ {true, false}
- `role_in_approved_bulk_exporter_set` ∈ {true, false}
- `volume_above_role_baseline` ∈ {true, false}
- `outside_business_hours` ∈ {true, false}

If any field is null for >5% of events, the corresponding input table is
incomplete or stale. Check `OPS.SECURITY.COPY_BYTES_P90_FRESHNESS` for
the baseline; check Watchlist sync logs for the policy tables.
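
The 5% threshold above can be checked mechanically against a sample of
enriched events. The sketch below is a minimal illustration that
assumes the events have already been exported as dictionaries; the
helper names are hypothetical, while the four field names and the
threshold come from the criteria above.

```python
DERIVED_FIELDS = (
    "external_stage_in_watchlist",
    "role_in_approved_bulk_exporter_set",
    "volume_above_role_baseline",
    "outside_business_hours",
)
NULL_RATE_THRESHOLD = 0.05  # ">5% of events" from the acceptance criteria


def null_rates(events):
    """Fraction of events in which each derived field is null or absent."""
    total = len(events)
    if total == 0:
        return {field: 0.0 for field in DERIVED_FIELDS}
    return {
        field: sum(1 for e in events if e.get(field) is None) / total
        for field in DERIVED_FIELDS
    }


def failing_fields(events):
    """Derived fields whose null rate exceeds the threshold."""
    rates = null_rates(events)
    return sorted(f for f, rate in rates.items() if rate > NULL_RATE_THRESHOLD)


# Ten sample events; one is missing its business-hours value.
sample = [dict.fromkeys(DERIVED_FIELDS, False) for _ in range(10)]
sample[0]["outside_business_hours"] = None
print(failing_fields(sample))
```

A non-empty result names the field to investigate, which in turn points
at the input table or Watchlist that is incomplete or stale.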

## Cost model

| Component | Cost order of magnitude |
|-----------|-------------------------|
| Nightly p90 rebuild (Snowflake) | One small warehouse run, ~1–5 credits depending on tenant size. |
| Watchlist storage (Sentinel) | <1 MB, $0/month. |
| Splunk lookup storage | <50 KB, negligible. |
| Logic App p90 push (90-min cadence) | Free tier covers it for most tenants. |

## `[REQUIRES_TENANT]` items

- `SECURITY_OPS_ROLE` / `SECURITY_OPS_WH` names in `snowflake-side.sql`
— replace with the tenant's actual ops role and warehouse.
- The example `APPROVED_EXFIL_STAGES`, `BULK_EXPORTER_ROLES`, and
`ROLE_BUSINESS_HOURS` rows are illustrative; **do not deploy without
security-ops review** of which external stages, roles, and hours
are legitimate for the tenant.
- The Splunk macro `snowflake_query_history` is a placeholder; point it
at the tenant's actual Snowflake ingest index/sourcetype.

## See also

- [`../../ENRICHMENT.md`](../../ENRICHMENT.md) — full enrichment-field
contract.
- [`../../streaming-ingest/`](../../streaming-ingest/) — the
upstream ingestion pipeline that produces the events this enrichment
hydrates.
@@ -0,0 +1,79 @@
// Sentinel enrichment function for sigma/bulk_exfil_baseline.yml.
//
// Materializes a callable function bulk_exfil_enriched() that returns
// Snowflake QUERY_HISTORY events with the four derived fields the Sigma
// rule requires:
//
// external_stage_in_watchlist : bool
// role_in_approved_bulk_exporter_set : bool
// volume_above_role_baseline : bool
// outside_business_hours : bool
//
// Prerequisites:
//
// 1) Snowflake_ACCOUNT_USAGE_QUERY_HISTORY_CL is ingested via the
// streaming-ingest pipeline (see detection/snowflake/streaming-ingest/
// for a Function-App + Event-Hubs reference implementation).
// 2) Three Watchlists are uploaded into Sentinel:
//
// Watchlist name Snowflake source table
// ───────────────────────────────── ──────────────────────────────────────
// SF_ApprovedExfilStages OPS.SECURITY.APPROVED_EXFIL_STAGES
// SF_BulkExporterRoles OPS.SECURITY.BULK_EXPORTER_ROLES
// SF_RoleBusinessHours OPS.SECURITY.ROLE_BUSINESS_HOURS
//
// The CopyBytesP90ByRole table is queried directly from Snowflake on a
// 90-min cadence via a Logic App that writes into a custom log
// SF_CopyBytesP90ByRole_CL. This avoids a daily watchlist churn that
// would tax Sentinel.
//
// 3) Save this function under Sentinel "Functions" with the alias
// `bulk_exfil_enriched`. Schedule the analytic rule from
// sigma/bulk_exfil_baseline.yml against the function output.

let WATCHLIST_STAGES = _GetWatchlist('SF_ApprovedExfilStages') | project stage_url_prefix;
let WATCHLIST_ROLES = _GetWatchlist('SF_BulkExporterRoles') | project role_name;
let WATCHLIST_HOURS = _GetWatchlist('SF_RoleBusinessHours')
| project role_name, tz, start_hour=toint(start_hour), end_hour=toint(end_hour);
let BASELINE_P90 = SF_CopyBytesP90ByRole_CL
// Keep only the most recent baseline row per role.
| summarize arg_max(TimeGenerated, *) by role_name_s
| project role_name = role_name_s,
          p90_bytes = todouble(p90_bytes_d);
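let parse_stage_prefix = (qt:string) {
    // Capture the @<stage> or s3://... target from COPY INTO, including
    // any path segments. An optional leading single quote is skipped so
    // quoted URLs ('s3://...') are captured too; matching is
    // case-insensitive. The captured value must equal a watchlist
    // stage_url_prefix entry exactly.
    extract(@"(?i)COPY\s+INTO\s+'?(@?[A-Za-z0-9_\.\-/]+:?/?/?[^/\s']+(?:/[^/\s']+)*)", 1, qt)
};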
let in_business_hours = (event_time:datetime, role:string, tz:string,
                         start_hour:int, end_hour:int) {
    // Assumes event_time is already in the role's local time: ingest is
    // UTC and tz-aware conversion is expected to happen in the ingest
    // pipeline, so `role` and `tz` are not used here.
    let local_hour = datetime_part('hour', event_time);
    // start_hour > end_hour means the window wraps past midnight.
    iff(start_hour <= end_hour,
        local_hour >= start_hour and local_hour < end_hour,
        local_hour >= start_hour or local_hour < end_hour)
};
Snowflake_ACCOUNT_USAGE_QUERY_HISTORY_CL
| where query_type_s == "COPY"
| where query_text_s has "COPY INTO @" or query_text_s has "COPY INTO 's3://"
| extend stage_prefix = parse_stage_prefix(query_text_s)
| extend role_name = role_name_s
| extend event_time = todatetime(start_time_t)
| extend bytes_written = tolong(bytes_written_to_result_d)
| join kind=leftouter (WATCHLIST_HOURS) on role_name
| join kind=leftouter (BASELINE_P90) on role_name
// Membership is tested with in (<tabular>) because the lookup has to be
// evaluated per row; toscalar() cannot reference row-level columns.
| extend external_stage_in_watchlist =
    isnotempty(stage_prefix) and stage_prefix in (WATCHLIST_STAGES)
| extend role_in_approved_bulk_exporter_set =
    role_name in (WATCHLIST_ROLES)
| extend volume_above_role_baseline =
isnotnull(p90_bytes) and bytes_written > p90_bytes
| extend outside_business_hours =
not(in_business_hours(event_time, role_name, tz, start_hour, end_hour))
| project event_time, user_name_s, role_name, session_id_s, query_text_s,
bytes_written, stage_prefix,
external_stage_in_watchlist,
role_in_approved_bulk_exporter_set,
volume_above_role_baseline,
outside_business_hours