111 changes: 97 additions & 14 deletions detection/snowflake/ENRICHMENT.md
@@ -180,17 +180,88 @@ to avoid false positives during the slower system's ingestion delay.
- **Rules:**
[`federated_login_anomaly.yml`](sigma/federated_login_anomaly.yml)
(and Trail-paired variants where present).
- **Computation:** `lag_tolerant` is a boolean ingestion-pipeline
parameter (default `true`). When true, the rule defers final firing
until *both* sides of the correlation have known-ingested timestamps
newer than the event time, surfaced as `both_sources_caught_up = true`.
The per-source watermarks (`idp_audit_watermark_ingested_at`,
`snowflake_login_ingested_at`) come from the SIEM pipeline's
per-source ingestion tracker.
- **Input data location:** Computed by the SIEM ingestion pipeline
from per-source ingestion timestamps. Most modern SIEMs (Sentinel,
Splunk SC4S, Elastic Fleet) expose ingestion watermarks per source;
the pipeline only needs to materialize them as enrichment columns.
- **Why these fields exist:** the federated-login rule fires on the
*absence* of a corresponding IdP sign-in event. Without a known
watermark per source, "absent" cannot be distinguished from "delayed."
A rule that fires on "absent so far" produces a false-positive storm
during routine IdP audit ingestion lag and is the first detection a
SOC will disable. The watermark machinery is what makes the rule
deployable.

**Definition.** A *per-source watermark* is the latest event-time of
any record successfully ingested from that source — measured in the
event's own timestamp domain, not the SIEM's wall clock. The rule
treats an event as "comparable against the other source" only when
both sources' watermarks are at or after the event's timestamp plus the
configured `idp_correlation_window_minutes`. Until then, the rule
suppresses firing and re-evaluates on the next ingestion tick.
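
The comparability test in the definition above can be sketched in Python. This is a minimal sketch, assuming the event timestamp and both watermarks are timezone-aware datetimes in the event-time domain; the function and parameter names are illustrative and are not part of the pipeline.

```python
from datetime import datetime, timedelta, timezone


def both_sources_caught_up(
    event_time: datetime,
    idp_watermark: datetime,
    snowflake_watermark: datetime,
    correlation_window_minutes: int,
) -> bool:
    """Return True once both watermarks have reached event_time + window."""
    deadline = event_time + timedelta(minutes=correlation_window_minutes)
    # The slower source decides: compare against the older of the two watermarks.
    return min(idp_watermark, snowflake_watermark) >= deadline


# Hypothetical timestamps, for illustration only.
event = datetime(2026, 5, 15, 12, 0, tzinfo=timezone.utc)
snowflake_wm = datetime(2026, 5, 15, 12, 30, tzinfo=timezone.utc)

# IdP watermark still inside the 5-minute window: suppress and re-evaluate later.
lagging = both_sources_caught_up(
    event, datetime(2026, 5, 15, 12, 3, tzinfo=timezone.utc), snowflake_wm, 5
)
# IdP watermark at exactly event_time + 5 minutes: the event is comparable.
ready = both_sources_caught_up(
    event, datetime(2026, 5, 15, 12, 5, tzinfo=timezone.utc), snowflake_wm, 5
)
print(lagging, ready)
```

Taking the minimum of the two watermarks keeps the check symmetric: whichever source is slower at a given moment is the one that holds the rule back.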

**Computation by SIEM.** The watermark is a max-aggregation per source
over a recent window. Concrete forms:

- **Microsoft Sentinel (KQL):**
```kusto
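// Latest event-time ingested from the IdP audit source in the last 2h.
let idp_audit_watermark = toscalar(
    OktaSystemLog
    | where TimeGenerated > ago(2h)
    | summarize max(EventTime)
);
// Latest event-time ingested from the Snowflake login source in the last 2h.
let snowflake_login_watermark = toscalar(
    SnowflakeLoginHistory
    | where TimeGenerated > ago(2h)
    | summarize max(EventTimestamp)
);
// A query needs a final expression after the let statements; print both for inspection.
print idp_audit_watermark, snowflake_login_watermark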
```
Materialize as enrichment columns on each event being evaluated;
`both_sources_caught_up = (event_time + window_minutes) <= least_of_both_watermarks`.
- **Splunk (SPL):** Use a `tstats latest(_time)` lookup per
sourcetype, indexed every 60s into a `summary` kvstore; join the
summary into each event at search time. `_indextime` (not `_time`)
is the canonical ingest-time field if event-time is unreliable.
- **Elastic / Logstash:** Use the `ingest_time` field added by the
Fleet pipeline; aggregate via a transform that computes
`max(ingest_time)` per `data_stream.dataset`. Surface as a
`runtime_field` on the detection index.

**Where input data lives.** The watermarks are computed from the
ingestion pipeline's own state, not from external watchlists. Concretely
the inputs are: (a) for the IdP side, the Okta System Log connector's
`EventTime` column or the Entra Sign-In Logs connector's
`SignInActivityTimestamp` column; (b) for the Snowflake side, the
`LOGIN_HISTORY` event timestamp on the ACCOUNT_USAGE path, or the
Trail `auth.snowflake.login` event time on the Trail path. The
materialized watermarks should be reachable as fields on each event
the SIEM evaluates — either via a per-event lookup join or via a
runtime field that re-computes on each search.

**Fallback when watermarks are not yet wired up.** A SIEM without the
watermark machinery has two safe operating modes:

- **Conservative (recommended).** Set `lag_tolerant: false` on the
rule body and tune `idp_correlation_window_minutes` to absorb the
worst-case combined ingestion SLA of both sides (e.g., 60 minutes
if Okta SLA is 10m, ACCOUNT_USAGE latency is 45m, and the IdP-to-
Snowflake correlation window is 5m). The rule degrades to a fixed-
window correlation that fires when no IdP event has arrived after
the window has elapsed. False-positive risk is bounded by the SLA.
- **Permissive (not recommended without compensating controls).**
Pin `both_sources_caught_up: true` unconditionally. The rule fires
on apparent absence regardless of ingestion state and will produce
spurious alerts during routine ingestion lag. Use this only as a
short-term posture while the watermark pipeline is being built,
and only with explicit SOC sign-off; the noise will train operators
to ignore the rule.
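
The conservative mode's arithmetic can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, and the SLA figures are the example values from the text, not measured numbers.

```python
from datetime import datetime, timedelta, timezone


def conservative_window_minutes(
    idp_sla_minutes: int,
    snowflake_latency_minutes: int,
    correlation_window_minutes: int,
) -> int:
    """Sum the worst-case lag of each side plus the correlation window."""
    return idp_sla_minutes + snowflake_latency_minutes + correlation_window_minutes


def fixed_window_should_fire(
    event_time: datetime,
    now: datetime,
    has_corresponding_idp_event: bool,
    window_minutes: int,
) -> bool:
    """Fixed-window fallback: fire only if the window elapsed with no IdP event."""
    window_elapsed = now >= event_time + timedelta(minutes=window_minutes)
    return window_elapsed and not has_corresponding_idp_event


# Example values from the text: Okta 10m, ACCOUNT_USAGE 45m, correlation 5m.
window = conservative_window_minutes(10, 45, 5)

login = datetime(2026, 5, 15, 12, 0, tzinfo=timezone.utc)
too_early = fixed_window_should_fire(login, login + timedelta(minutes=30), False, window)
elapsed = fixed_window_should_fire(login, login + timedelta(minutes=60), False, window)
print(window, too_early, elapsed)
```

Summing the lags is deliberately pessimistic; it trades detection latency for a false-positive rate bounded by the stated SLAs.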

**Validation before promoting the rule to alert.** Submit a synthetic
test event with `has_corresponding_idp_event: false` and an
event-timestamp 24 hours in the future. The rule MUST NOT fire; if it
does, the watermark logic is not wired up (the rule is evaluating the
future event as if both sources had ingested up to it, which they have
not). Then submit a second synthetic event with an event-timestamp two
minutes in the past, no matching IdP event, and `lag_tolerant: true`.
It should fire only after both real ingestion watermarks have advanced
past the event time plus `idp_correlation_window_minutes`; confirm that
the firing delay tracks the slower source's lag, not the rule
evaluator's cadence.
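
The two synthetic checks can be rehearsed against a toy model before touching the SIEM. The sketch below assumes each watermark trails the wall clock by a constant lag and that the evaluator runs once per minute; the lag values, the window, and the function name are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

WINDOW = timedelta(minutes=5)          # assumed idp_correlation_window_minutes
IDP_LAG = timedelta(minutes=10)        # assumed IdP audit ingestion lag
SNOWFLAKE_LAG = timedelta(minutes=45)  # assumed Snowflake-side ingestion lag


def first_fire_minute(event_time: datetime, start: datetime, ticks: int) -> Optional[int]:
    """Evaluate once per minute; return the first minute the rule fires, or None."""
    for minute in range(ticks + 1):
        now = start + timedelta(minutes=minute)
        # Toy model: each watermark trails the wall clock by a constant lag.
        slowest_watermark = min(now - IDP_LAG, now - SNOWFLAKE_LAG)
        if slowest_watermark >= event_time + WINDOW:
            return minute  # no matching IdP event exists, so the rule fires
    return None


start = datetime(2026, 5, 15, 12, 0, tzinfo=timezone.utc)
# Future-dated synthetic: must not fire within the three-hour simulation.
future_result = first_fire_minute(start + timedelta(hours=24), start, ticks=180)
# Past-dated synthetic (two minutes old): fires once the slower source catches up.
past_result = first_fire_minute(start - timedelta(minutes=2), start, ticks=180)
print(future_result, past_result)
```

In this model the past-dated event fires 50 minutes after its own timestamp (the 45-minute slower lag plus the 5-minute window), independent of the one-minute evaluation cadence.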

### `has_cortex_code_session_within_window`, `cortex_code_session_host_id`

@@ -433,7 +504,7 @@ trace once captured:

## 11. Chain-M (UDF EAI) Derived Fields

### `udf_owner`, `udf_eai_list`, `eai_network_rule_value_list`, `eai_rule_is_overbroad`, `invocation_role_eq_owner`
### `udf_owner`, `udf_has_eai`, `udf_eai_list`, `eai_network_rule_value_list`, `eai_rule_is_overbroad`, `invocation_role_eq_owner`

- **Rule:** [`udf_with_eai_invocation.yml`](../../tools/lateral-movement/snowflake-pivot/detection/sigma/udf_with_eai_invocation.yml).
- **Native source:** `ACCOUNT_USAGE.FUNCTIONS` (function owner + EAI
@@ -442,8 +513,14 @@ trace once captured:
user/role).
- **Computation:** Join at ingest. `eai_rule_is_overbroad` is `true`
when the referenced NETWORK RULE's `value_list` contains a wildcard
(`*`, `OPEN_ANY`).
- **Input data location:** All four joins are pure Snowflake views.
(`*`, `OPEN_ANY`). `udf_has_eai` is the explicit boolean cardinality
flag — `true` iff `FUNCTIONS.EXTERNAL_ACCESS_INTEGRATIONS` is non-null
AND parses to a non-empty list. The rule keys on `udf_has_eai: true`
rather than `udf_eai_list|exists: true` because Sigma's `|exists`
modifier checks field presence, not list cardinality — an empty list
serialized as `[]` would pass `|exists` and fire the rule on UDFs
that declare *no* EAIs.
- **Input data location:** All five joins are pure Snowflake views.
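
The difference between field presence and list cardinality can be sketched in Python. This sketch assumes the `EXTERNAL_ACCESS_INTEGRATIONS` value reaches the enrichment step as a JSON-array string or null; that serialization, and both helper names, are assumptions for illustration.

```python
import json
from typing import Optional


def udf_has_eai(external_access_integrations: Optional[str]) -> bool:
    """True only when the column is non-null and parses to a non-empty list."""
    if external_access_integrations is None:
        return False
    try:
        parsed = json.loads(external_access_integrations)
    except json.JSONDecodeError:
        # Sketch-only choice: treat unparseable values as "no EAI".
        return False
    return isinstance(parsed, list) and len(parsed) > 0


def field_exists(event: dict, field: str) -> bool:
    """Presence-only check, analogous to the `|exists` behavior described above."""
    return field in event


# A UDF that declares no EAIs, serialized as an empty list.
event = {"udf_eai_list": []}
print(field_exists(event, "udf_eai_list"))  # presence check passes
print(udf_has_eai("[]"), udf_has_eai('["EAI_PROD"]'), udf_has_eai(None))
```

A production pipeline would likely want to surface unparseable values separately rather than silently mapping them to `false`.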

## 12. SPCS Image Posture Derived Fields

@@ -469,6 +546,12 @@ confirm:
`idp_correlation_window_minutes` is tuned to cover the worst-case
ingestion lag of either side, and `lag_tolerant` is enabled by
default.
- [ ] Per-source ingestion watermarks (`idp_audit_watermark_ingested_at`,
`snowflake_login_ingested_at`) are materialized as enrichment columns
using the SIEM-specific recipe in &sect;3 above; both synthetic-event
validations (future-dated event MUST NOT fire; past-dated event with no
IdP match fires once both watermarks advance) have been run and passed.
Without the watermarks, and with `lag_tolerant` at its default,
`both_sources_caught_up` never becomes true, so
`federated_login_anomaly.yml` silently never fires and the customer has
a Chain D detection gap they will not notice.
- [ ] If Snowflake Trail is enabled for the account, the Cortex Trail
event families are emitting; if not, a Cortex telemetry sidecar
is deployed (Snowpark wrapper) and the rule registry reflects
8 changes: 4 additions & 4 deletions detection/snowflake/README.md
@@ -16,13 +16,13 @@ customer needs in place before the rule will fire. Honest accounting:
| Tag | Count | What it means for deployment |
|-----|------:|------------------------------|
| `production_ready` | 4 | Fires on raw audit / log surfaces a customer already ingests. No enrichment, correlation, or sidecar required. Drop in. |
| `requires_enrichment` | 19 | Fires only when a SIEM-side enrichment pipeline computes the derived fields listed under each rule's `enrichment.required`. See [`ENRICHMENT.md`](ENRICHMENT.md) for the full field contract; templates under [`enrichment-templates/`](enrichment-templates/). |
| `requires_enrichment` | 20 | Fires only when a SIEM-side enrichment pipeline computes the derived fields listed under each rule's `enrichment.required`. See [`ENRICHMENT.md`](ENRICHMENT.md) for the full field contract; templates under [`enrichment-templates/`](enrichment-templates/). |
| `requires_correlation` | 4 | Fires only when an external audit stream — IdP sign-in events for `federated_login_anomaly` / `oauth_integration_scope_drift`, Cortex Code CLI session logs for `cortex_code_session_to_unknown_session` — is correlated with the Snowflake-side event. |
| `requires_cortex_sidecar` | 5 | Fires only when a Cortex Agents per-step trace is surfaced by a sidecar. Snowflake's first-party `ACCOUNT_USAGE` views do not surface the depth these rules require. |
| `requires_endpoint_telemetry` | 1 | Fires on host-side process / file telemetry, not Snowflake audit (Cortex Code CLI version-string detection). |

**Rule of thumb**: of the 33 Sigma rules in this pack, 4 work out of the
box. The remaining 29 land an alert only after the relevant enrichment,
**Rule of thumb**: of the 34 Sigma rules in this pack, 4 work out of the
box. The remaining 30 land an alert only after the relevant enrichment,
correlation, or sidecar is operational. The `requires_enrichment` tier
is the biggest deployment lift; the [`enrichment-templates/`](enrichment-templates/)
directory has the SQL and SIEM lookup definitions to compute the derived
@@ -66,7 +66,7 @@ ingestion surface available on the customer's side.

| Chain | What it does | ACCOUNT_USAGE Sigma | Trail Sigma |
|-------|--------------|---------------------|-------------|
| A — Credential theft to bulk exfil | UNC5537 replay; bulk `COPY INTO @stage` from a non-MFA / no-network-policy user. | [`bulk_exfil_baseline.yml`](sigma/bulk_exfil_baseline.yml) + bind-param coverage: [`snowflake_bind_param_audit_gap.yml`](../../tools/lateral-movement/snowflake-pivot/detection/sigma/snowflake_bind_param_audit_gap.yml) | — (folded into bulk_exfil_baseline via the streaming-ingest pipeline) |
| A — Credential theft to bulk exfil | UNC5537 replay; bulk `COPY INTO @stage` from a non-MFA / no-network-policy user. | [`bulk_exfil_baseline.yml`](sigma/bulk_exfil_baseline.yml) + bind-param coverage: [`snowflake_bind_param_audit_gap.yml`](../../tools/lateral-movement/snowflake-pivot/detection/sigma/snowflake_bind_param_audit_gap.yml) | [`bulk_exfil_baseline_trail.yml`](sigma/bulk_exfil_baseline_trail.yml) — mirrors the ACCOUNT_USAGE rule's four-signal contract on `query.snowflake.completed`; where Trail is not yet wired up, the streaming-ingest sidecar under [`streaming-ingest/`](streaming-ingest/) is the interim minute-scale coverage |
| B — Cortex Code indirect injection | Pre-1.0.25 Cortex Code CLI executes shell-pipe-sh under indirect prompt injection. | [`cortex_code_pre_1_0_25.yml`](sigma/cortex_code_pre_1_0_25.yml) (version-string, endpoint-side) + behavioral pair: [`cortex_code_session_to_unknown_session.yml`](sigma/cortex_code_session_to_unknown_session.yml) | covered by the behavioral pair (does not depend on Trail event names) |
| C — Native App Marketplace supply-chain | Installed Native App auto-updates to a manifest with new external integrations, new privileges, or new/mutated dependencies (incl. deferred-loader shape). | [`native_app_unexpected_version_bump.yml`](sigma/native_app_unexpected_version_bump.yml) + [`native_app_privilege_bump.yml`](../../tools/supply-chain/snowflake-native-app/detection/sigma/native_app_privilege_bump.yml) + [`native_app_dependency_drift.yml`](../../tools/supply-chain/snowflake-native-app/detection/sigma/native_app_dependency_drift.yml) | [`native_app_privilege_bump_trail.yml`](../../tools/supply-chain/snowflake-native-app/detection/sigma/native_app_privilege_bump_trail.yml) |
| D — Federated-IdP compromise | Forged SAML/OAuth assertion authenticates a high-privileged Snowflake user. | [`federated_login_anomaly.yml`](sigma/federated_login_anomaly.yml) | — (use the Chain F Trail variant; same login_history shape) |
118 changes: 118 additions & 0 deletions detection/snowflake/sigma/bulk_exfil_baseline_trail.yml
@@ -0,0 +1,118 @@
title: Snowflake Trail — Bulk COPY INTO External Stage (Chain A, role-aware)
id: 9a1c3e5f-7b2d-4e6a-9c0e-1f2a3b4c5d6e
maturity: requires_enrichment # fires only when a SIEM-side enrichment pipeline computes the derived fields listed under enrichment.required
status: experimental
description: |
  Trail-event-shaped pair to `bulk_exfil_baseline.yml`. Consumes the
  `query.snowflake.completed` Trail event for any `COPY INTO @<external_stage>`
  whose combination of signals separates an attacker's first-and-only
  bulk exfil from a legitimate role's recurring data motion.

  The detection contract mirrors the ACCOUNT_USAGE-shaped rule exactly
  (same four-signal gating, same false-positive guidance) but consumes
  the real-time Trail event stream instead of polling
  `SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY`. Latency advantage: seconds
  rather than the up-to-45-minute ACCOUNT_USAGE projection lag. This is
  the difference between catching a UNC5537-class exfil during the
  attacker's session vs. after it has already completed.

  The rule fires when **all four** of the following hold:

  1. The query is a `COPY INTO @<external_stage>` (external-stage form,
     not internal stage).
  2. The external stage is **not** on the customer-maintained
     approved-exfil-stage watchlist.
  3. **At least one** of:
     - the user's role is not in the approved bulk-exporter set,
     - the volume exceeds the role's 90th-percentile baseline,
     - the event falls outside business hours for the user's
       time zone / tenant policy.
  4. The volume written is at least 10 MB
     (`bytes_written_to_result` >= 10485760).

  Pair with `snowflake_bind_param_audit_gap.yml` for sessions where bind
  parameters degrade the audit signal. The Trail event preserves the
  same bind-parameter blindness on `QUERY_TEXT` that ACCOUNT_USAGE does,
  so the pair coverage is needed regardless of the ingestion surface.

  **Known sensitivity gap** (identical to the ACCOUNT_USAGE rule): an
  attacker who has stolen credentials for an approved bulk-exporter role
  and exfils inside that role's documented business-hours window at a
  volume below the role's p90 baseline trips none of the three OR-ed
  signals. The exfil is invisible to this rule unless the off-watchlist
  destination stage is promoted to a fire signal in its own right rather
  than a precondition for the outer-OR gating. The
  `fp_fn_harness/bulk_exfil_baseline.py` harness measures this gap; the
  recommended remediation is documented there (promote
  `external_stage_in_watchlist` to a fire signal, or add a
  `stage_outside_corp_namespace` enrichment field).

  **Deployment posture.** This rule activates only where Snowflake Trail
  is enabled and the `query_events` event family is subscribed. Where
  Trail is not yet wired up, the customer's interim coverage is the
  streaming-ingest sidecar documented under
  `detection/snowflake/streaming-ingest/`, which polls
  `INFORMATION_SCHEMA.QUERY_HISTORY()` on a 60-second cadence and
  produces equivalent signal at ~minute-scale latency (rather than the
  Trail rule's seconds-scale).
references:
- https://cloud.google.com/blog/topics/threat-intelligence/unc5537-snowflake-data-theft-extortion
- https://docs.snowflake.com/en/user-guide/snowflake-trail
- https://docs.snowflake.com/en/sql-reference/sql/copy-into-location
author: security-research
date: 2026-05-15
tags:
- attack.exfiltration
- attack.t1567.002
enrichment:
  required:
    - external_stage_in_watchlist
    - role_in_approved_bulk_exporter_set
    - volume_above_role_baseline
    - outside_business_hours
  doc: ../ENRICHMENT.md
logsource:
  product: snowflake_trail
  service: query_events
detection:
  copy_to_external:
    event_type: 'query.snowflake.completed'
    query_type|startswith: 'COPY'
    query_text|contains: '@'
  external_stage_not_in_watchlist:
    external_stage_in_watchlist: false
  role_off_baseline:
    role_in_approved_bulk_exporter_set: false
  volume_above_baseline:
    volume_above_role_baseline: true
  off_hours:
    outside_business_hours: true
  size_floor:
    bytes_written_to_result|gte: 10485760  # 10 MB lower floor
  condition: >
    copy_to_external
    and external_stage_not_in_watchlist
    and size_floor
    and (role_off_baseline or volume_above_baseline or off_hours)
fields:
- event_timestamp
- user_name
- role_name
- session_id
- query_text
- bytes_written_to_result
- rows_produced
- external_stage_url
- role_in_approved_bulk_exporter_set
- volume_above_role_baseline
- outside_business_hours
falsepositives:
  - Legitimate first-run of a new pipeline that loads from / unloads to
    a freshly-created external stage. Maintain a 24h grace period plus
    on-call notification; the rule should warn (not page) until the
    new stage is added to the watchlist.
  - Genuinely novel ad-hoc exports from approved bulk-exporter roles
    during declared incidents (DR, data-migration, etc.). Tag the
    incident window so the rule's `off_hours` signal does not stack
    with operational urgency.
  - Roles that have legitimate after-hours export windows (overnight
    EHR refreshes). Define the `outside_business_hours` calculation
    per role, not per tenant; the enrichment doc names this pattern.
level: high
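
The rule's four-signal gate, including the known sensitivity gap its description calls out, can be checked with a small Python sketch. The dataclass, the helper name, and the sample events are hypothetical; only the field names and the condition come from the rule above.

```python
from dataclasses import dataclass, replace

SIZE_FLOOR_BYTES = 10 * 1024 * 1024  # 10485760, the rule's size_floor


@dataclass(frozen=True)
class TrailQueryEvent:
    event_type: str
    query_type: str
    query_text: str
    bytes_written_to_result: int
    external_stage_in_watchlist: bool
    role_in_approved_bulk_exporter_set: bool
    volume_above_role_baseline: bool
    outside_business_hours: bool


def rule_fires(e: TrailQueryEvent) -> bool:
    """Mirror the rule's condition; string matches are case-insensitive, as in Sigma."""
    copy_to_external = (
        e.event_type.lower() == "query.snowflake.completed"
        and e.query_type.upper().startswith("COPY")
        and "@" in e.query_text
    )
    any_behavioral_signal = (
        not e.role_in_approved_bulk_exporter_set
        or e.volume_above_role_baseline
        or e.outside_business_hours
    )
    return (
        copy_to_external
        and not e.external_stage_in_watchlist
        and e.bytes_written_to_result >= SIZE_FLOOR_BYTES
        and any_behavioral_signal
    )


# Hypothetical event: unapproved role unloading 50 MiB to an unlisted stage.
attacker = TrailQueryEvent(
    event_type="query.snowflake.completed",
    query_type="COPY",
    query_text="COPY INTO @ext_stage/dump FROM customers",
    bytes_written_to_result=50 * 1024 * 1024,
    external_stage_in_watchlist=False,
    role_in_approved_bulk_exporter_set=False,
    volume_above_role_baseline=False,
    outside_business_hours=False,
)
# Same unload under a stolen approved-exporter role, in hours, below p90.
stolen_approved_role = replace(attacker, role_in_approved_bulk_exporter_set=True)

print(rule_fires(attacker), rule_fires(stolen_approved_role))
```

The second event is the sensitivity gap: the stage is off the watchlist, yet none of the three OR-ed signals is set, so the condition evaluates to false.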
14 changes: 11 additions & 3 deletions detection/snowflake/sigma/native_app_unexpected_version_bump.yml
@@ -11,6 +11,14 @@ description: |
provider account pushes a new manifest; consumers with auto-update
enabled receive it without re-consent. The NAAAPS scan is the upstream
control; the consumer-side detection is the version-bump diff.

Structural contract: `manifest_diff_added` is a list of prefixed
tokens emitted by the application_history projection
(`PRIVILEGE:<name>`, `EXTERNAL ACCESS INTEGRATION:<name>`,
`EXTERNAL FUNCTION:<name>`, `CONTAINER:<image>`). The rule uses
`|startswith` against the prefix list so the detection binds to the
structural contract; free-text occurrences of these tokens in other
fields will not fire the rule.
references:
- https://docs.snowflake.com/en/developer-guide/native-apps/security-overview
- https://docs.snowflake.com/en/developer-guide/native-apps/security-cve
@@ -34,9 +42,9 @@ detection:
event_type: APP_VERSION_INSTALLED
auto_upgrade: true
new_eai_or_extfn:
manifest_diff_added|contains:
- 'EXTERNAL ACCESS INTEGRATION'
- 'EXTERNAL FUNCTION'
manifest_diff_added|startswith:
- 'EXTERNAL ACCESS INTEGRATION:'
- 'EXTERNAL FUNCTION:'
condition: app_upgraded and new_eai_or_extfn
fields:
- event_timestamp
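
The effect of moving from `|contains` to `|startswith` on `manifest_diff_added` can be illustrated with a short Python sketch. It assumes the field arrives as a list of string tokens and that a selection matches when any token matches; the sample tokens, including the privilege name, are hypothetical.

```python
from typing import List

NEW_PREFIXES = ("EXTERNAL ACCESS INTEGRATION:", "EXTERNAL FUNCTION:")
OLD_SUBSTRINGS = ("EXTERNAL ACCESS INTEGRATION", "EXTERNAL FUNCTION")


def matches_startswith(tokens: List[str]) -> bool:
    """New behavior: a token must begin with one of the structural prefixes."""
    return any(token.upper().startswith(NEW_PREFIXES) for token in tokens)


def matches_contains(tokens: List[str]) -> bool:
    """Old behavior: any occurrence of the substring anywhere in a token matches."""
    return any(sub in token.upper() for token in tokens for sub in OLD_SUBSTRINGS)


# A manifest diff that really adds an external access integration.
real_addition = ["EXTERNAL ACCESS INTEGRATION:PARTNER_API"]
# A hypothetical privilege token that merely mentions the phrase.
incidental_mention = ["PRIVILEGE:CREATE EXTERNAL FUNCTION"]

print(matches_startswith(real_addition), matches_contains(real_addition))
print(matches_startswith(incidental_mention), matches_contains(incidental_mention))
```

Both matchers accept the real addition; only the substring matcher accepts the incidental mention, which is the false positive the prefix binding is meant to remove.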