feat: SPOG (Single Point of Gateway) host support #1479
Open
sd-db wants to merge 2 commits into
Implements support for Databricks SPOG hosts — account-level vanity URLs
(e.g. peco.azuredatabricks.net) where workspaces are disambiguated by a
`?o=<workspace-id>` query parameter on http_path. Approach matches the
convention adopted by databricks-sql-python, databricks-sql-go,
databricks-jdbc, and the ADBC Rust driver: parse ?o= from http_path and
use it to set the X-Databricks-Org-Id header on non-OAuth endpoints.
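The `?o=` convention described above can be illustrated with a minimal sketch. It is not the code from this PR: the function names `parse_org_id` and `org_id_header` are hypothetical, and the sketch assumes the workspace ID is carried as an ordinary query parameter that the standard library can parse.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit


def parse_org_id(http_path: str) -> Optional[str]:
    """Return the value of the ?o= query parameter, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    query = urlsplit(http_path).query
    values = parse_qs(query).get("o", [])
    return values[0] if values else None


def org_id_header(http_path: str) -> Dict[str, str]:
    """Build the X-Databricks-Org-Id header when a workspace ID is present."""
    org_id = parse_org_id(http_path)
    return {"X-Databricks-Org-Id": org_id} if org_id else {}


if __name__ == "__main__":
    print(org_id_header("/sql/1.0/warehouses/abc123?o=6436897454825492"))
    print(org_id_header("/sql/1.0/warehouses/abc123"))
```

Returning an empty dict for paths without `?o=` keeps the legacy-host case a no-op for callers that merge the result into their existing headers.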
Opt-in via the dependency ceiling bumps already landed: requires
`databricks-sql-connector >= 4.2.6` and `databricks-sdk >= 0.104.0` for
the SPOG code path to activate. Pre-SPOG dep versions continue to work
unchanged on legacy hosts.
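One way to picture the opt-in gate is signature-based feature detection: pass `workspace_id` only when the installed config class declares it. The sketch below uses two stand-in classes (`LegacyConfig`, `SpogConfig`) and hypothetical helper names; it does not reproduce the SDK's real `Config` or this PR's helpers, and it assumes support shows up as a named constructor parameter.

```python
import inspect
from typing import Any, Dict, Optional


class LegacyConfig:
    """Hypothetical stand-in for a config class that predates workspace_id."""

    def __init__(self, host: str, token: Optional[str] = None) -> None:
        self.host = host
        self.token = token


class SpogConfig:
    """Hypothetical stand-in for a config class that accepts workspace_id."""

    def __init__(
        self,
        host: str,
        token: Optional[str] = None,
        workspace_id: Optional[str] = None,
    ) -> None:
        self.host = host
        self.token = token
        self.workspace_id = workspace_id


def accepts_workspace_id(config_cls: type) -> bool:
    """Feature-detect by inspecting the constructor signature.

    Assumption: support is visible as a named parameter. A class that only
    takes **kwargs would need a different check.
    """
    return "workspace_id" in inspect.signature(config_cls).parameters


def gated_kwargs(
    config_cls: type, workspace_id: Optional[str], **base: Any
) -> Dict[str, Any]:
    """Add workspace_id to the kwargs only when the class can take it."""
    kwargs = dict(base)
    if workspace_id is not None and accepts_workspace_id(config_cls):
        kwargs["workspace_id"] = workspace_id
    return kwargs


if __name__ == "__main__":
    # The legacy class is constructed without the extra argument.
    LegacyConfig(**gated_kwargs(LegacyConfig, "42", host="example"))
    cfg = SpogConfig(**gated_kwargs(SpogConfig, "42", host="example"))
    print(cfg.workspace_id)
```

Detecting the parameter rather than comparing version strings means a fork or wrapper is judged by what it actually accepts.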
- `extract.py` — pure parser; pulls ?o=<workspace-id> from http_path.
- `capabilities.py`: runtime detection of SPOG support via
`connector_supports_spog` (version detection with packaging.version.Version)
and `sdk_supports_workspace_id` (feature detection via
inspect.signature(Config), so forks/wrappers report correctly).
- `probe.py`: one-shot per-host probe of /.well-known/databricks-config with
3-attempt exponential backoff; on exhaustion it returns
HostMetadata(host_type=None) and logs a WARN. Probe failure is never fatal.
- `decision.py` — applies the spec §8 decision matrix at connection.open():
raises DbtConfigError on every misconfiguration row with a pointed
upgrade/fix message; returns the extracted workspace_id on the happy path.
- `credentials.py`:
  - Cluster-ID regex tightened: `(.*)` -> `([^?&]+)` so the capture stops
    at any query string (independently useful even on legacy hosts).
  - DatabricksCredentialManager gains a `workspace_id` field populated by
    `extract_workspace_id(credentials.http_path)` in create_from.
  - All five `authenticate_with_*` methods now plumb workspace_id into
    `Config(...)` via a single `_config_kwargs` helper, gated on
    `sdk_supports_workspace_id()` so old SDKs are unaffected.
- `connections.py`: `DatabricksConnectionManager.open()` collects every
http_path in play (default + per-compute) and invokes
`check_spog_preconditions(host=..., http_paths=...)` before constructing
conn_args. On legacy hosts the call is a no-op; on misconfiguration it
raises a pointed DbtConfigError.
- `impl.py`: `DatabricksAdapter.debug_query` override emits a SPOG status
block (host_type, workspace_id, dep-version suitability) before the
standard `select 1 as id`. Makes 'is SPOG working here?' a one-command
answer for support escalations.
- 35 unit tests under `tests/unit/spog/` covering every §8 matrix row,
retry/backoff math, capability detection branches, both-deps-old
ordering, HTTP-error retry fallback, and probe caching.
- 17 cross-module unit tests (workspace_id plumbing, connection.open
wiring, dbt debug block, cluster-id regex).
- 3 functional tests under `tests/functional/adapter/spog/`:
  - test_spog_debug: assert dbt debug emits the SPOG block.
  - test_spog_missing_o_raises: strip ?o=, expect the §8 row-4 error.
  - test_spog_probe_failure_fallback: simulate probe failure; expect
    WARN + run still succeeds.
  All three skip when DBT_DATABRICKS_SPOG_* env vars are absent.
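The cluster-ID regex change listed above is easy to see on a concrete path. In this sketch only the capture groups `(.*)` and `([^?&]+)` come from the PR; the `/?sql/protocolv1/o/\d+/` prefix and the sample path are assumptions made for illustration.

```python
import re
from typing import Optional, Pattern

# Assumed prefix; only the capture groups are taken from the PR description.
OLD_PATTERN = re.compile(r"/?sql/protocolv1/o/\d+/(.*)")
NEW_PATTERN = re.compile(r"/?sql/protocolv1/o/\d+/([^?&]+)")


def cluster_id(pattern: Pattern[str], http_path: str) -> Optional[str]:
    """Return the captured cluster ID, or None when the path does not match."""
    match = pattern.search(http_path)
    return match.group(1) if match else None


if __name__ == "__main__":
    path = "sql/protocolv1/o/1234567890/0123-456789-abcdefgh?o=1234567890"
    # The greedy group swallows the query string along with the ID.
    print(cluster_id(OLD_PATTERN, path))
    # The tightened group stops at the first "?" or "&".
    print(cluster_id(NEW_PATTERN, path))
```

On a path without a query string both patterns capture the same value, which is consistent with the change being safe on legacy hosts.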
`.github/workflows/spog-integration.yml` runs the SPOG functional tests
against `peco.azuredatabricks.net?o=6436897454825492` using the existing
Azure secrets (same workspace; only host + ?o= suffix differ). Forces
SPOG-capable connector and SDK pins. Triggered weekly + workflow_dispatch.
Design spec at `docs/superpowers/specs/2026-05-19-dbt-databricks-spog-design.md`;
implementation plan at `docs/superpowers/plans/2026-05-19-dbt-databricks-spog.md`;
follow-up items tracked in `.claude/ideas/spog-future-tasks.md` (gitignored,
local-only). CHANGELOG entry added under `dbt-databricks next`.
- test_python_helpers: stub Mock() credentials.http_path with a real string so extract_workspace_id() (now called in DatabricksCredentialManager.create_from) doesn't trip on "argument of type 'Mock' is not iterable".
- test_auth (TestEnsureConfigTriggersTheRightAuth): autouse-patch sdk_supports_workspace_id() to False so the auth-routing assertions stay focused on auth_type. SPOG workspace_id plumbing has its own coverage in tests/unit/spog/.
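The `Mock` failure mentioned above can be reproduced in isolation. In this sketch `has_query_string` is a hypothetical stand-in for any parser step that runs a membership test on `http_path`; it is not the PR's `extract_workspace_id`.

```python
from unittest.mock import Mock


def has_query_string(http_path) -> bool:
    """Hypothetical parser step: a membership test on http_path."""
    return "?" in http_path


creds = Mock()

# An unconfigured Mock attribute is itself a Mock, which supports neither
# __contains__ nor iteration, so the membership test raises TypeError.
try:
    has_query_string(creds.http_path)
    raised_type_error = False
except TypeError:
    raised_type_error = True

# Stubbing the attribute with a real string lets the parser run normally.
creds.http_path = "/sql/1.0/warehouses/abc123"
stubbed_result = has_query_string(creds.http_path)
```

`MagicMock` would not raise here because it preconfigures the container protocol methods, so whether a test hits this error depends on which mock class the fixture uses.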
Summary
Adds SPOG (Single Point of Gateway) support: account-level vanity hosts like `peco.azuredatabricks.net` where workspaces are disambiguated by `?o=<workspace-id>` on `http_path`. Convention matches `databricks-sql-python` (#767), `databricks-sql-go`, `databricks-jdbc`, and the ADBC Rust driver.
Builds on the dep-ceiling bumps already in `main` (#1474). Feature is opt-in via those bumps: activates only when `databricks-sql-connector ≥ 4.2.6` and `databricks-sdk ≥ 0.104.0` are installed. Pre-SPOG dep versions continue to work unchanged on legacy hosts; non-SPOG users see no behavior change.
What changes
- `dbt/adapters/databricks/spog/`:
  - `extract`: parse `?o=` (or fall back to `/o/<id>/` in cluster paths) from `http_path`.
  - `capabilities`: runtime detection via `connector_supports_spog` (PEP-440 version compare) and `sdk_supports_workspace_id` (feature-detect via `inspect.signature(Config)`).
  - `probe`: one-shot `GET /.well-known/databricks-config` per host with 3-attempt backoff; probe failure is non-fatal.
  - `decision`: applies the §8 decision matrix at `connection.open()`; raises `DbtConfigError` with a pointed upgrade/fix message on every misconfig row.
- `credentials.py`:
  - Cluster-ID regex tightened: `(.*)` → `([^?&]+)` so the capture stops at any query string (independently useful even on legacy hosts).
  - `DatabricksCredentialManager` gains a `workspace_id` field populated by `extract_workspace_id(credentials.http_path)`.
  - All five `authenticate_with_*` methods plumb `workspace_id` into `Config(...)` via a single `_config_kwargs` helper, gated on `sdk_supports_workspace_id()` so old SDKs are unaffected.
- `connections.py`: `DatabricksConnectionManager.open()` collects every `http_path` in play (default + per-compute) and invokes `check_spog_preconditions(...)` before constructing `conn_args`. No-op on legacy hosts; pointed `DbtConfigError` on misconfig.
- `impl.py`: `DatabricksAdapter.debug_query` override emits a SPOG status block (host_type, workspace_id, dep-version suitability) before the standard `select 1 as id`, which makes "is SPOG working here?" a one-command answer via `dbt debug`.
Misconfiguration handling
Each row in §8 of the design fails fast at `connection.open()` with a `DbtConfigError` naming the file/field to fix:
[Table flattened in extraction; surviving cells: `?o=` present? | `http_path` is missing `?o=<workspace-id>` | `databricks-sql-connector` / `databricks-sdk` | `?o=` from `http_path` (or fix host)]
Design doc
`docs/superpowers/specs/2026-05-19-dbt-databricks-spog-design.md` (committed in this PR) holds the full spec: background, the §8 decision matrix, all the upstream PRs referenced, and rationale for opt-in via ceiling bumps.
Test plan
- `hatch run unit tests/unit -q`: 1174 passed, 6 skipped
- `pre-commit run --all-files` passes
- `databricks_uc_sql_endpoint`, `databricks_uc_cluster`, `databricks_cluster`
- `.github/workflows/spog-integration.yml` (manual / scheduled; points at `peco.azuredatabricks.net`)
- `dbt debug` exercises the new SPOG status block on both SPOG and legacy hosts