Background
Datasets are "derived views" — virtual SQL tables built from SQL queries over workspace data. With the introduction of the databases API (PR #94), queries are now scoped via the X-Database-Id header so that default resolves to a specific database's catalog.
Current behavior
X-Database-Id is sent on every API request when a current database is set, including all /datasets endpoints (list, create, refresh, etc.). This is because ApiClient appends the header unconditionally.
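A minimal sketch of that behavior, assuming a Python-like client. The class shape, the current_database_id attribute, and the build_headers method are hypothetical; they only illustrate that the header is attached without looking at the endpoint:

```python
from typing import Dict, Optional


class ApiClient:
    """Hypothetical stand-in for the real ApiClient; all names are illustrative."""

    def __init__(self, current_database_id: Optional[str] = None) -> None:
        self.current_database_id = current_database_id

    def build_headers(self, path: str) -> Dict[str, str]:
        headers = {"Accept": "application/json"}
        # The header is added whenever a current database is set;
        # the request path is never consulted.
        if self.current_database_id is not None:
            headers["X-Database-Id"] = self.current_database_id
        return headers
```

Under this shape, a request to /datasets carries the same header as any database-scoped request.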
However, datasets appear to be workspace-scoped, not database-scoped:
- They live in the datasets catalog, addressed as datasets.main.<table_name>, separate from default.public.* used by database tables
- GET /datasets returned all 46 workspace datasets regardless of which database was set
The tension
There are two separate concerns that need to be resolved:
- Dataset listing/creation: Should /datasets be scoped to a database? Currently the server appears to ignore X-Database-Id here and returns all workspace datasets. This may be intentional (datasets are workspace-level) or unintentional.
- SQL query execution inside a dataset: If a dataset's source SQL references database tables (e.g. SELECT * FROM default.public.trips), the X-Database-Id header is needed so the query engine resolves default to the right catalog. Without it, the query would fail or hit the wrong catalog.
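One way to keep the two concerns apart on the client side is to decide per request whether the header is sent. The sketch below is an illustration under stated assumptions, not the actual implementation: the function name, the /query path standing in for SQL execution, and the path-prefix rule are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical: endpoints assumed to be workspace-scoped.
WORKSPACE_SCOPED_PREFIXES = ("/datasets",)


def headers_for(path: str, database_id: Optional[str]) -> Dict[str, str]:
    """Return the database header only for requests assumed to need it."""
    headers: Dict[str, str] = {}
    if database_id is None:
        return headers
    # Omit the header on endpoints treated as workspace-scoped.
    for prefix in WORKSPACE_SCOPED_PREFIXES:
        if path == prefix or path.startswith(prefix + "/"):
            return headers
    # Everything else (e.g. a hypothetical /query endpoint) keeps the header
    # so that default resolves to the selected database's catalog.
    headers["X-Database-Id"] = database_id
    return headers
```

Which endpoints actually execute a dataset's source SQL is not established here, so the prefix list would need to follow whatever the server's scoping turns out to be.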
What needs to happen
- […] /datasets list/create/refresh respect X-Database-Id?
- […] database_id field on the dataset record)
- […] --database flag to datasets create (similar to databases tables load)
- […] X-Database-Id on dataset endpoints to avoid unintended side effects, but still passes it through when executing the source SQL query
Related
- […] X-Database-Id header
- […] create to SQL/query-id only