
Add MetadataOperationTimeout support for SHOW commands metadata path #1417

Open
gopalldb wants to merge 7 commits into databricks:main from gopalldb:fix/metadata-timeout-show-commands

@gopalldb
Collaborator

Summary

MetadataOperationTimeout (default 300s) was enforced only for native Thrift RPCs. With UseQueryForMetadata=1 (the new default for warehouses), metadata operations had no timeout and could hang indefinitely, because they pass parentStatement=null and that yields timeoutInSeconds=0.

Root Cause

Both DatabricksSdkClient.executeStatement() and DatabricksThriftAccessor.pollTillOperationFinished() compute timeout as:

int timeoutInSeconds = (parentStatement == null) ? 0 : parentStatement.getStatement().getQueryTimeout();

Metadata calls pass parentStatement=null, so the timeout was always 0, which means no timeout at all.

Fix

When parentStatement==null and statementType==METADATA, the driver reads MetadataOperationTimeout from connectionContext instead of defaulting to 0. Metadata timeouts are reported with the OPERATION_TIMEOUT_ERROR code, matching the native Thrift path.
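
The selection rule can be sketched in isolation. This is a minimal illustration under stated assumptions, not the driver's code: `TimeoutResolutionSketch`, `ConnectionSettings`, `resolveTimeoutSeconds`, and the two-value `StatementType` enum are hypothetical stand-ins, and the parent statement's query timeout is passed as a nullable `Integer` in place of the real `parentStatement` object.

```java
// Sketch only: every name here is an illustrative stand-in, not a driver class.
final class TimeoutResolutionSketch {

    // Assumption: only the METADATA value matters for this rule.
    enum StatementType { QUERY, METADATA }

    // Hypothetical view of the connection context that exposes MetadataOperationTimeout.
    interface ConnectionSettings {
        int getMetadataOperationTimeout();
    }

    /**
     * Returns the timeout in seconds; 0 means "no timeout".
     *
     * @param parentQueryTimeout the parent statement's query timeout, or null
     *                           when the call has no parent statement
     */
    static int resolveTimeoutSeconds(
            Integer parentQueryTimeout, StatementType type, ConnectionSettings settings) {
        if (parentQueryTimeout != null) {
            // A parent statement exists, so its query timeout applies unchanged.
            return parentQueryTimeout;
        }
        if (type == StatementType.METADATA) {
            // No parent statement and a metadata call: use the metadata timeout.
            return settings.getMetadataOperationTimeout();
        }
        // No parent statement and not metadata: previous behaviour, no timeout.
        return 0;
    }

    public static void main(String[] args) {
        ConnectionSettings defaults = () -> 300; // the documented default of 300s
        System.out.println(resolveTimeoutSeconds(null, StatementType.METADATA, defaults)); // 300
        System.out.println(resolveTimeoutSeconds(null, StatementType.QUERY, defaults));    // 0
        System.out.println(resolveTimeoutSeconds(30, StatementType.METADATA, defaults));   // 30
    }
}
```

Keeping the non-metadata branch at 0 mirrors the second test in the test plan, which checks that non-metadata calls with a null parent statement still have no timeout.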

Files Changed

  • DatabricksSdkClient.java — SEA metadata path timeout
  • DatabricksThriftAccessor.java — Thrift SHOW commands path timeout (added statementType param to pollTillOperationFinished)
  • DatabricksSdkClientTest.java — 2 new tests

Test plan

  • testMetadataOperationUsesMetadataTimeout — verifies MetadataOperationTimeout=1s triggers for metadata with parentStatement=null
  • testNonMetadataWithNullParentHasNoTimeout — verifies non-metadata with parentStatement=null still has no timeout
  • Full core suite: 3297 tests, 0 failures

This pull request was AI-assisted by Isaac.

When a user sets UseQueryForMetadata=0 or TreatMetadataCatalogNameAsPattern=1
without explicitly setting UseThriftClient, force Thrift mode. These params
require native Thrift RPCs and don't work with SEA.

If the user explicitly sets UseThriftClient=0 (wants SEA), we honour that
even though the metadata params will have no effect: the user's explicit
choice takes priority.

Decision matrix:
- UseThriftClient=1 → Thrift (explicit)
- UseThriftClient=0 → SEA (explicit, metadata params ignored)
- UseQueryForMetadata=0 (no UseThriftClient) → Thrift (needs native RPCs)
- TreatMetadataCatalogNameAsPattern=1 (no UseThriftClient) → Thrift
- No metadata params (no UseThriftClient) → SAFE flag decides
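
The decision matrix above can be read as an ordered chain of checks. The following is a rough sketch, not the driver's implementation: `ClientTypeSketch`, `resolve`, and the `flagDefault` argument (standing in for whatever the flag decides) are hypothetical, parameter names are matched exactly as written here, and a UseThriftClient value other than "0" or "1" is treated as not set, which is an assumption the source does not cover.

```java
import java.util.Map;

// Sketch only: names and the Map-based parameter lookup are illustrative assumptions.
final class ClientTypeSketch {

    enum ClientType { THRIFT, SEA }

    /**
     * @param params      connection parameters as raw strings
     * @param flagDefault the client type the flag would pick when nothing forces a choice
     */
    static ClientType resolve(Map<String, String> params, ClientType flagDefault) {
        String useThrift = params.get("UseThriftClient");
        if ("1".equals(useThrift)) {
            return ClientType.THRIFT; // explicit Thrift
        }
        if ("0".equals(useThrift)) {
            return ClientType.SEA;    // explicit SEA, metadata params ignored
        }
        // Strict matching: only the literal strings "0" and "1" trigger the checks below.
        if ("0".equals(params.get("UseQueryForMetadata"))) {
            return ClientType.THRIFT; // needs native Thrift RPCs
        }
        if ("1".equals(params.get("TreatMetadataCatalogNameAsPattern"))) {
            return ClientType.THRIFT; // needs native Thrift RPCs
        }
        return flagDefault;           // nothing forces a choice
    }

    public static void main(String[] args) {
        // Explicit SEA wins even when a Thrift-only metadata param is present.
        System.out.println(resolve(
                Map.of("UseThriftClient", "0", "UseQueryForMetadata", "0"), ClientType.THRIFT));
        // A Thrift-only metadata param without UseThriftClient forces Thrift.
        System.out.println(resolve(Map.of("UseQueryForMetadata", "0"), ClientType.SEA));
    }
}
```

Checking UseThriftClient first is what makes the user's explicit choice take priority over the metadata parameters.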

13 new tests covering all combinations.

Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac
- Replace weak assertNotNull with concrete assertions using VALID_URL_2
  (explicit SEA) to prove the metadata check doesn't trigger
- Add comment explaining testNoMetadataParams_defaultBehavior is testing
  non-interference, not the default path
- Add case-sensitivity tests: "false" != "0" and "true" != "1" — our
  strict .equals() matching doesn't trigger for string booleans
- All weak tests now assert a specific client type (SEA or THRIFT),
  proving the new check either triggered or didn't

Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac
- Force-Thrift log: promoted from DEBUG to INFO when UseQueryForMetadata=0
  or TreatMetadataCatalogNameAsPattern=1 forces Thrift mode
- Explicit SEA conflict log: new INFO log when UseThriftClient=0 is set
  alongside Thrift-only metadata params, warning they will have no effect

Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac
- Rename uqm/tcp to explicitQueryForMetadata/explicitCatalogAsPattern
- Promote SEA conflict log from INFO to WARN (user likely misconfigured)

Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac
When UseQueryForMetadata=1 (default for warehouses), metadata operations
had no timeout because parentStatement=null → timeoutInSeconds=0.
Now when parentStatement is null and statementType is METADATA, the
driver reads MetadataOperationTimeout from connectionContext (default
300s), matching the native Thrift RPC behavior.

Changes:
- DatabricksSdkClient.executeStatement(): use MetadataOperationTimeout
  when parentStatement=null and statementType=METADATA
- DatabricksThriftAccessor.pollTillOperationFinished(): same fix for
  Thrift SHOW commands path (added statementType parameter)
- Use OPERATION_TIMEOUT_ERROR code for metadata timeouts (matches
  native Thrift path)
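
To illustrate how a resolved timeout might be enforced while polling, here is a small sketch. It is an assumption about the general shape of such a loop, not a copy of `pollTillOperationFinished()`: `PollWithTimeoutSketch`, `pollUntilFinished`, and `OperationTimeoutException` are hypothetical names, and the exception merely stands in for an error carrying the OPERATION_TIMEOUT_ERROR code.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Sketch only: a generic poll loop with a deadline, using hypothetical names.
final class PollWithTimeoutSketch {

    // Stand-in for an error that would carry the OPERATION_TIMEOUT_ERROR code.
    static final class OperationTimeoutException extends RuntimeException {
        OperationTimeoutException(String message) {
            super(message);
        }
    }

    /**
     * Polls until isFinished returns true.
     *
     * @param timeoutInSeconds 0 means "no timeout", matching the convention above
     */
    static void pollUntilFinished(
            BooleanSupplier isFinished, int timeoutInSeconds, long pollIntervalMillis) {
        long start = System.nanoTime();
        long timeoutNanos = TimeUnit.SECONDS.toNanos(timeoutInSeconds);
        while (!isFinished.getAsBoolean()) {
            // Enforce the deadline only when a positive timeout was resolved.
            if (timeoutInSeconds > 0 && System.nanoTime() - start >= timeoutNanos) {
                throw new OperationTimeoutException(
                        "Operation timed out after " + timeoutInSeconds + "s");
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
    }

    public static void main(String[] args) {
        try {
            // An operation that never finishes, with a 1s timeout.
            pollUntilFinished(() -> false, 1, 50);
        } catch (OperationTimeoutException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a timeout of 0 the deadline check is skipped entirely, which is the behaviour that previously let metadata operations hang.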

Tests:
- testMetadataOperationUsesMetadataTimeout: verifies 1s timeout
  triggers for metadata with parentStatement=null
- testNonMetadataWithNullParentHasNoTimeout: verifies non-metadata
  with parentStatement=null still has no timeout

Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac