4.x: Route LOCAL_SERIAL statements as LWT requests #886

Merged
dkropachev merged 1 commit into scylla-4.x from fix-local-serial-lwt-routing
May 13, 2026

@dkropachev dkropachev commented May 12, 2026

Fixes #885
Fixes: https://scylladb.atlassian.net/browse/DRIVER-615

Problem

Statements that use SERIAL or LOCAL_SERIAL consistency can be LWT requests, but the driver only selected the LWT load-balancing route when RequestRoutingType.LWT was already set. Prepared SELECTs built from a SimpleStatement with LOCAL_SERIAL could remain RequestRoutingType.REGULAR, so LOAD_BALANCING_DEFAULT_LWT_REQUEST_ROUTING_METHOD was not applied.
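
As a rough illustration of the gap described above, the sketch below models the routing decision with hypothetical stand-in types (`RoutingType`, `Consistency`, and both `usesLwtRoute…` helpers are invented for this example and are not the driver's classes). It only contrasts a rule that looks at the routing type alone with one that also looks at serial consistency.

```java
// Illustrative stand-ins only; these are not the driver's real classes.
final class LwtRouteSelectionSketch {
  enum RoutingType { REGULAR, LWT }

  enum Consistency {
    ONE(false), LOCAL_QUORUM(false), SERIAL(true), LOCAL_SERIAL(true);

    final boolean serial;

    Consistency(boolean serial) {
      this.serial = serial;
    }
  }

  // Rule described in the problem statement: only an explicit LWT routing type
  // selects the LWT route, so the consistency argument is ignored.
  static boolean usesLwtRouteBefore(RoutingType type, Consistency consistency) {
    return type == RoutingType.LWT;
  }

  // Assumed shape of the fixed rule: serial consistency also selects the LWT route.
  static boolean usesLwtRouteAfter(RoutingType type, Consistency consistency) {
    return type == RoutingType.LWT || (consistency != null && consistency.serial);
  }
}
```

With these stand-ins, a statement that stays `REGULAR` but carries `LOCAL_SERIAL` is missed by the first rule and caught by the second.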

Changes

  • Treat statements with effective SERIAL or LOCAL_SERIAL consistency as LWT requests in BasicLoadBalancingPolicy, including execution-profile defaults.
  • Mark prepared statements as RequestRoutingType.LWT when their bound-statement consistency resolves to SERIAL or LOCAL_SERIAL.
  • Keep local-consistency DC failover handling aligned with the statement's effective profile consistency.
  • Exclude down replicas from preserve-replica query plans.
  • Add unit, Scylla CCM, and statement-matrix coverage for prepared SimpleStatement LOCAL_SERIAL SELECTs.
  • Run the LWT tablet statement matrix on Scylla 2026.1 and newer.
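
The phrase "effective consistency, including execution-profile defaults" in the first bullet can be read as the fallback sketched below. This is a minimal model under assumptions: `Consistency`, `effective`, and `isLwtByConsistency` are hypothetical names, and the real driver resolves profile defaults through its own configuration API rather than a plain argument.

```java
// Illustrative stand-ins only; these are not the driver's real classes.
final class EffectiveConsistencySketch {
  enum Consistency {
    ONE(false), LOCAL_QUORUM(false), SERIAL(true), LOCAL_SERIAL(true);

    final boolean serial;

    Consistency(boolean serial) {
      this.serial = serial;
    }
  }

  // A level set on the statement wins; otherwise fall back to the profile default.
  static Consistency effective(Consistency statementLevel, Consistency profileDefault) {
    return statementLevel != null ? statementLevel : profileDefault;
  }

  // A request counts as LWT when its effective consistency is SERIAL or LOCAL_SERIAL.
  static boolean isLwtByConsistency(Consistency statementLevel, Consistency profileDefault) {
    Consistency level = effective(statementLevel, profileDefault);
    return level != null && level.serial;
  }
}
```

In this model, a statement with no consistency of its own still counts as LWT when the profile default is `LOCAL_SERIAL`, while an explicit non-serial level on the statement overrides a serial default.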

Tests

  • repo-ci fast under Java 8.0.452-amzn: passed (artifact 72e5f55e74556eb4052ab678654b18bf5edf974c047ba87da4b411c29320272b).
  • GitHub CI is running for the latest pushed commit.

@dkropachev dkropachev marked this pull request as draft May 12, 2026 22:56
@dkropachev dkropachev force-pushed the fix-local-serial-lwt-routing branch 2 times, most recently from 302b72c to 035c3ef Compare May 13, 2026 04:09
@dkropachev dkropachev marked this pull request as ready for review May 13, 2026 04:16
@dkropachev dkropachev requested a review from nikagra May 13, 2026 04:17
@dkropachev dkropachev self-assigned this May 13, 2026
@dkropachev dkropachev force-pushed the fix-local-serial-lwt-routing branch from 035c3ef to a4f78b4 Compare May 13, 2026 04:31

@nikagra nikagra left a comment

Design concern: scattered inference logic and inconsistent field checks

Thanks for the fix! The underlying problem is real — LOCAL_SERIAL statements should be routed as LWT. However, @nikagra raised concerns about the current approach of duplicating the "serial consistency → LWT" inference into every statement class's getRequestRoutingType(), and after analyzing the code, Copilot identified the following issues:

1. Inconsistent fields checked across statement types

Each class checks a different field, and it's not clear which is correct:

  • DefaultSimpleStatement checks serialConsistencyLevel
  • DefaultBoundStatement checks consistencyLevel
  • DefaultPreparedStatement checks consistencyLevelForBoundStatements

The bug scenario is a user setting LOCAL_SERIAL as the main consistency level (a linearizable read). DefaultSimpleStatement checks the wrong field for this — it checks serialConsistencyLevel, which wouldn't catch SimpleStatement.builder(...).setConsistencyLevel(LOCAL_SERIAL). Meanwhile DefaultBoundStatement checks only consistencyLevel but not serialConsistencyLevel, missing LWT writes that set serial CL explicitly.

2. null return contract change

Conversions.toPreparedStatement() now returns null instead of RequestRoutingType.REGULAR as the non-LWT fallback. This changes the semantic meaning of null from "not explicitly set" to "unknown, please infer downstream," which ripples through DefaultBatchStatement's child-scanning logic.
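
To make the difference between the two fallbacks concrete, here is a small sketch under assumptions: `RoutingType` and `resolve` are hypothetical, and the inference step only stands in for whatever the statement classes actually do. It shows why a `REGULAR` fallback hides the serial-consistency signal from later inference while a `null` fallback leaves room for it.

```java
// Illustrative stand-ins only; these are not the driver's real classes.
final class NullableRoutingTypeSketch {
  enum RoutingType { REGULAR, LWT }

  // An already determined routing type is returned unchanged.
  // Null is treated as "not determined yet" and is inferred from the consistency flag.
  static RoutingType resolve(RoutingType determined, boolean serialConsistency) {
    if (determined != null) {
      return determined;
    }
    return serialConsistency ? RoutingType.LWT : RoutingType.REGULAR;
  }
}
```

Under this reading, `resolve(RoutingType.REGULAR, true)` stays `REGULAR`, whereas `resolve(null, true)` becomes `LWT`.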

3. Complexity pushed into DefaultBatchStatement

As @nikagra pointed out, DefaultBatchStatement.getRequestRoutingType() already handles nullable child routing types. Now that children return null more often, the batch inference becomes harder to reason about — a mix of null, LWT, and REGULAR from children, some of which have done their own consistency-based inference, some of which haven't.

Alternative: centralize in BasicLoadBalancingPolicy

The only consumer that acts on getRequestRoutingType() for routing decisions is BasicLoadBalancingPolicy.getRequestRoutingMethod(). @nikagra and Copilot suggest adding the serial-consistency check there instead:

@NonNull
public RequestRoutingMethod getRequestRoutingMethod(@Nullable Request request) {
    if (request == null) {
      return RequestRoutingMethod.REGULAR;
    }
    if (request.getRequestRoutingType() == RequestRoutingType.LWT
        || isEffectivelySerial(request)) {
      return lwtRequestRoutingMethod;
    }
    return RequestRoutingMethod.REGULAR;
}

private boolean isEffectivelySerial(@NonNull Request request) {
    if (request instanceof Statement) {
      Statement<?> stmt = (Statement<?>) request;
      ConsistencyLevel cl = stmt.getConsistencyLevel();
      if (cl != null && cl.isSerial()) return true;
      ConsistencyLevel scl = stmt.getSerialConsistencyLevel();
      if (scl != null && scl.isSerial()) return true;
    }
    return false;
}

Benefits:

  • No contract changes to getRequestRoutingType() — statements keep returning what the server told us
  • Single place to reason about and test the routing rule
  • No ripple into DefaultBatchStatement
  • Both consistencyLevel and serialConsistencyLevel checked uniformly

@dkropachev
Author

Design concern: scattered inference logic and inconsistent field checks

Thanks for the fix! The underlying problem is real — LOCAL_SERIAL statements should be routed as LWT. However, @nikagra raised concerns about the current approach of duplicating the "serial consistency → LWT" inference into every statement class's getRequestRoutingType(), and after analyzing the code, Copilot identified the following issues:

1. Inconsistent fields checked across statement types

Each class checks a different field, and it's not clear which is correct:

  • DefaultSimpleStatement checks serialConsistencyLevel
  • DefaultBoundStatement checks consistencyLevel
  • DefaultPreparedStatement checks consistencyLevelForBoundStatements

The bug scenario is a user setting LOCAL_SERIAL as the main consistency level (a linearizable read). DefaultSimpleStatement checks the wrong field for this — it checks serialConsistencyLevel, which wouldn't catch SimpleStatement.builder(...).setConsistencyLevel(LOCAL_SERIAL). Meanwhile DefaultBoundStatement checks only consistencyLevel but not serialConsistencyLevel, missing LWT writes that set serial CL explicitly.

This is fixed.

2. null return contract change

Conversions.toPreparedStatement() now returns null instead of RequestRoutingType.REGULAR as the non-LWT fallback. This changes the semantic meaning of null from "not explicitly set" to "unknown, please infer downstream," which ripples through DefaultBatchStatement's child-scanning logic.

That is fine; it was actually a bug we did not catch when this feature was introduced.

3. Complexity pushed into DefaultBatchStatement

As @nikagra pointed out, DefaultBatchStatement.getRequestRoutingType() already handles nullable child routing types. Now that children return null more often, the batch inference becomes harder to reason about — a mix of null, LWT, and REGULAR from children, some of which have done their own consistency-based inference, some of which haven't.

Not really. The inference code in DefaultBatchStatement.getRequestRoutingType() was not changed: it bails out on the first LWT query and otherwise returns REGULAR, so its behavior stays the same.
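
The batch behavior described in this reply can be sketched as follows. This is an assumed simplification, not the actual DefaultBatchStatement code: `RoutingType` and `batchRoutingType` are hypothetical names, and children are passed as a plain array that may contain null entries.

```java
// Illustrative stand-ins only; these are not the driver's real classes.
final class BatchRoutingSketch {
  enum RoutingType { REGULAR, LWT }

  // Scan the children and stop at the first LWT one.
  // Null and REGULAR children are treated alike: neither makes the batch LWT.
  static RoutingType batchRoutingType(RoutingType... children) {
    for (RoutingType child : children) {
      if (child == RoutingType.LWT) {
        return RoutingType.LWT;
      }
    }
    return RoutingType.REGULAR;
  }
}
```

In this model, more null children do not change the outcome: the result depends only on whether any child reports `LWT`.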

Alternative: centralize in BasicLoadBalancingPolicy

The only consumer that acts on getRequestRoutingType() for routing decisions is BasicLoadBalancingPolicy.getRequestRoutingMethod(). @nikagra and Copilot suggest adding the serial-consistency check there instead:

@NonNull
public RequestRoutingMethod getRequestRoutingMethod(@Nullable Request request) {
    if (request == null) {
      return RequestRoutingMethod.REGULAR;
    }
    if (request.getRequestRoutingType() == RequestRoutingType.LWT
        || isEffectivelySerial(request)) {
      return lwtRequestRoutingMethod;
    }
    return RequestRoutingMethod.REGULAR;
}

private boolean isEffectivelySerial(@NonNull Request request) {
    if (request instanceof Statement) {
      Statement<?> stmt = (Statement<?>) request;
      ConsistencyLevel cl = stmt.getConsistencyLevel();
      if (cl != null && cl.isSerial()) return true;
      ConsistencyLevel scl = stmt.getSerialConsistencyLevel();
      if (scl != null && scl.isSerial()) return true;
    }
    return false;
}

Benefits:

  • No contract changes to getRequestRoutingType() — statements keep returning what the server told us
  • Single place to reason about and test the routing rule
  • No ripple into DefaultBatchStatement
  • Both consistencyLevel and serialConsistencyLevel checked uniformly

Pushing this solution into BasicLoadBalancingPolicy would make the driver recalculate requestRoutingType on every iteration. The whole idea of having requestRoutingType in the statement is that it is resolved before the statement reaches the LBP.
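
The design preference stated here, resolving the routing type once on the statement so the policy only reads it, might look roughly like the sketch below. All names (`Stmt`, `usesLwtRoute`, the boolean constructor argument) are hypothetical and chosen only to show the shape of the idea.

```java
// Illustrative stand-ins only; these are not the driver's real classes.
final class ResolvedOnceSketch {
  enum RoutingType { REGULAR, LWT }

  // Immutable statement: the routing type is computed once, at construction.
  static final class Stmt {
    private final RoutingType routingType;

    Stmt(boolean serialConsistency) {
      this.routingType = serialConsistency ? RoutingType.LWT : RoutingType.REGULAR;
    }

    RoutingType getRequestRoutingType() {
      return routingType;
    }
  }

  // The policy side only reads the precomputed value on each call.
  static boolean usesLwtRoute(Stmt statement) {
    return statement.getRequestRoutingType() == RoutingType.LWT;
  }
}
```

The policy-side check stays a single field read no matter how often it runs for the same statement.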

@dkropachev dkropachev force-pushed the fix-local-serial-lwt-routing branch from 857526f to 7c4bdfb Compare May 13, 2026 17:16

@nikagra nikagra left a comment

👏 Nice fix! Thorough test coverage too 💪 Approving! ✅

@dkropachev dkropachev force-pushed the fix-local-serial-lwt-routing branch 3 times, most recently from 7e0c5e3 to 935f2dd Compare May 13, 2026 20:10
The driver did not consider SERIAL SELECTs to be LWT requests and therefore routed them
as regular queries, causing LWT congestion.
The fix is to take consistency into account when RequestRoutingMethod is calculated.
@dkropachev dkropachev force-pushed the fix-local-serial-lwt-routing branch from 935f2dd to c5720fa Compare May 13, 2026 20:19
@dkropachev dkropachev merged commit 0ae5b7a into scylla-4.x May 13, 2026
24 checks passed
@dkropachev dkropachev deleted the fix-local-serial-lwt-routing branch May 13, 2026 22:08

Development

Successfully merging this pull request may close these issues.

4.x: Driver does not route LOCAL_SERIAL statements as LWT requests

2 participants