4.x: Route LOCAL_SERIAL statements as LWT requests#886
Conversation
302b72c to
035c3ef
Compare
035c3ef to
a4f78b4
Compare
nikagra
left a comment
There was a problem hiding this comment.
Design concern: scattered inference logic and inconsistent field checks
Thanks for the fix! The underlying problem is real — LOCAL_SERIAL statements should be routed as LWT. However, @nikagra raised concerns about the current approach of duplicating the "serial consistency → LWT" inference into every statement class's getRequestRoutingType(), and after analyzing the code, Copilot identified the following issues:
1. Inconsistent fields checked across statement types
Each class checks a different field, and it's not clear which is correct:
DefaultSimpleStatementchecksserialConsistencyLevelDefaultBoundStatementchecksconsistencyLevelDefaultPreparedStatementchecksconsistencyLevelForBoundStatements
The bug scenario is a user setting LOCAL_SERIAL as the main consistency level (a linearizable read). DefaultSimpleStatement checks the wrong field for this — it checks serialConsistencyLevel, which wouldn't catch SimpleStatement.builder(...).setConsistencyLevel(LOCAL_SERIAL). Meanwhile DefaultBoundStatement checks only consistencyLevel but not serialConsistencyLevel, missing LWT writes that set serial CL explicitly.
2. null return contract change
Conversions.toPreparedStatement() now returns null instead of RequestRoutingType.REGULAR as the non-LWT fallback. This changes the semantic meaning of null from "not explicitly set" to "unknown, please infer downstream," which ripples through DefaultBatchStatement's child-scanning logic.
3. Complexity pushed into DefaultBatchStatement
As @nikagra pointed out, DefaultBatchStatement.getRequestRoutingType() already handles nullable child routing types. Now that children return null more often, the batch inference becomes harder to reason about — a mix of null, LWT, and REGULAR from children, some of which have done their own consistency-based inference, some of which haven't.
Alternative: centralize in BasicLoadBalancingPolicy
The only consumer that acts on getRequestRoutingType() for routing decisions is BasicLoadBalancingPolicy.getRequestRoutingMethod(). @nikagra and Copilot suggest adding the serial-consistency check there instead:
@NonNull
public RequestRoutingMethod getRequestRoutingMethod(@Nullable Request request) {
if (request == null) {
return RequestRoutingMethod.REGULAR;
}
if (request.getRequestRoutingType() == RequestRoutingType.LWT
|| isEffectivelySerial(request)) {
return lwtRequestRoutingMethod;
}
return RequestRoutingMethod.REGULAR;
}
private boolean isEffectivelySerial(@NonNull Request request) {
if (request instanceof Statement) {
Statement<?> stmt = (Statement<?>) request;
ConsistencyLevel cl = stmt.getConsistencyLevel();
if (cl != null && cl.isSerial()) return true;
ConsistencyLevel scl = stmt.getSerialConsistencyLevel();
if (scl != null && scl.isSerial()) return true;
}
return false;
}Benefits:
- No contract changes to
getRequestRoutingType()— statements keep returning what the server told us - Single place to reason about and test the routing rule
- No ripple into
DefaultBatchStatement - Both
consistencyLevelandserialConsistencyLevelchecked uniformly
This is fixed.
That is fine, it was actually bug we did not catch when this feature was introduced.
Not really, inference code in DefaultBatchStatement.getRequestRoutingType wasn't change, it bails out on first LWT query, if not returns REGULAR, which means that it's behavior stays the same.
Pushing this solution to BasicLoadBalancingPolicy will make driver on every iteration recalculate |
857526f to
7c4bdfb
Compare
nikagra
left a comment
There was a problem hiding this comment.
👏 Nice fix! Thorough test coverage too 💪 Approving! ✅
7e0c5e3 to
935f2dd
Compare
Driver did not consider SERIAL SELECT as LWT and therefore routed them as regular queries causing LWT congestion. Fix is to consider consistency when RequestRoutingMethod is calculated.
935f2dd to
c5720fa
Compare
Fixes #885
Fixes: https://scylladb.atlassian.net/browse/DRIVER-615
Problem
Statements that use SERIAL or LOCAL_SERIAL consistency can be LWT requests, but the driver only selected the LWT load-balancing route when RequestRoutingType.LWT was already set. Prepared SELECTs built from a SimpleStatement with LOCAL_SERIAL could remain RequestRoutingType.REGULAR, so LOAD_BALANCING_DEFAULT_LWT_REQUEST_ROUTING_METHOD was not applied.
Changes
Tests