Commit 59be21b
committed
xds: synchronize ext_proc message draining during stream termination
I identified the following synchronization issues by analyzing the potential race conditions between the thread handling external processor responses (or stream completion) and the application thread calling sendMessage().
Specifically, here is how the two major race conditions occurred during state transitions:
1. The isExtProcStreamCompleted() Check Bypass Race
In the previous implementation, sendMessage(InputStream message) started with:
java
if (passThroughMode.get() || isExtProcStreamCompleted()) {
super.sendMessage(message);
return;
}
When the external processor stream terminated (e.g., via onCompleted()), the event listener immediately called markExtProcStreamCompleted(). This updated the extProcStreamState atomic reference to COMPLETED before handleFailOpen() ran drainPendingDrainingMessages(). If an application thread called sendMessage() in that exact window, isExtProcStreamCompleted() returned true. The application thread bypassed pendingDrainingMessages and sent the message directly over the wire (super.sendMessage). Shortly after, handleFailOpen() flushed the queue, delivering previously buffered draining messages to the wire after the newer message, resulting in out-of-order (FIFO violation) message delivery.
2. The Queue Flushed / Preemption Race
If an application thread entered sendMessage() while the stream was in the DRAINING state, it evaluated:
java
if (isExtProcStreamDraining()) {
pendingDrainingMessages.add(message);
return;
}
If the application thread was preempted right before executing pendingDrainingMessages.add(message), the external processor stream could complete in the background on another thread. E.g., handleFailOpen() would run, invoke drainPendingDrainingMessages() (which found the queue empty), set passThroughMode = true, and finish. When the application thread resumed, it added its message to pendingDrainingMessages. Because drainPendingDrainingMessages() had already executed, that message sat in the queue indefinitely and was dropped.
The Solution
By wrapping the state checks and queue additions in sendMessage() with synchronized (streamLock), and similarly synchronizing drainPendingDrainingMessages() (where passThroughMode = true is updated atomically with the queue flush), we eliminate both race windows and ensure 100% robust, FIFO-ordered delivery across all stream transitions.1 parent b36e92f commit 59be21b
1 file changed
Lines changed: 18 additions & 8 deletions
Lines changed: 18 additions & 8 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1344 | 1344 | | |
1345 | 1345 | | |
1346 | 1346 | | |
1347 | | - | |
| 1347 | + | |
1348 | 1348 | | |
1349 | 1349 | | |
1350 | 1350 | | |
1351 | 1351 | | |
1352 | | - | |
1353 | | - | |
1354 | | - | |
| 1352 | + | |
| 1353 | + | |
| 1354 | + | |
| 1355 | + | |
| 1356 | + | |
| 1357 | + | |
| 1358 | + | |
| 1359 | + | |
| 1360 | + | |
| 1361 | + | |
1355 | 1362 | | |
1356 | 1363 | | |
1357 | 1364 | | |
| |||
1495 | 1502 | | |
1496 | 1503 | | |
1497 | 1504 | | |
1498 | | - | |
1499 | | - | |
1500 | | - | |
| 1505 | + | |
| 1506 | + | |
| 1507 | + | |
| 1508 | + | |
| 1509 | + | |
| 1510 | + | |
1501 | 1511 | | |
1502 | 1512 | | |
1503 | 1513 | | |
| |||
1722 | 1732 | | |
1723 | 1733 | | |
1724 | 1734 | | |
1725 | | - | |
| 1735 | + | |
1726 | 1736 | | |
1727 | 1737 | | |
1728 | 1738 | | |
| |||
0 commit comments