Testing zero-copy bugs fixes (not for merging) #1156
szetszwo wants to merge 173 commits into apache:master
…pcClientProtocolService. (apache#1026)
…he same peer as the current valid leader (apache#1024)
… released properly (apache#1023)
private final Condition notEmpty = lock.newCondition();

private boolean closed = false;
There is already a lock.
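
One possible reading of this comment is that state such as `closed` can simply be guarded by the lock that already backs `notEmpty`, with no second lock or extra synchronization. The sketch below illustrates that pattern only. `ClosableQueue` and its methods are hypothetical names chosen for illustration and are not the class under review.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical queue for illustration; not the class under review. */
final class ClosableQueue<T> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition notEmpty = lock.newCondition();
  private final Queue<T> queue = new ArrayDeque<>();
  // Read and written only while holding lock, so a plain field is enough.
  private boolean closed = false;

  /** Adds an element; returns false if the queue is already closed. */
  boolean offer(T element) {
    lock.lock();
    try {
      if (closed) {
        return false; // no new elements after close()
      }
      queue.add(element);
      notEmpty.signal();
      return true;
    } finally {
      lock.unlock();
    }
  }

  /** Waits for an element; returns null once the queue is closed and drained. */
  T take() {
    lock.lock();
    try {
      while (queue.isEmpty() && !closed) {
        notEmpty.awaitUninterruptibly();
      }
      return queue.poll();
    } finally {
      lock.unlock();
    }
  }

  /** Marks the queue closed and wakes up every waiting taker. */
  void close() {
    lock.lock();
    try {
      closed = true;
      notEmpty.signalAll();
    } finally {
      lock.unlock();
    }
  }
}
```

Because every access to `closed` happens between `lock.lock()` and `lock.unlock()`, the check in `offer(..)` and the update in `close()` cannot interleave, which is also what prevents new elements from being added after the close.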
Finally, it is able to pass all the tests (with a few retries). Note that there are probably some other zero copy bugs. Will fix them separately.
This can pass all the tests (with a few retries). Since this change is quite big (56kB) and non-trivial, I will split it into a few JIRAs:
I will see if (2) and (3) need to be further split. BTW, we should move …
This PR has been marked as stale due to 60 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in ~30 days.
Thank you for your contribution. This PR is being closed due to inactivity. Please contact a maintainer if you would like to reopen it. |
This fix will be split into multiple JIRAs: RATIS-2164, RATIS-2151, RATIS-2173
The following are the bugs found so far:
- LeakDetector: asserted allLeaks is non-empty but printed "allLeaks.size = 0"
- … retain. Without calling retain at all, it is not a leak.
- SimpleTracing and AdvancedTracing: the methods should be synchronized.
- AdvancedTracing should have a single track list instead of retainsTraces and releaseTraces.
- GrpcClientProtocolService.UnorderedRequestStreamObserver.processClientRequest(..) should use try-finally.
- GrpcLogAppender.appendLog(..) calls release() incorrectly for exception.
- LogAppenderDefault.sendAppendEntriesWithRetries(..) calls release() incorrectly for exception.
- LogSegment cache can release an entry multiple times.
- LogSegment.loadCache(..) should call retain() for cache hit.
- SegmentedRaftLog.retainLog(..): between getting the entry and calling retain(), the entry can be released. The "fail to retain" exception, if there is any, can be ignored since it is the same as a cache miss. See RATIS-2159. TestRaftWithSimulatedRpc could "fail to retain". #1153
- SegmentedRaftLog.retainEntryWithData(..) should release for exception.
- SimpleStateMachine4Testing can be released.
- LogSegment: New entries can be added after EntryCache is closed.
- MemoryRaftLog has similar problems as in SegmentedRaftLog.
- SegmentedRaftLogWorker should clean up unfinished tasks in the queue after stopped running.
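
Several of the items above share a pattern: release in a finally block, release the cache's reference only once, and treat a failed retain as a cache miss. The following is a minimal sketch of that pattern under stated assumptions. `Counted`, `EntryCacheSketch`, `tryRetain()` and `with(..)` are hypothetical names invented for illustration. They are not the Ratis classes or their API, and this is not the fix applied in this PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical reference-counted holder; not the Ratis implementation. */
final class Counted<T> {
  private final T value;
  // The creator (here, the cache) holds the initial reference.
  private final AtomicInteger count = new AtomicInteger(1);

  Counted(T value) {
    this.value = value;
  }

  /** Returns false if the last reference was already released. */
  boolean tryRetain() {
    for (;;) {
      final int c = count.get();
      if (c <= 0) {
        return false; // too late: the value has been freed
      }
      if (count.compareAndSet(c, c + 1)) {
        return true;
      }
    }
  }

  /** Returns true if this call released the last reference. */
  boolean release() {
    final int c = count.decrementAndGet();
    if (c < 0) {
      throw new IllegalStateException("released too many times");
    }
    return c == 0;
  }

  T get() {
    if (count.get() <= 0) {
      throw new IllegalStateException("already released");
    }
    return value;
  }
}

/** Hypothetical cache showing retain-on-hit and release-once-on-evict. */
final class EntryCacheSketch {
  private final Map<Long, Counted<String>> cache = new ConcurrentHashMap<>();

  void put(long index, String entry) {
    cache.put(index, new Counted<>(entry));
  }

  /** remove(..) hands the entry to one caller only, so the cache releases once. */
  void evict(long index) {
    final Counted<String> removed = cache.remove(index);
    if (removed != null) {
      removed.release();
    }
  }

  /** Retains on a hit; an entry released in between is reported as a miss. */
  Optional<Counted<String>> retain(long index) {
    final Counted<String> c = cache.get(index);
    return c != null && c.tryRetain() ? Optional.of(c) : Optional.empty();
  }

  /** Applies f to a retained entry and releases it on every exit path. */
  <R> Optional<R> with(long index, Function<String, R> f) {
    final Optional<Counted<String>> retained = retain(index);
    if (!retained.isPresent()) {
      return Optional.empty(); // same handling as a cache miss
    }
    final Counted<String> c = retained.get();
    try {
      return Optional.ofNullable(f.apply(c.get()));
    } finally {
      c.release(); // runs for both normal return and exception
    }
  }
}
```

In this sketch, a caller that loses the race with `evict(..)` gets an empty `Optional` instead of an exception, and a caller that wins the race keeps the entry alive until its own `release()`.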