Refactor to use libquic instead of oxenmq for SN connections #15
Open
jagerman wants to merge 7 commits into session-foundation:dev from
Using OMQ caused scalability problems, where we had to run several
OxenMQ instances and load balance across them because of the overhead of
going back to poll for incoming sockets across thousands of file
descriptors.
This switches the code to use QUIC instead of OMQ, and along the way
fixes some other issues and makes other improvements:
- Bumps to libpqxx 8.x RC, as it gets rid of a lot of the code that was
  being used to convert values (now built-in with libpqxx).
- Initial loading of subscribers from the database is now considerably
  faster, and no longer crashes:
  - Row data is now loaded in parallel using multiple threads, and only
    the final transfer into data structures (where deduplication
    happens, and thus needs to be single-threaded) is in the main
    thread. This improves startup time by roughly 3-4x.
  - Fixed a crash that could be caused by a race condition where the
    loading code threw an exception about expired entries if the entry
    expired between the time the pre-loading database clean happened and
    the time the entry was loaded.
- Adds a workaround that does a full scan on SN composition changes
  rather than taking the "optimized" path because that optimized path is
  many times slower.
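As a rough illustration of the loading change described above (not the actual SPNS code, which is C++), the pattern is: decode rows on worker threads, then merge into the deduplicated structure on a single thread. The row layout, the names `parse_row`, `load_chunk` and `load_subscribers`, and the choice to skip expired rows instead of raising are all assumptions made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def parse_row(row):
    # Hypothetical row layout: (account, service, expiry-as-text).
    account, service, expiry = row
    return (account.lower(), service, int(expiry))


def load_chunk(rows, now):
    # Worker-thread step: decode rows and skip (rather than raise on)
    # entries that expired after the pre-load database clean.
    return [r for r in map(parse_row, rows) if r[2] > now]


def load_subscribers(chunks, now, workers=4):
    # Decoding runs in parallel; the merge below runs only on the calling
    # thread because deduplication mutates one shared dict.
    subscribers = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for parsed in pool.map(partial(load_chunk, now=now), chunks):
            for account, service, expiry in parsed:
                key = (account, service)
                # Keep the latest expiry when the same key appears twice.
                if expiry > subscribers.get(key, 0):
                    subscribers[key] = expiry
    return subscribers
```

Only the per-row decoding is parallelised here; the merge stays sequential, which mirrors the single-threaded deduplication step mentioned in the description.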
This will allow subscribe/unsubscribe requests to be submitted over Session Router in the future.
It was rather awkward to have to load the main SPNS via Python; this rewrites it to create a first-class `SPNS` binary instead, using an embedded Python interpreter for the one small part (initial config parsing) that still relies on Python (for parsing and sharing config with notifiers).
Also:
- removes the old firebase and older fcm notifier (firebase notification is now C++ and lives in its own repo)
- installs the (new) binary, library, and pybind module, rather than dumping them in the top-level project. The binary + library are installed under CMAKE_INSTALL_PREFIX, and the pybind module goes into the python user site library location (`~/.local/lib/python3.x/site-packages/spns/core.cpython-....so`)
This was waiting inside the loop, which blocked the loop and prevented the notifiers we were waiting for from actually being able to register. This fixes it by waiting outside the loop instead.
Some SNs seem to struggle with a single request for 5000; reducing it ought to make it a little easier on the network without being *too* slow to get up and running with initial subs.
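A minimal sketch of the idea of splitting one large request into smaller ones, assuming the initial subscriptions are held in a list. The helper name `batched` and the batch sizes used below are illustrative only; the commit message does not state the new request size.

```python
def batched(items, size):
    # Yield consecutive slices of at most `size` items each.
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each yielded slice would then be sent as its own request, trading a few extra round trips for lighter individual requests.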