(Improvements, need help testing) Rune related and span-ish optimizations #1247
ations and optimized rune operations
|
Personally, most of the changes seem to make sense, but I would argue that the Auth part becomes way too complex with these changes. I am also not sure how often that code even runs; I suppose it runs once per connection, so it is probably not much of a hot path. The other parts do make sense, especially the ReaderWriter optimizations, as those run for each query. Did you happen to run the benchmarks against this to see what actual difference it makes to performance? |
|
Unfortunately, I haven't gotten to running benchmarks yet. However, the changes resulted in a significant reduction of CPU time and allocations in application performance profiling runs. I will try to perform more thorough benchmarks and correctness tests soon, when I have some free time. |
|
Also, I agree that the auth part is a case of over-optimization and can be omitted. I just applied the same change pattern to everything I noticed that allocates temporary buffers. So optimizations for things which run once per session/connection aren't necessary. |
|
Upd: I have run the Perf project I found in the solution (I don't know how representative it is). And yeah, the speed difference is pretty negligible; however, the reduction in allocations can be clearly observed. Setup: Firebird 3; the disk is an OEM Samsung NVMe 2 TB (PM9A1, aka OEM 980 Pro); 32 GB of DDR4 RAM @ 3200 MT/s JEDEC, dual channel of course. |
…ed static, breaking tests)
|
Upd2: Ran the tests with Firebird 3 (not embedded), so it does need further testing with other versions (especially embedded and batch operations in modern Firebird). There was an issue with boolean reading due to _smallbuffer being used both for reading the useful bytes and for the padding (which doesn't affect types that don't get padded). Also, a small reduction in test run time was observed (24.1 -> 23.5 min, but without repeatability checks), and no changes in the pass/failed/skipped numbers were noticed (after the fix). |
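
For readers following along, the kind of bug described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the PR's actual code: the class name `PaddedReaderSketch`, the `_scratch` field (standing in for `_smallbuffer`), and the assumption of a 1-byte boolean followed by 3 padding bytes are all illustrative.

```csharp
using System;
using System.IO;

// Hypothetical sketch: a reader that reuses one small scratch buffer.
public sealed class PaddedReaderSketch
{
    private readonly Stream _stream;
    private readonly byte[] _scratch = new byte[8]; // stand-in for a shared small buffer

    public PaddedReaderSketch(Stream stream) => _stream = stream;

    // Problematic order: the padding is read into the same buffer
    // before the value byte has been decoded, so the value is lost.
    public bool ReadBooleanClobbered()
    {
        ReadExactly(_scratch, 1); // value byte lands in _scratch[0]
        ReadExactly(_scratch, 3); // padding overwrites _scratch[0..2]
        return _scratch[0] != 0;
    }

    // One possible fix: decode the value before the buffer is reused.
    public bool ReadBooleanFixed()
    {
        ReadExactly(_scratch, 1);
        bool value = _scratch[0] != 0;
        ReadExactly(_scratch, 3); // padding can now safely reuse the buffer
        return value;
    }

    // Reads exactly 'count' bytes into the start of 'buffer'.
    private void ReadExactly(byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int read = _stream.Read(buffer, offset, count - offset);
            if (read == 0) throw new EndOfStreamException();
            offset += read;
        }
    }
}
```

Types that are not padded never trigger the second read, which would be consistent with the observation that only padded types were affected.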
|
Upd3: Performed tests with the embedded engine; all passed. |
|
Upd4: Benchmark results: it was a little tricky to actually obtain measurements that show improvements; however, some interesting observations can be made. |
I want to propose a set of changes aimed at improving performance, which I have implemented and used for some time in my (private) projects.
The main goal of these changes is to significantly reduce heap allocations by using stack allocation and the array pool, and by avoiding unnecessary allocations in the first place.
I have created a topic on the mailing list.
I would appreciate opinions and help with testing. I have been using these changes for a while without any anomalies, though I didn't run thorough tests with all versions (I am using an FB 3 server). The changes should not alter observable behavior.
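
As a rough illustration of the general pattern (not code taken from this PR), a temporary buffer can live on the stack when it is small and be rented from `ArrayPool<byte>.Shared` when it is large. The class name `TempBufferSketch`, the method `SumUtf8Bytes`, and the 256-byte threshold are assumptions made for this sketch; it also assumes a runtime that provides the span-based `Encoding.GetBytes` overload (.NET Core 2.1 or later).

```csharp
using System;
using System.Buffers;
using System.Text;

public static class TempBufferSketch
{
    // Hypothetical threshold: buffers up to this size live on the stack.
    private const int StackLimit = 256;

    // Encodes 'text' as UTF-8 into a temporary buffer and sums the bytes,
    // without allocating a fresh byte[] on every call.
    public static int SumUtf8Bytes(string text)
    {
        int maxBytes = Encoding.UTF8.GetMaxByteCount(text.Length);

        byte[] rented = null;
        Span<byte> buffer = maxBytes <= StackLimit
            ? stackalloc byte[StackLimit]                        // small: stack memory
            : (rented = ArrayPool<byte>.Shared.Rent(maxBytes));  // large: pooled array
        try
        {
            int written = Encoding.UTF8.GetBytes(text.AsSpan(), buffer);

            int sum = 0;
            foreach (byte b in buffer.Slice(0, written))
                sum += b;
            return sum;
        }
        finally
        {
            // Return the pooled array (if one was rented) so it can be reused.
            if (rented != null)
                ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

A rented array may be larger than requested and may contain stale data, so only the `written` prefix is read here.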