perf(component): Add LRU caching for class discovery methods #106
khresth wants to merge 2 commits into oracle:main
Thank you for your pull request and welcome to our community! To contribute, please sign the Oracle Contributor Agreement (OCA).
To sign the OCA, please create an Oracle account and sign the OCA in Oracle's Contributor Agreement Application. When signing the OCA, please provide your GitHub username. After signing the OCA and getting an OCA approval from Oracle, this PR will be automatically updated. If you are an Oracle employee, please make sure that you are a member of the main Oracle GitHub organization, and your membership in this organization is public.
Thank you @khresth for your contribution! Do you have any numbers showing the improvement?
I created a benchmark to measure the caching benefit. Here are the results:

- get_class_from_name() with caching: ~0.0002 ms per lookup
- With caching: ~0.0003 ms per call
- 3,000 component lookups: 0.0012 seconds total

The repeated class lookups that previously required traversing the Component hierarchy (O(n), where n is the number of classes) now complete in O(1) time via a dictionary lookup. This improves deserialization performance for complex flows with many nested components.
Hi @khresth,

Thanks a lot for your optimization suggestion. This is very much appreciated.

I see that your benchmark script does not actually evaluate deserialization, so I tried evaluating it using the commands below. I find that using the code from your branch does not seem to impact the total runtime compared to

The look-ups that you optimized seem to take only a few microseconds, so they are unlikely to improve a deserialization that takes several milliseconds. Did you try to profile any of (i) Component deserialization, (ii) JSON schema generation, or (iii) serialization context operations to find which methods actually have a significant impact on performance?

I also don't see an improvement when deserializing the same configuration three times in a row.
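Profiling the full deserialization call, as asked above, could look like the following minimal sketch. It is not the reviewer's script: `deserialize` is a hypothetical stand-in (plain `json.loads`) for whatever entry point the project actually exposes, and `profile_call` is an illustrative helper built on the standard `cProfile` and `pstats` modules.

```python
import cProfile
import io
import json
import pstats


def deserialize(payload: str) -> dict:
    """Hypothetical stand-in for the real deserialization entry point."""
    return json.loads(payload)


def profile_call(func, *args, top=10):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Sort by cumulative time so the most expensive call paths come first.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Illustrative payload only; a real run would load an actual configuration.
payload = json.dumps({"nodes": [{"name": f"node_{i}"} for i in range(1000)]})
result, report = profile_call(deserialize, payload)
print(report)
```

If the class look-ups do not show up near the top of such a report, caching them is unlikely to change the end-to-end runtime.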
Summary
Optimize Component class discovery by adding LRU caching to frequently-used methods.
Problem
Component.get_class_from_name() and Component._get_all_subclasses() in pyagentspec/src/pyagentspec/component.py each perform a BFS traversal of the class hierarchy on every call. During serialization and deserialization of complex flows, these methods are called repeatedly with the same arguments, causing unnecessary CPU overhead.
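To make the cost concrete, here is a minimal sketch of an uncached breadth-first subclass lookup. It assumes the methods behave roughly as described above; the `Component` class and the `Node`, `Flow`, and `LlmNode` subclasses are hypothetical stand-ins, not the actual pyagentspec code.

```python
from collections import deque


class Component:
    """Hypothetical stand-in for the real Component base class."""

    @classmethod
    def _get_all_subclasses(cls):
        # Breadth-first walk over the subclass tree, starting at cls.
        found = []
        seen = {cls}
        queue = deque([cls])
        while queue:
            current = queue.popleft()
            for sub in current.__subclasses__():
                if sub not in seen:
                    seen.add(sub)
                    found.append(sub)
                    queue.append(sub)
        return found

    @classmethod
    def get_class_from_name(cls, name):
        # Linear scan: every call repeats the full traversal above.
        for sub in cls._get_all_subclasses():
            if sub.__name__ == name:
                return sub
        return None


class Node(Component):
    pass


class Flow(Component):
    pass


class LlmNode(Node):
    pass


found = Component.get_class_from_name("LlmNode")
```

Each call visits every subclass once, so the work grows linearly with the number of registered classes.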
Solution
Add @functools.lru_cache to both methods.
Cache invalidation is handled automatically by clearing both caches in Component.__init_subclass__() when new Component subclasses are dynamically created (e.g., in tests or custom components).
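The approach described above can be sketched as follows. This is an illustration under assumptions, not the code from the PR: the method bodies, the decorator order, and `maxsize=None` are guesses, and `Component` and `Node` are hypothetical stand-ins.

```python
import functools
from collections import deque


class Component:
    """Hypothetical stand-in for the real Component base class."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # A newly defined subclass makes earlier results stale, so drop both caches.
        Component._get_all_subclasses.cache_clear()
        Component.get_class_from_name.cache_clear()

    @classmethod
    @functools.lru_cache(maxsize=None)
    def _get_all_subclasses(cls):
        # Breadth-first walk; a tuple is returned so cached results stay immutable.
        found = []
        queue = deque([cls])
        while queue:
            for sub in queue.popleft().__subclasses__():
                if sub not in found:
                    found.append(sub)
                    queue.append(sub)
        return tuple(found)

    @classmethod
    @functools.lru_cache(maxsize=None)
    def get_class_from_name(cls, name):
        # Cached per (cls, name) pair; None results are cached as well.
        for sub in cls._get_all_subclasses():
            if sub.__name__ == name:
                return sub
        return None


class Node(Component):
    pass


first = Component.get_class_from_name("Node")   # miss: walks the hierarchy
second = Component.get_class_from_name("Node")  # hit: served from the cache
```

Placing `@classmethod` outermost keeps `cls` in the cache key and leaves `cache_clear()` and `cache_info()` reachable through the bound method. One trade-off of this design is that the cache holds strong references to the cached classes until it is cleared.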
Testing
Performance Impact
Repeated calls to these methods now return cached results instead of traversing the class hierarchy each time, providing measurable speedup during: