
Conversation

@tac0turtle
Contributor

Description

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Refactoring

Related Issues

Fixes #(issue)

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Testing

Additional Notes

@claude

claude bot commented Dec 8, 2025

Claude finished @tac0turtle's task.


Code Review for Bytecode Caching PR

I'm reviewing this pull request for code quality, potential bugs, performance considerations, security concerns, and test coverage.

Todo List:

  • Read and analyze main caching implementation
  • Review benchmark code
  • Check integration points
  • Assess security implications
  • Evaluate test coverage
  • Provide comprehensive feedback

Critical Issues Found:

🔴 Critical Performance Issues

  1. Severe Concurrency Bottleneck (crates/ev-revm/src/cache.rs:116)

    • The BytecodeCache::get() method acquires a write lock for read operations
    • This completely negates the purpose of using RwLock and serializes all cache access
    • Under high concurrency, this will become a major performance bottleneck
    • Root cause: Custom LruCache::get() requires &mut self to update access counters
  2. Inefficient LRU Eviction (crates/ev-revm/src/cache.rs:74-75)

    • LRU eviction has O(N log N) complexity due to collecting and sorting all entries
    • For the default cache size of 10,000 entries, this will cause significant latency spikes
    • Root cause: Naive implementation that sorts the entire cache on every eviction (a drop-in fix for both issues is sketched below this list)
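
A minimal sketch of a drop-in fix for both issues, assuming the `lru` crate (type and field names here are illustrative, not this PR's actual code): lookups take one short `Mutex` critical section, and eviction is O(1) because the crate pairs its hash map with a doubly-linked recency list.

    use std::num::NonZeroUsize;
    use std::sync::Mutex;

    use alloy_primitives::B256;
    use lru::LruCache;
    use revm::primitives::Bytecode; // import path varies across revm versions

    pub struct BytecodeCache {
        // A Mutex is the honest primitive here: `LruCache::get` mutates the
        // recency list, so an RwLock read guard could never have sufficed.
        inner: Mutex<LruCache<B256, Bytecode>>,
    }

    impl BytecodeCache {
        pub fn new(capacity: NonZeroUsize) -> Self {
            Self { inner: Mutex::new(LruCache::new(capacity)) }
        }

        /// O(1) lookup that also updates recency order; a poisoned lock
        /// degrades to a cache miss instead of panicking.
        pub fn get(&self, code_hash: &B256) -> Option<Bytecode> {
            self.inner.lock().ok()?.get(code_hash).cloned()
        }

        /// O(1) insert; the least-recently-used entry is evicted in O(1).
        pub fn insert(&self, code_hash: B256, bytecode: Bytecode) {
            if let Ok(mut cache) = self.inner.lock() {
                cache.put(code_hash, bytecode);
            }
        }
    }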

⚠️ High Priority Issues

  1. Misleading PR Description

    • PR mentions "PinnedStorageCache" but this feature is completely absent from the code
    • Only BytecodeCache is implemented, not the advertised storage pinning functionality
  2. Benchmark Quality Issues (crates/ev-revm/benches/cache_benchmark.rs:146)

    • Inefficient database population loop recreates MockDatabase on every iteration
    • Could be simplified with a proper add_storage method pattern

🟡 Medium Priority Issues

  1. Missing Integration Tests

    • No integration tests specifically for caching behavior in crates/tests/
    • Only unit tests in the cache module itself
    • Should test cache behavior in realistic EVM execution scenarios
  2. Limited Error Handling

    • Cache operations assume locks will never be poisoned
    • Should consider graceful degradation when the cache is unavailable (see the sketch below this list)
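
A minimal sketch of that graceful degradation in the current custom cache (the lock-acquisition line mirrors the one visible in the diff excerpt further down; treating a poisoned lock as a miss is an assumption about the desired behavior):

    // Inside BytecodeCache::get: treat a poisoned lock as a cache miss
    // rather than panicking, so execution falls back to the underlying DB.
    let mut cache = match self.cache.write() {
        Ok(guard) => guard,
        Err(_) => return None,
    };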

✅ Positive Aspects

  • Comprehensive unit tests covering the implemented functionality
  • Clearly documented cache implementation
  • Proper integration into the payload builder with configurable cache capacity
  • Realistic benchmarks that test various cache scenarios
  • Security: No security vulnerabilities identified in the caching logic

🔧 Recommendations

  1. Replace Custom LRU Implementation

    // Consider using a production-ready crate like:
    use lru::LruCache; // or moka::sync::Cache
  2. Fix Concurrency Issues

    • Use a lock-free or reader-writer optimized cache implementation
    • Consider Arc<Mutex<LruCache>> from the lru crate, or a purpose-built concurrent cache such as moka
  3. Complete Feature Implementation

    • Either implement the missing PinnedStorageCache or update the PR description
    • Add integration tests for the caching layer
  4. Optimize Benchmark Setup

    // Instead of recreating MockDatabase each iteration, mutate it in place:
    impl MockDatabase {
        fn add_storage(&mut self, contract: Address, slot_count: usize) {
            for i in 0..slot_count {
                self.storage.insert((contract, U256::from(i)), U256::from(i * 1000));
            }
        }
    }

📊 Performance Impact Assessment

  • Positive: Bytecode caching will significantly improve performance for frequently-called contracts
  • Negative: Current LRU implementation will cause performance degradation under concurrent access
  • Recommendation: Fix concurrency issues before merging to avoid production performance problems

The caching concept is excellent and will provide substantial performance benefits, but the current implementation has critical concurrency flaws that must be addressed.

@gemini-code-assist
Contributor

Summary of Changes

Hello @tac0turtle, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the EVM execution environment by integrating sophisticated caching mechanisms. It introduces a BytecodeCache to store frequently accessed contract bytecode, utilizing an LRU strategy for efficient memory management. Additionally, a PinnedStorageCache allows specific "hot" contract storage slots to be permanently resident in RAM, drastically reducing lookup times for critical contracts. These caches are seamlessly integrated through a CachedDatabase wrapper, which is now used by the EvolvePayloadBuilder. The changes also include new benchmarks to quantify the performance improvements and enable configuration of pinned contracts directly from the chainspec.

Highlights

  • EVM Caching Layers: Introduced BytecodeCache for immutable contract bytecode and PinnedStorageCache for critical contract storage, significantly improving EVM database interaction performance.
  • CachedDatabase Wrapper: A new CachedDatabase struct integrates these caching mechanisms, providing a unified interface for underlying database operations (a sketch follows this list).
  • Configurable Pinned Contracts: The EvolvePayloadBuilder now supports configuring "hot" contracts whose storage should be RAM-pinned via the chainspec, ensuring rapid access to frequently used contract states.
  • Performance Benchmarks: Added comprehensive benchmarks (cache_benchmark.rs) to crates/ev-revm to measure and validate the performance benefits of the new caching strategies.
  • LRU Eviction for Bytecode: The BytecodeCache implements an LRU (Least Recently Used) eviction policy to manage its capacity efficiently.
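
To make the wrapper pattern concrete, here is a minimal sketch of how a CachedDatabase might intercept code_by_hash, reusing the BytecodeCache sketch from the review above (revm's Database trait paths and signatures vary across versions, so treat the imports and method signatures below as assumptions rather than this PR's code):

    use std::sync::Arc;

    use alloy_primitives::{Address, B256, U256};
    use revm::primitives::{AccountInfo, Bytecode}; // paths vary across revm versions
    use revm::Database;

    pub struct CachedDatabase<DB> {
        inner: DB,
        bytecode_cache: Arc<BytecodeCache>,
    }

    impl<DB: Database> Database for CachedDatabase<DB> {
        type Error = DB::Error;

        fn basic(&mut self, address: Address) -> Result<Option<AccountInfo>, Self::Error> {
            self.inner.basic(address)
        }

        fn code_by_hash(&mut self, code_hash: B256) -> Result<Bytecode, Self::Error> {
            // Bytecode is immutable for a given hash, so a cache hit is never stale.
            if let Some(code) = self.bytecode_cache.get(&code_hash) {
                return Ok(code);
            }
            let code = self.inner.code_by_hash(code_hash)?;
            self.bytecode_cache.insert(code_hash, code.clone());
            Ok(code)
        }

        fn storage(&mut self, address: Address, index: U256) -> Result<U256, Self::Error> {
            self.inner.storage(address, index)
        }

        fn block_hash(&mut self, number: u64) -> Result<B256, Self::Error> {
            self.inner.block_hash(number)
        }
    }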

@tac0turtle changed the title from "feat: bytecode and ram pinning" to "feat: bytecode and contract pinning" on Dec 8, 2025

@gemini-code-assist bot left a comment


Code Review

This pull request introduces two caching layers to improve EVM database performance: a BytecodeCache for immutable contract bytecode and a PinnedStorageCache for frequently accessed contract storage. The changes include the cache implementations, benchmarks to measure their impact, and integration into the node's payload builder.

My review focuses on the caching implementation in crates/ev-revm/src/cache.rs. I've identified two significant performance issues in the custom LruCache implementation:

  1. A critical issue where cache reads take a write lock, which will serialize all cache access and create a major concurrency bottleneck.
  2. A high severity issue with the LRU eviction logic, which has O(N log N) complexity and will cause latency spikes.

I've recommended replacing the custom implementation with a specialized, production-ready caching crate like moka or lru to address these problems. I also have a medium severity comment on improving the code clarity in the new benchmark setup. The rest of the integration and configuration changes look good.

///
/// Returns `None` if the bytecode is not cached.
pub fn get(&self, code_hash: &B256) -> Option<Bytecode> {
    let mut cache = self.cache.write().expect("cache lock poisoned");

critical

The get method acquires a write lock on the cache by calling self.cache.write(). This will become a major performance bottleneck under concurrent access, as it serializes all cache reads. The purpose of using an RwLock is to allow concurrent reads, but this implementation prevents that.

The write lock is necessary here because LruCache::get takes &mut self to update the access counter.

To fix this, I strongly recommend replacing the custom LruCache with a specialized concurrent LRU cache crate like moka or lru. These are designed for high-throughput, thread-safe caching and would solve this issue efficiently.
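
For reference, a minimal sketch of the moka route (the moka::sync API usage below is an assumption for illustration, not this PR's code): reads need no external lock at all.

    use alloy_primitives::B256;
    use moka::sync::Cache;
    use revm::primitives::Bytecode; // import path varies across revm versions

    fn demo(code_hash: B256, bytecode: Bytecode) {
        // moka's sync cache is internally sharded for concurrent access;
        // values must be Clone, which Bytecode satisfies.
        let cache: Cache<B256, Bytecode> = Cache::new(10_000);
        cache.insert(code_hash, bytecode);
        let _hit: Option<Bytecode> = cache.get(&code_hash);
    }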

Comment on lines +73 to +75
// Collect entries sorted by access order (oldest first)
let mut entries: Vec<_> = self.entries.iter().map(|(k, (_, o))| (*k, *o)).collect();
entries.sort_by_key(|(_, order)| *order);

high

The current LRU eviction strategy has a significant performance issue. It collects all cache entries into a vector and sorts them on every eviction. This is an O(N log N) operation, where N is the number of items in the cache. For a large cache (e.g., the default of 10,000 entries), this will cause noticeable latency spikes whenever the cache becomes full.

A more efficient LRU implementation would use a data structure like a doubly-linked list in conjunction with the HashMap to track usage order, allowing for O(1) eviction.

However, the best approach would be to use a well-vetted crate like lru or moka, which provide efficient and correct LRU cache implementations. This would also resolve the locking issue in BytecodeCache::get.

Comment on lines 274 to 286
for contract in &all_contracts {
    mock_db = MockDatabase {
        bytecodes: mock_db.bytecodes,
        storage: {
            let mut s = mock_db.storage;
            for i in 0..slots_per_contract {
                s.insert((*contract, U256::from(i)), U256::from(i * 1000));
            }
            s
        },
        latency_factor,
    };
}

medium

This loop for populating the mock database storage is overly complex and inefficient. It reconstructs the MockDatabase struct on every iteration, which is hard to read and involves unnecessary moves of the bytecodes and storage hashmaps.

A cleaner approach would be to modify the with_storage method to take &mut self instead of self (perhaps renaming it to add_storage). This would allow you to simplify this block to a much more readable loop, for example:

for contract in &all_contracts {
    mock_db.add_storage(*contract, slots_per_contract);
}

@tac0turtle changed the title from "feat: bytecode and contract pinning" to "feat: bytecode caching" on Dec 8, 2025
@tac0turtle closed this on Dec 18, 2025