
Conversation

@tac0turtle
Contributor

Description

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Refactoring

Related Issues

Fixes #(issue)

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Testing

Additional Notes

@claude

claude bot commented Dec 8, 2025

Claude finished @tac0turtle's task.


Code Review for Bytecode Caching PR

I'm reviewing this pull request for code quality, potential bugs, performance considerations, security concerns, and test coverage.

Todo List:

  • Read and analyze main caching implementation
  • Review benchmark code
  • Check integration points
  • Assess security implications
  • Evaluate test coverage
  • Provide comprehensive feedback

Critical Issues Found:

🔴 Critical Performance Issues

  1. Severe Concurrency Bottleneck (crates/ev-revm/src/cache.rs:116)

    • The BytecodeCache::get() method acquires a write lock for read operations
    • This completely negates the purpose of using RwLock and serializes all cache access
    • Under high concurrency, this will become a major performance bottleneck
    • Root cause: Custom LruCache::get() requires &mut self to update access counters
  2. Inefficient LRU Eviction (crates/ev-revm/src/cache.rs:74-75)

    • LRU eviction has O(N log N) complexity due to collecting and sorting all entries
    • For the default cache size of 10,000 entries, this will cause significant latency spikes
    • Root cause: Naive implementation that sorts the entire cache on every eviction (a drop-in fix for both issues is sketched below this list)
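
A minimal sketch of a drop-in fix for both issues, assuming the `lru` crate (type and field names here are illustrative, not this PR's actual code): lookups take one short `Mutex` critical section, and eviction is O(1) because the crate pairs its hash map with a doubly-linked recency list.

    use std::num::NonZeroUsize;
    use std::sync::Mutex;

    use alloy_primitives::B256;
    use lru::LruCache;
    use revm::primitives::Bytecode; // import path varies across revm versions

    pub struct BytecodeCache {
        // A Mutex is the honest primitive here: `LruCache::get` mutates the
        // recency list, so an RwLock read guard could never have sufficed.
        inner: Mutex<LruCache<B256, Bytecode>>,
    }

    impl BytecodeCache {
        pub fn new(capacity: NonZeroUsize) -> Self {
            Self { inner: Mutex::new(LruCache::new(capacity)) }
        }

        /// O(1) lookup that also updates recency order; a poisoned lock
        /// degrades to a cache miss instead of panicking.
        pub fn get(&self, code_hash: &B256) -> Option<Bytecode> {
            self.inner.lock().ok()?.get(code_hash).cloned()
        }

        /// O(1) insert; the least-recently-used entry is evicted in O(1).
        pub fn insert(&self, code_hash: B256, bytecode: Bytecode) {
            if let Ok(mut cache) = self.inner.lock() {
                cache.put(code_hash, bytecode);
            }
        }
    }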

⚠️ High Priority Issues

  1. Misleading PR Description

    • PR mentions "PinnedStorageCache" but this feature is completely absent from the code
    • Only BytecodeCache is implemented, not the advertised storage pinning functionality
  2. Benchmark Quality Issues (crates/ev-revm/benches/cache_benchmark.rs:146)

    • Inefficient database population loop recreates MockDatabase on every iteration
    • Could be simplified with a proper add_storage method pattern

🟡 Medium Priority Issues

  1. Missing Integration Tests

    • No integration tests specifically for caching behavior in crates/tests/
    • Only unit tests in the cache module itself
    • Should test cache behavior in realistic EVM execution scenarios
  2. Limited Error Handling

    • Cache operations assume locks will never be poisoned
    • Should consider graceful degradation when the cache is unavailable (see the sketch below this list)
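
A minimal sketch of that graceful degradation in the current custom cache (the lock-acquisition line mirrors the one visible in the diff excerpt further down; treating a poisoned lock as a miss is an assumption about the desired behavior):

    // Inside BytecodeCache::get: treat a poisoned lock as a cache miss
    // rather than panicking, so execution falls back to the underlying DB.
    let mut cache = match self.cache.write() {
        Ok(guard) => guard,
        Err(_) => return None,
    };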

✅ Positive Aspects

  • Comprehensive unit tests covering the implemented functionality
  • Clearly documented cache implementation
  • Proper integration into the payload builder with configurable cache capacity
  • Realistic benchmarks that test various cache scenarios
  • Security: No security vulnerabilities identified in the caching logic

🔧 Recommendations

  1. Replace Custom LRU Implementation

    // Consider using a production-ready crate like:
    use lru::LruCache; // or moka::sync::Cache
  2. Fix Concurrency Issues

    • Use a lock-free or reader-writer optimized cache implementation
    • Consider Arc<Mutex<LruCache>> from the lru crate, or a purpose-built concurrent cache such as moka
  3. Complete Feature Implementation

    • Either implement the missing PinnedStorageCache or update the PR description
    • Add integration tests for the caching layer
  4. Optimize Benchmark Setup

    // Instead of recreating MockDatabase each iteration, mutate it in place:
    impl MockDatabase {
        fn add_storage(&mut self, contract: Address, slot_count: usize) {
            for i in 0..slot_count {
                self.storage.insert((contract, U256::from(i)), U256::from(i * 1000));
            }
        }
    }

📊 Performance Impact Assessment

  • Positive: Bytecode caching will significantly improve performance for frequently-called contracts
  • Negative: Current LRU implementation will cause performance degradation under concurrent access
  • Recommendation: Fix concurrency issues before merging to avoid production performance problems

The caching concept is excellent and will provide substantial performance benefits, but the current implementation has critical concurrency flaws that must be addressed.

@gemini-code-assist
Contributor

Summary of Changes

Hello @tac0turtle, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the EVM execution environment by integrating sophisticated caching mechanisms. It introduces a BytecodeCache to store frequently accessed contract bytecode, utilizing an LRU strategy for efficient memory management. Additionally, a PinnedStorageCache allows specific "hot" contract storage slots to be permanently resident in RAM, drastically reducing lookup times for critical contracts. These caches are seamlessly integrated through a CachedDatabase wrapper, which is now used by the EvolvePayloadBuilder. The changes also include new benchmarks to quantify the performance improvements and enable configuration of pinned contracts directly from the chainspec.

Highlights

  • EVM Caching Layers: Introduced BytecodeCache for immutable contract bytecode and PinnedStorageCache for critical contract storage, significantly improving EVM database interaction performance.
  • CachedDatabase Wrapper: A new CachedDatabase struct integrates these caching mechanisms, providing a unified interface for underlying database operations (a sketch follows this list).
  • Configurable Pinned Contracts: The EvolvePayloadBuilder now supports configuring "hot" contracts whose storage should be RAM-pinned via the chainspec, ensuring rapid access to frequently used contract states.
  • Performance Benchmarks: Added comprehensive benchmarks (cache_benchmark.rs) to crates/ev-revm to measure and validate the performance benefits of the new caching strategies.
  • LRU Eviction for Bytecode: The BytecodeCache implements an LRU (Least Recently Used) eviction policy to manage its capacity efficiently.
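
To make the wrapper pattern concrete, here is a minimal sketch of how a CachedDatabase might intercept code_by_hash, reusing the BytecodeCache sketch from the review above (revm's Database trait paths and signatures vary across versions, so treat the imports and method signatures below as assumptions rather than this PR's code):

    use std::sync::Arc;

    use alloy_primitives::{Address, B256, U256};
    use revm::primitives::{AccountInfo, Bytecode}; // paths vary across revm versions
    use revm::Database;

    pub struct CachedDatabase<DB> {
        inner: DB,
        bytecode_cache: Arc<BytecodeCache>,
    }

    impl<DB: Database> Database for CachedDatabase<DB> {
        type Error = DB::Error;

        fn basic(&mut self, address: Address) -> Result<Option<AccountInfo>, Self::Error> {
            self.inner.basic(address)
        }

        fn code_by_hash(&mut self, code_hash: B256) -> Result<Bytecode, Self::Error> {
            // Bytecode is immutable for a given hash, so a cache hit is never stale.
            if let Some(code) = self.bytecode_cache.get(&code_hash) {
                return Ok(code);
            }
            let code = self.inner.code_by_hash(code_hash)?;
            self.bytecode_cache.insert(code_hash, code.clone());
            Ok(code)
        }

        fn storage(&mut self, address: Address, index: U256) -> Result<U256, Self::Error> {
            self.inner.storage(address, index)
        }

        fn block_hash(&mut self, number: u64) -> Result<B256, Self::Error> {
            self.inner.block_hash(number)
        }
    }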

@tac0turtle changed the title from "feat: bytecode and ram pinning" to "feat: bytecode and contract pinning" on Dec 8, 2025

@gemini-code-assist bot left a comment


Code Review

This pull request introduces two caching layers to improve EVM database performance: a BytecodeCache for immutable contract bytecode and a PinnedStorageCache for frequently accessed contract storage. The changes include the cache implementations, benchmarks to measure their impact, and integration into the node's payload builder.

My review focuses on the caching implementation in crates/ev-revm/src/cache.rs. I've identified two significant performance issues in the custom LruCache implementation:

  1. A critical issue where cache reads take a write lock, which will serialize all cache access and create a major concurrency bottleneck.
  2. A high severity issue with the LRU eviction logic, which has O(N log N) complexity and will cause latency spikes.

I've recommended replacing the custom implementation with a specialized, production-ready caching crate like moka or lru to address these problems. I also have a medium severity comment on improving the code clarity in the new benchmark setup. The rest of the integration and configuration changes look good.

///
/// Returns `None` if the bytecode is not cached.
pub fn get(&self, code_hash: &B256) -> Option<Bytecode> {
    let mut cache = self.cache.write().expect("cache lock poisoned");

critical

The get method acquires a write lock on the cache by calling self.cache.write(). This will become a major performance bottleneck under concurrent access, as it serializes all cache reads. The purpose of using an RwLock is to allow concurrent reads, but this implementation prevents that.

The write lock is necessary here because LruCache::get takes &mut self to update the access counter.

To fix this, I strongly recommend replacing the custom LruCache with a specialized concurrent LRU cache crate like moka or lru. These are designed for high-throughput, thread-safe caching and would solve this issue efficiently.
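
For reference, a minimal sketch of the moka route (the moka::sync API usage below is an assumption for illustration, not this PR's code): reads need no external lock at all.

    use alloy_primitives::B256;
    use moka::sync::Cache;
    use revm::primitives::Bytecode; // import path varies across revm versions

    fn demo(code_hash: B256, bytecode: Bytecode) {
        // moka's sync cache is internally sharded for concurrent access;
        // values must be Clone, which Bytecode satisfies.
        let cache: Cache<B256, Bytecode> = Cache::new(10_000);
        cache.insert(code_hash, bytecode);
        let _hit: Option<Bytecode> = cache.get(&code_hash);
    }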

Comment on lines +73 to +75
// Collect entries sorted by access order (oldest first)
let mut entries: Vec<_> = self.entries.iter().map(|(k, (_, o))| (*k, *o)).collect();
entries.sort_by_key(|(_, order)| *order);

high

The current LRU eviction strategy has a significant performance issue. It collects all cache entries into a vector and sorts them on every eviction. This is an O(N log N) operation, where N is the number of items in the cache. For a large cache (e.g., the default of 10,000 entries), this will cause noticeable latency spikes whenever the cache becomes full.

A more efficient LRU implementation would use a data structure like a doubly-linked list in conjunction with the HashMap to track usage order, allowing for O(1) eviction.

However, the best approach would be to use a well-vetted crate like lru or moka, which provide efficient and correct LRU cache implementations. This would also resolve the locking issue in BytecodeCache::get.

Comment on lines 274 to 286
for contract in &all_contracts {
    mock_db = MockDatabase {
        bytecodes: mock_db.bytecodes,
        storage: {
            let mut s = mock_db.storage;
            for i in 0..slots_per_contract {
                s.insert((*contract, U256::from(i)), U256::from(i * 1000));
            }
            s
        },
        latency_factor,
    };
}

medium

This loop for populating the mock database storage is overly complex and inefficient. It reconstructs the MockDatabase struct on every iteration, which is hard to read and involves unnecessary moves of the bytecodes and storage hashmaps.

A cleaner approach would be to modify the with_storage method to take &mut self instead of self (perhaps renaming it to add_storage). This would allow you to simplify this block to a much more readable loop, for example:

for contract in &all_contracts {
    mock_db.add_storage(*contract, slots_per_contract);
}

@tac0turtle changed the title from "feat: bytecode and contract pinning" to "feat: bytecode caching" on Dec 8, 2025
@tac0turtle closed this on Dec 18, 2025