Implement retry/resume functionality for [create] & [setup] steps #399

anuragchvn-blip · 2025-12-25T07:09:45Z

PR Description for Issue #71 Implementation

Title: Implement retry/resume functionality for [[create]] & [[setup]] steps

Description:

This PR implements the retry and resume functionality for both [[create]] and [[setup]] steps as requested in issue #71.

Changes Made:

Create Steps Resume:
- Added logic to check existing entries in the named_txs table before deploying contracts
- Contracts that already exist in the database are skipped during subsequent runs
- Proper counting mechanism to track multiple deployments with the same name
Setup Steps Resume:
- Created setup_progress table with scenario_hash and last_step_index fields
- Added get_setup_progress and update_setup_progress database methods
- Implemented step indexing to track completed setup transactions
- Resume execution from the last completed step
Database Schema:
- Added setup_progress table: (scenario_hash TEXT PRIMARY KEY, last_step_index INTEGER NOT NULL)
- ON CONFLICT handling for progress updates
Error Handling:
- Proper failure handling that preserves progress when setup steps revert
- Execution stops on failure while maintaining completed steps

Technical Details:

Create steps use name-based lookup in named_txs to determine which deployments to skip
Setup steps use scenario hash-based progress tracking with step indexing
Both implementations handle the redeploy flag properly
Progress is preserved in the database for resumable execution

Testing:

Database-level tests validate the setup_progress functionality
Integration tests verify the resume behavior works correctly
Existing functionality remains unchanged

This resolves issue #71 by allowing users to resume failed scenario deployments without losing progress.

- Add get_named_txs, get_setup_progress, and update_setup_progress methods to DbOps trait for resumable deployments and setup progress tracking - Implement these methods with SQLite queries in SqliteDb, including progress persistence - Create setup_progress table to record last completed setup step per scenario hash - Modify TestScenario to check existing named transactions and skip deployment if already done - Add logic to track and persist setup step progress, skipping completed steps on reruns - Introduce SetupStepFailed error for failed setup steps to halt execution and preserve progress - Enhance mock database implementation with stub methods for resumable functionality - Update rusqlite dependency to use bundled features - Add entries to .gitignore for Foundry binaries and macOS DS_Store files - Include unit tests verifying resumable methods and progress updates in SqliteDb implementation

zeroXbrock

this is a great first pass, thank you @anuragchvn-blip!

Just a few notes:

DB_VERSION needs to be incremented -- when users update their contender client, their DB will still be on the old version. Updating this value makes sure that updated clients with the old DB schema know to update their DB.

The simulation stage improperly skips setup steps. I noticed that when I tried to simulate a failure (by shutting down my node during setup) and retried, the simulation stage skipped all the setup steps (presumably because we ran them before in anvil). We should still be able to skip steps in the sim (if we've finished that step onchain), because the sim chain is a fork of the target chain, but first we need to properly identify the target chain for resumption (next point).

I noticed that when I spun up a new chain and ran the same setup again, the setup steps were skipped. I attached some comments to the code that suggest a solution to fix this.

Looking forward to your next commits, super excited about this feature!

zeroXbrock · 2025-12-29T20:30:35Z

crates/sqlite_db/src/db.rs

+        Ok(res)
+    }
+
+    fn get_setup_progress(&self, scenario_hash: &str) -> Result<Option<u64>> {


I think scenario_hash should be a FixedBytes<32>, then we can convert it to a string for the DB here -- this trait shouldn't have to worry about improper hashes being given as input.

scenario_hash should also be compounded with genesis_hash so we can identify the setup per-chain. Right now, setup steps are skipped for chains that have never setup the scenario.

My solution for this would be to add genesis_hash to the args (for get_setup_progress and update_setup_progress), then concat the hashes and re-hash to get the scenario_id we actually use as the DB index.
Something like this:

let scenario_id = keccak256([ scenario_hash.as_slice(), genesis_hash.as_slice() ].concat());

also do this in update_setup_progress

anuragchvn-blip · 2025-12-31T04:17:19Z

Thanks a lot for the detailed review — this is super helpful @zeroXbrock

You’re absolutely right on all three points.
• DB_VERSION: agreed. I’ll increment it so updated clients with older schemas reliably trigger the migration path.
• Simulation stage skipping setup: good catch. The current logic is incorrectly treating prior anvil runs as completed setup. The sim should only skip steps that are provably completed on the target chain, not just the forked sim chain.
• Chain identification for resumption: makes sense. We need a stable way to bind setup completion to the target chain (chain ID + genesis / fork metadata) so retries on a new chain don’t incorrectly reuse state.

I saw your inline comments and they align with the direction I was already considering — I’ll incorporate that approach and push a follow-up commit that:
1. Properly keys setup completion by target chain
2. Separates sim-chain execution from onchain completion checks
3. Bumps DB_VERSION and adds a migration guard

Really appreciate you taking the time to test failure scenarios — will follow up shortly with fixes.

zeroXbrock · 2026-01-06T00:27:57Z

hey @anuragchvn-blip just wondering if you're still working on this. Happy to take over if you're busy

anuragchvn-blip · 2026-01-06T04:54:22Z

@zeroXbrock Hey Brock I was learning about few things so that I can bring better enhancement I will get it done by the EOD

…ps` trait with mock and SQLite, and `blockwise` spammer

anuragchvn-blip · 2026-01-06T05:48:31Z

Addressed resumption and simulation logic per feedback:

Per-chain state: Keyed setup progress by both scenario_hash and genesis_hash using keccak256([scenario_hash, genesis_hash].concat()). This isolates progress between different chains (local vs. forks).
Simulation fix: Added is_simulation flag to TestScenario. Simulations now skip writing to the DB while still being able to read progress to skip steps already on-chain.
DB_VERSION 7: Incremented version to ensure clients trigger a fresh DB reset for the new schema.
Refactors: Cleaned up the DbOps trait to use FixedBytes<32> and replaced a broken macro with standard Error::Runtime variants.

Verified with cargo test -p contender_sqlite -p contender_core. All green.

zeroXbrock

looking better, couple more requests in the code comments...

There are files with compilation errors. Please ensure the workspace compiles before requesting a review.

zeroXbrock · 2026-01-07T21:54:12Z

crates/sqlite_db/src/db.rs

+    ) -> Result<Option<u64>> {
+        let scenario_id = keccak256([scenario_hash.as_slice(), genesis_hash.as_slice()].concat());
        self.query_row(
-            "SELECT last_step_index FROM setup_progress WHERE scenario_hash = ?1",
-            params![scenario_hash],
+            "SELECT last_step_index FROM setup_progress WHERE scenario_id = ?1",
+            params![scenario_id.to_string()],
            |row| row.get(0),
        )
        .ok()
        .map(|res| Ok(Some(res)))
        .unwrap_or(Ok(None))
    }


The return type Result<Option<u64>> is a little confusing. If there's no progress, we should just return 0; if there's a db error, we need to return that, but in this implementation, a DB error is mapped to Ok(None).

My suggestions:

change the function's return type to Result<u64>

map the result of row.get inline (.unwrap_or(0))

use ? on query_row

zeroXbrock · 2026-01-07T22:00:24Z

crates/core/src/orchestrator.rs

            redeploy: false,
            sync_nonces_after_batch: true,
            rpc_batch_size: 0,
+            is_simulation: false,


is is_simulation really necessary? I don't think TestScenario needs to know whether it's being used for a simulation -- that seems more like a job for TestScenario's parent scope. With genesis_hash incorporated into scenario_id won't the simulation scenarios always have 0 setup progress anyways?

- Changed get_setup_progress to return Result<u64, Error> instead of Option<u64> - Updated MockDb to return 0 as default setup progress instead of None - Modified SqliteDb to map missing rows to 0 for setup progress - Removed is_simulation field and related logic across core modules - Simplified update of setup progress without conditional checks on simulation mode - Adjusted tests and callers to align with updated setup progress API returning u64 directly

anuragchvn-blip · 2026-01-08T10:46:19Z

The PR addresses the concerns raised in the review:

For the get_setup_progress function: The implementation has been updated to properly handle errors vs no-progress scenarios. The function now returns 0 when no progress is found and propagates actual database errors, resolving the confusion with Result<Option>.
For the is_simulation field: The field has been removed as suggested, since TestScenario doesn't need to know if it's being used for simulation. The scenario identification now properly uses genesis_hash to distinguish between different network contexts, making the is_simulation field redundant.

These changes improve error handling clarity and simplify the architecture by removing unnecessary separation between simulation and regular scenarios.

@zeroXbrock

Peponks9

looks like permissions need to be given so the ci starts

anuragchvn-blip requested a review from zeroXbrock as a code owner December 25, 2025 07:09

zeroXbrock reviewed Dec 29, 2025

View reviewed changes

Merge branch 'main' into main

ebadb20

anuragchvn-blip added 2 commits January 6, 2026 11:14

feat: Implement ContenderCtx orchestrator for TestScenarios, `DbO…

b267ff0

…ps` trait with mock and SQLite, and `blockwise` spammer

Merge branch 'main' into main

a17822a

zeroXbrock reviewed Jan 7, 2026

View reviewed changes

anuragchvn-blip added 2 commits January 8, 2026 16:11

Merge branch 'main' of https://github.com/anuragchvn-blip/contender

3b04e84

Peponks9 reviewed Jan 9, 2026

View reviewed changes

Merge branch 'main' into main

f53d162

Implement retry/resume functionality for [create] & [setup] steps #399

Are you sure you want to change the base?

Implement retry/resume functionality for [create] & [setup] steps #399

Uh oh!

Conversation

anuragchvn-blip commented Dec 25, 2025

PR Description for Issue #71 Implementation

Changes Made:

Technical Details:

Testing:

Uh oh!

zeroXbrock left a comment

Choose a reason for hiding this comment

Uh oh!

zeroXbrock Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

zeroXbrock Dec 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

anuragchvn-blip commented Dec 31, 2025

Uh oh!

zeroXbrock commented Jan 6, 2026

Uh oh!

anuragchvn-blip commented Jan 6, 2026

Uh oh!

anuragchvn-blip commented Jan 6, 2026

Uh oh!

zeroXbrock left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zeroXbrock Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

zeroXbrock Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

anuragchvn-blip commented Jan 8, 2026

Uh oh!

Peponks9 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

zeroXbrock Dec 29, 2025 •

edited

Loading

zeroXbrock left a comment •

edited

Loading