-
Notifications
You must be signed in to change notification settings - Fork 38
Implement retry/resume functionality for [create] & [setup] steps #399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Add get_named_txs, get_setup_progress, and update_setup_progress methods to DbOps trait for resumable deployments and setup progress tracking - Implement these methods with SQLite queries in SqliteDb, including progress persistence - Create setup_progress table to record last completed setup step per scenario hash - Modify TestScenario to check existing named transactions and skip deployment if already done - Add logic to track and persist setup step progress, skipping completed steps on reruns - Introduce SetupStepFailed error for failed setup steps to halt execution and preserve progress - Enhance mock database implementation with stub methods for resumable functionality - Update rusqlite dependency to use bundled features - Add entries to .gitignore for Foundry binaries and macOS DS_Store files - Include unit tests verifying resumable methods and progress updates in SqliteDb implementation
zeroXbrock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a great first pass, thank you @anuragchvn-blip!
Just a few notes:
DB_VERSION needs to be incremented -- when users update their contender client, their DB will still be on the old version. Updating this value makes sure that updated clients with the old DB schema know to update their DB.
The simulation stage improperly skips setup steps. I noticed that when I tried to simulate a failure (by shutting down my node during setup) and retried, the simulation stage skipped all the setup steps (presumably because we ran them before in anvil). We should still be able to skip steps in the sim (if we've finished that step onchain), because the sim chain is a fork of the target chain, but first we need to properly identify the target chain for resumption (next point).
I noticed that when I spun up a new chain and ran the same setup again, the setup steps were skipped. I attached some comments to the code that suggest a solution to fix this.
Looking forward to your next commits, super excited about this feature!
crates/sqlite_db/src/db.rs
Outdated
| Ok(res) | ||
| } | ||
|
|
||
| fn get_setup_progress(&self, scenario_hash: &str) -> Result<Option<u64>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think scenario_hash should be a FixedBytes<32>, then we can convert it to a string for the DB here -- this trait shouldn't have to worry about improper hashes being given as input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scenario_hash should also be compounded with genesis_hash so we can identify the setup per-chain. Right now, setup steps are skipped for chains that have never setup the scenario.
My solution for this would be to add genesis_hash to the args (for get_setup_progress and update_setup_progress), then concat the hashes and re-hash to get the scenario_id we actually use as the DB index.
Something like this:
let scenario_id = keccak256([
scenario_hash.as_slice(),
genesis_hash.as_slice()
].concat());also do this in update_setup_progress
|
Thanks a lot for the detailed review — this is super helpful @zeroXbrock You’re absolutely right on all three points. I saw your inline comments and they align with the direction I was already considering — I’ll incorporate that approach and push a follow-up commit that: Really appreciate you taking the time to test failure scenarios — will follow up shortly with fixes. |
|
hey @anuragchvn-blip just wondering if you're still working on this. Happy to take over if you're busy |
|
@zeroXbrock Hey Brock I was learning about few things so that I can bring better enhancement I will get it done by the EOD |
…ps` trait with mock and SQLite, and `blockwise` spammer
|
Addressed resumption and simulation logic per feedback:
Verified with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crates/sqlite_db/src/db.rs
Outdated
| ) -> Result<Option<u64>> { | ||
| let scenario_id = keccak256([scenario_hash.as_slice(), genesis_hash.as_slice()].concat()); | ||
| self.query_row( | ||
| "SELECT last_step_index FROM setup_progress WHERE scenario_hash = ?1", | ||
| params![scenario_hash], | ||
| "SELECT last_step_index FROM setup_progress WHERE scenario_id = ?1", | ||
| params![scenario_id.to_string()], | ||
| |row| row.get(0), | ||
| ) | ||
| .ok() | ||
| .map(|res| Ok(Some(res))) | ||
| .unwrap_or(Ok(None)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type Result<Option<u64>> is a little confusing. If there's no progress, we should just return 0; if there's a db error, we need to return that, but in this implementation, a DB error is mapped to Ok(None).
My suggestions:
- change the function's return type to
Result<u64> - map the result of
row.getinline (.unwrap_or(0)) - use
?onquery_row
crates/core/src/orchestrator.rs
Outdated
| redeploy: false, | ||
| sync_nonces_after_batch: true, | ||
| rpc_batch_size: 0, | ||
| is_simulation: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is is_simulation really necessary? I don't think TestScenario needs to know whether it's being used for a simulation -- that seems more like a job for TestScenario's parent scope. With genesis_hash incorporated into scenario_id won't the simulation scenarios always have 0 setup progress anyways?
- Changed get_setup_progress to return Result<u64, Error> instead of Option<u64> - Updated MockDb to return 0 as default setup progress instead of None - Modified SqliteDb to map missing rows to 0 for setup progress - Removed is_simulation field and related logic across core modules - Simplified update of setup progress without conditional checks on simulation mode - Adjusted tests and callers to align with updated setup progress API returning u64 directly
|
The PR addresses the concerns raised in the review:
These changes improve error handling clarity and simplify the architecture by removing unnecessary separation between simulation and regular scenarios. |
Peponks9
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like permissions need to be given so the ci starts

PR Description for Issue #71 Implementation
Title: Implement retry/resume functionality for [[create]] & [[setup]] steps
Description:
This PR implements the retry and resume functionality for both
[[create]]and[[setup]]steps as requested in issue #71.Changes Made:
Create Steps Resume:
named_txstable before deploying contractsSetup Steps Resume:
setup_progresstable withscenario_hashandlast_step_indexfieldsget_setup_progressandupdate_setup_progressdatabase methodsDatabase Schema:
setup_progresstable:(scenario_hash TEXT PRIMARY KEY, last_step_index INTEGER NOT NULL)Error Handling:
Technical Details:
named_txsto determine which deployments to skipTesting:
This resolves issue #71 by allowing users to resume failed scenario deployments without losing progress.