feat: add timeout for http and db requests #15
Conversation
📝 Walkthrough

Adds request-level and DB-level timeouts and switches block listing to keyset pagination: tower-http gains the `timeout` feature and a 10s `TimeoutLayer` is applied; `PgPool` sets `statement_timeout` on connect; `list_blocks` uses a cursor-based `WHERE ... ORDER BY ... LIMIT` query.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Router as Axum Router (TimeoutLayer)
    participant Handler as Block Handler
    participant DB as Postgres Pool
    Client->>Router: HTTP request
    Router->>Handler: Forward (enforces 10s timeout)
    Handler->>DB: SELECT ... WHERE number <= $cursor LIMIT $limit
    DB-->>Handler: Rows
    Handler-->>Client: HTTP response
```
```mermaid
sequenceDiagram
    participant Service as App/Indexer
    participant Migrate as run_migrations (creates migration pool)
    participant DB as Postgres
    Service->>Migrate: run_migrations(database_url)
    Migrate->>DB: connect (pool with max_connections=1)
    Migrate->>DB: run migrations
    DB-->>Migrate: migration result
    Migrate-->>Service: done
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
backend/crates/atlas-api/src/main.rs (1)
8-11: Align timeout responses with the standard API error envelope.

All handlers return errors through `ApiResult<Json<T>>`, which serializes via `ApiError::into_response()` to produce a consistent JSON envelope: `{"error": "message"}`. However, `TimeoutLayer` at lines 214-217 returns a bare 408 status code without a response body, creating inconsistency for clients expecting the standard error format. Consider wrapping with a custom error handler layer to serialize timeouts into the same envelope as other API errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/crates/atlas-api/src/main.rs` around lines 8 - 11, TimeoutLayer currently returns a bare 408 response that bypasses our ApiResult/ApiError JSON envelope; modify the timeout handling so timeouts are converted into our standard ApiError and serialized via ApiError::into_response (or wrap TimeoutLayer with a custom layer/handler) so controllers still return ApiResult<Json<T>> style errors. Locate where TimeoutLayer is added (the tower layer setup using TimeoutLayer) and replace or wrap it with a layer that intercepts timeout errors, maps them to an ApiError variant (e.g., ApiError::timeout or ApiError::new with message "request timed out"), and produces the same JSON envelope by calling ApiError::into_response before returning the response. Ensure the handler uses the existing ApiError type and serialization path so clients always receive {"error": "..."} for timeouts.
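One way to realize this suggestion is sketched below, assuming axum with tower's generic timeout middleware (rather than tower-http's, which writes the bodyless 408 itself). The function `with_timeout` and the error message are hypothetical; the real fix should map to the crate's existing `ApiError` type instead of building JSON inline.

```rust
use std::time::Duration;

use axum::{error_handling::HandleErrorLayer, http::StatusCode, BoxError, Json, Router};
use serde_json::json;
use tower::ServiceBuilder;

/// Wrap a router so timeouts surface as the standard `{"error": "..."}`
/// envelope instead of a bare 408 with no body.
fn with_timeout(router: Router) -> Router {
    router.layer(
        ServiceBuilder::new()
            // HandleErrorLayer turns the fallible timeout middleware back into
            // an infallible service, which is what axum's Router requires.
            .layer(HandleErrorLayer::new(|err: BoxError| async move {
                if err.is::<tower::timeout::error::Elapsed>() {
                    (
                        StatusCode::REQUEST_TIMEOUT,
                        Json(json!({ "error": "request timed out" })),
                    )
                } else {
                    (
                        StatusCode::INTERNAL_SERVER_ERROR,
                        Json(json!({ "error": err.to_string() })),
                    )
                }
            }))
            .layer(tower::timeout::TimeoutLayer::new(Duration::from_secs(10))),
    )
}
```

This keeps every error, including timeouts, on one serialization path; in the actual codebase the closure would construct an `ApiError` variant so `ApiError::into_response()` remains the single source of the envelope format.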
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/crates/atlas-api/src/handlers/blocks.rs`:
- Around line 22-36: The cursor calculation for keyset pagination uses
Pagination::offset() which multiplies by the raw, unclamped limit causing
skipped/duplicated pages; fix by using the clamped limit when computing the
cursor (e.g. replace cursor = (total_count - 1) - pagination.offset() with a
calculation that uses pagination.limit(), or change Pagination::offset() to
compute (page.saturating_sub(1)) * self.limit.min(100)); update the cursor
assignment in the handler that computes `cursor` (and/or the Pagination::offset
method) so both use the same clamped limit to keep pages consistent with the SQL
LIMIT binding.
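The cursor arithmetic described above can be checked in plain Rust. This is an illustration only, not the PR's code; the function names and the 100-row cap are taken from the review text as assumptions:

```rust
/// Clamp the client-supplied limit the same way the SQL `LIMIT` bind does
/// (the 100-row cap is the value cited in the review).
fn clamped_limit(raw: u64) -> u64 {
    raw.min(100)
}

/// Keyset cursor (highest block number) for a given page. Deriving the
/// offset from the *clamped* limit keeps pages aligned with the SQL query;
/// deriving it from the raw limit skips or duplicates rows.
fn cursor_for_page(total_count: u64, page: u64, raw_limit: u64) -> u64 {
    let limit = clamped_limit(raw_limit);
    let offset = page.saturating_sub(1).saturating_mul(limit);
    total_count.saturating_sub(1).saturating_sub(offset)
}

fn main() {
    // 1000 blocks numbered 0..=999; the client asks for limit=200 (capped to 100).
    // Page 1 starts at block 999; page 2 must start at 899.
    assert_eq!(cursor_for_page(1000, 1, 200), 999);
    assert_eq!(cursor_for_page(1000, 2, 200), 899);
    // An unclamped offset of (page - 1) * 200 would jump to 799, skipping 100 rows.
    println!("cursor math consistent");
}
```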
In `@backend/crates/atlas-common/src/db.rs`:
- Around line 4-15: The pool created by create_pool uses after_connect to set a
10s statement_timeout which can cause run_migrations to fail on long
index/insert operations; change the implementation so migrations run with a
pool/connection that does not set the 10s timeout or make the timeout
configurable: add an optional parameter (e.g., statement_timeout:
Option<Duration> or a bool like set_statement_timeout) to create_pool and only
install the after_connect timeout when provided, or add a new helper
create_migration_pool that omits the after_connect handler, and update
run_migrations to use the migration-specific pool/connection instead of the
timeout-bearing pool.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/Cargo.toml
- backend/crates/atlas-api/src/handlers/blocks.rs
- backend/crates/atlas-api/src/main.rs
- backend/crates/atlas-common/src/db.rs
♻️ Duplicate comments (1)
backend/crates/atlas-api/src/handlers/blocks.rs (1)
22-26: ⚠️ Potential issue | 🟡 Minor

The past critical cursor-limit mismatch is resolved; a residual metadata inconsistency at line 43 remains. The cursor now correctly derives from `pagination.limit()` (clamped to 100), which matches the SQL `LIMIT $1` binding and closes the previous critical bug. However, `PaginatedResponse::new` at line 43 still receives `pagination.limit` (raw, unclamped). If a client supplies `limit=200`, the response advertises `total_pages = ceil(total_count / 200)`, but each page is actually served with 100 rows, so the client believes only half the pages exist.

🛠️ Proposed fix
Pass the effective (clamped) limit to the response constructor so page-count metadata matches query behaviour:
```diff
 Ok(Json(PaginatedResponse::new(
     blocks,
     pagination.page,
-    pagination.limit,
+    limit as u64, // use the same clamped limit that drove the query
     total_count,
 )))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/crates/atlas-api/src/handlers/blocks.rs` around lines 22 - 26, PaginatedResponse::new is being given the raw, unclamped pagination.limit while the query uses the clamped value returned by pagination.limit(); update the call to PaginatedResponse::new to pass the effective clamped limit variable (limit) instead of the raw pagination.limit so that total_pages and other metadata reflect the actual number of rows returned; locate where PaginatedResponse::new is invoked in this handler and replace the pagination.limit argument with the local limit variable (and ensure the cursor calculation using limit and cursor remains unchanged).
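The metadata mismatch is visible from the page-count arithmetic alone. A minimal sketch (helper names are hypothetical; the 100-row cap comes from the review):

```rust
const MAX_LIMIT: u64 = 100;

/// The limit the query actually uses, per the review's description.
fn effective_limit(requested: u64) -> u64 {
    requested.min(MAX_LIMIT)
}

/// `total_pages` as a ceiling division over whichever limit is passed in.
fn total_pages(total_count: u64, limit: u64) -> u64 {
    total_count.div_ceil(limit.max(1))
}

fn main() {
    // 1000 rows, client sends limit=200. The query serves 100-row pages,
    // so there are really 10 pages; metadata built from the raw limit says 5.
    assert_eq!(total_pages(1000, effective_limit(200)), 10);
    assert_eq!(total_pages(1000, 200), 5);
    println!("metadata matches the query only with the clamped limit");
}
```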
🧹 Nitpick comments (2)
backend/crates/atlas-common/src/db.rs (2)
22-31: The past timeout-blocking-migrations issue is now properly addressed.

The dedicated no-timeout pool with `max_connections = 1` correctly isolates migration execution from the `statement_timeout` applied to the main pool. The design is sound.

One note: the migration `pool` is dropped implicitly at the end of the function. In sqlx, dropping a `PgPool` without calling `pool.close().await` means in-flight connections are abandoned rather than gracefully closed. For a short-lived single-connection migration pool this is low-risk, but an explicit close is cleaner.

♻️ Suggested explicit pool close
```diff
 pub async fn run_migrations(database_url: &str) -> Result<(), sqlx::Error> {
     let pool = PgPoolOptions::new()
         .max_connections(1)
         .connect(database_url)
         .await?;
     sqlx::migrate!("../../migrations")
         .run(&pool)
         .await
         .map_err(|e| sqlx::Error::Migrate(Box::new(e)))?;
+    pool.close().await;
     Ok(())
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/crates/atlas-common/src/db.rs` around lines 22 - 31, The run_migrations function currently relies on implicit drop of the migration PgPool (variable pool) which can abandon in-flight connections; after sqlx::migrate!(...).run(&pool).await completes, explicitly call pool.close().await to gracefully close the pool before returning from run_migrations (keep the existing error mapping and Result signature and leave sqlx::migrate! usage unchanged).
12-12: Make `statement_timeout` configurable for different environments.

The hardcoded `'10s'` timeout is appropriate for most cases, but environments with slower storage or heavier concurrent load may need to tune this. Consider adding a `statement_timeout` parameter (defaulting to 10s for backward compatibility), which aligns with the codebase's existing pattern of environment-based configuration for db parameters.

♻️ Suggested configurable timeout
```diff
-pub async fn create_pool(database_url: &str, max_connections: u32) -> Result<PgPool, sqlx::Error> {
+pub async fn create_pool(
+    database_url: &str,
+    max_connections: u32,
+    statement_timeout: Option<std::time::Duration>,
+) -> Result<PgPool, sqlx::Error> {
     PgPoolOptions::new()
         .max_connections(max_connections)
         .after_connect(|conn, _meta| {
             Box::pin(async move {
-                conn.execute("SET statement_timeout = '10s'").await?;
+                if let Some(timeout) = statement_timeout {
+                    let ms = timeout.as_millis();
+                    conn.execute(
+                        format!("SET statement_timeout = '{ms}ms'").as_str()
+                    ).await?;
+                }
                 Ok(())
             })
         })
         .connect(database_url)
         .await
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/crates/atlas-common/src/db.rs` at line 12, Replace the hardcoded conn.execute("SET statement_timeout = '10s'").await? with a configurable value: read a new statement_timeout setting (default "10s") from the existing DB/config env pattern and pass that value into the SET statement using the same conn.execute call (e.g., format!("SET statement_timeout = '{}'", statement_timeout)). Add the new setting to the DB config struct or function signature used when creating the connection (keeping the default for backward compatibility) and ensure parsing/validation of the value follows the project's config conventions.
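The configurable-timeout suggestion reduces to formatting a `Duration` into the `SET` statement that `after_connect` executes. A standalone sketch of just that string-building step (the helper name and the 10s default are assumptions mirroring the current hardcoded behaviour):

```rust
use std::time::Duration;

/// Build the SQL that `after_connect` would run, defaulting to 10s when no
/// override is configured.
fn statement_timeout_sql(timeout: Option<Duration>) -> String {
    let ms = timeout.unwrap_or(Duration::from_secs(10)).as_millis();
    format!("SET statement_timeout = '{ms}ms'")
}

fn main() {
    assert_eq!(statement_timeout_sql(None), "SET statement_timeout = '10000ms'");
    assert_eq!(
        statement_timeout_sql(Some(Duration::from_millis(2500))),
        "SET statement_timeout = '2500ms'"
    );
    println!("ok");
}
```

Using milliseconds avoids unit-parsing surprises: Postgres accepts a bare integer or a unit-suffixed string for `statement_timeout`, and an explicit `ms` suffix keeps the value unambiguous.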
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/crates/atlas-api/src/handlers/blocks.rs
- backend/crates/atlas-api/src/main.rs
- backend/crates/atlas-common/src/db.rs
- backend/crates/atlas-indexer/src/main.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/crates/atlas-api/src/main.rs