feat: add retry handling for transient sync failures #62
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3be518676b
suggested. Never raises, never touches the sync state.
"""
client = self.client
attempts = list(getattr(client, "op_attempts", []) or [])
Reset retry diagnostics before building each sync report
_collect_reliability_diagnostics() reads client.op_attempts wholesale, but that buffer is never cleared at the start of sync(). If the same NexaNoteSyncEngine instance runs multiple syncs, transient failures from an earlier run remain in op_attempts, so a later non-transient failure (for example, a 401 auth error) can be incorrectly reported as retryable=true with a stale transient reason. This can drive incorrect client retry behavior and misleading logs for long-lived engine instances.
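One way to address this, sketched under assumptions: clear the attempt buffer at the top of each sync() so the diagnostics only see the current run. FakeClient and SketchSyncEngine below are hypothetical stand-ins, not the real NexaNoteSyncEngine, and the attempt record shape ({"transient": ..., "reason": ...}) is assumed for illustration. Only op_attempts and the getattr pattern come from the diff.

```python
from dataclasses import dataclass, field


@dataclass
class FakeClient:
    """Hypothetical stand-in for the WebDAV client; only op_attempts mirrors the diff."""
    op_attempts: list = field(default_factory=list)


class SketchSyncEngine:
    """Hypothetical engine illustrating a per-run reset of retry diagnostics."""

    def __init__(self, client):
        self.client = client

    def sync(self, run):
        # Reset first, so attempts from earlier runs cannot leak into this report.
        self.client.op_attempts = []
        run(self.client)  # stand-in for the real network work of a sync
        return self._collect_reliability_diagnostics()

    def _collect_reliability_diagnostics(self):
        attempts = list(getattr(self.client, "op_attempts", []) or [])
        # Assumed record shape: {"transient": bool, "reason": str}
        transient = [a for a in attempts if a.get("transient")]
        return {
            "retryable": bool(transient),
            "transient_reason": transient[-1]["reason"] if transient else None,
        }
```

With the reset in place, a run that records a transient 503 followed by a run that records only a 401 on the same engine instance would report retryable only for the first run.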
Improves NexaNote sync reliability on unstable networks.
Adds:
This keeps the existing sync engine and file-based storage layout intact.
Details
- WebDAVClient._execute choke point wrapping every network op (GET, PROPFIND, PUT, MKCOL). Defaults: 3 attempts with 0.5s/1s/2s backoff. Retries only transient conditions: timeouts, connection errors, and HTTP 429/502/503/504. Auth (401/403) and 404 are never retried.
- … SyncConfig.timeout_seconds; the retry budget is configurable via SyncConfig.max_attempts/backoff_seconds.
- … retryable: true, a sanitized transient_reason, and a suggested next_retry_after_seconds. Surfaced additively on POST /sync/trigger and written to the sync log. Sync state is never corrupted.
Test plan
- transient 503 succeeds after retry (PUT and GET)
- 401/404 are not retried (single attempt)
- timeout/connection error retried then reported retryable
- … LOCAL_ONLY)
Note: developed on branch claude/dreamy-brown-9HfXu per the environment's branch requirement (task suggested feat/sync-network-reliability).
Generated by Claude Code
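The retry policy described under Details can be sketched roughly as follows. This is a minimal illustration, not the PR's actual WebDAVClient._execute: HttpError, is_transient, and execute_with_retry are hypothetical names, and the attempt record format is assumed. The defaults (3 attempts, 0.5s/1s/2s backoff) and the transient status codes (429/502/503/504) are taken from the description.

```python
import time

TRANSIENT_STATUSES = {429, 502, 503, 504}  # per the PR description


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_transient(exc):
    """Timeouts, connection errors, and selected HTTP statuses count as transient."""
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return True
    return isinstance(exc, HttpError) and exc.status in TRANSIENT_STATUSES


def execute_with_retry(op, attempts, max_attempts=3,
                       backoff_seconds=(0.5, 1.0, 2.0), sleep=time.sleep):
    """Run op(), retrying only transient failures; record each failure in attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except Exception as exc:
            transient = is_transient(exc)
            attempts.append({
                "attempt": attempt,
                "transient": transient,
                "reason": type(exc).__name__,
            })
            # Non-transient errors (e.g. 401/403/404) and the final attempt re-raise.
            if not transient or attempt == max_attempts:
                raise
            # Pick the delay for this attempt, clamping to the last configured value.
            sleep(backoff_seconds[min(attempt - 1, len(backoff_seconds) - 1)])
```

Passing sleep as a parameter is a design choice for the sketch: tests can substitute a recorder instead of waiting for real backoff delays.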