Skip to content

Evaluator /verify transient retry backoff (2, 5, 10 seconds) adds up to 17s before ERROR #32

@u2135

Description

@u2135

Summary

On unhappy paths, POST .../queries/{id}/verify may return transient 429/5xx. The SDK retries with fixed sleeps 2s + 5s + 10s = 17s total before surfacing server failure as ERROR.

Current behavior (SDK)

  • src/provably/handoff/evaluator.py:
    • _VERIFY_RETRY_BACKOFF = (2.0, 5.0, 10.0)
    • Transient statuses include 429, 500, 502, 503, 504

Impact

  • Correct for resilience, but poor UX when the service is degraded (long tail to ERROR).
  • Developer feedback: consider optimizing unhappy/healing paths.

Acceptance criteria

  • Product decision: keep conservative backoff vs exponential backoff with jitter vs shorter caps for eval-only path.
  • Document worst-case latency for evaluation in README or runbooks.
  • Optional: env-tunable backoff for operators.

References

  • src/provably/handoff/evaluator.py_verify_query_record, _VERIFY_RETRY_BACKOFF

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions