Pathling currently implements the Data Warehouse (recipient) side of the Argonaut $bulk-submit specification. This issue tracks implementing the Data Provider (submitter) side — enabling Pathling to push its data to a remote server's $bulk-submit endpoint.
This would complement the existing $export operation by allowing Pathling to not only export data on demand, but actively submit it to a configured Data Recipient.
Scope
The submitter implementation should handle the full submission lifecycle:
- Export data — Produce bulk export manifests (NDJSON) from Pathling's data warehouse, leveraging the existing $export infrastructure.
- Submit manifests — Send one or more $bulk-submit requests with submissionStatus: in-progress and manifestUrl to the Data Recipient.
- Mark complete — Send a $bulk-submit request with submissionStatus: complete once all manifests have been submitted.
- Poll status — Use $bulk-submit-status to monitor processing progress and retrieve results (including errors).
- Handle errors — Parse the status manifest error section and surface issues appropriately.
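As a rough illustration of the submit and mark-complete steps, the sketch below builds the FHIR Parameters bodies a submitter might send. It is written in Python for brevity and is not a proposal for the actual implementation. Only submissionStatus, manifestUrl and replacesManifestUrl come from this issue. The submitter and submissionId parameter names, the value[x] types, and the function names are assumptions to check against the specification.

```python
from typing import List, Optional

VALID_STATUSES = {"in-progress", "complete", "aborted"}


def build_bulk_submit_parameters(
    submitter_system: str,
    submitter_value: str,
    submission_id: str,
    status: str,
    manifest_url: Optional[str] = None,
    replaces_manifest_url: Optional[str] = None,
) -> dict:
    """Build the Parameters body for a single $bulk-submit request.

    Assumption: parameter names other than submissionStatus, manifestUrl and
    replacesManifestUrl, and all value[x] types, are illustrative only.
    """
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown submissionStatus: {status}")
    if replaces_manifest_url and not manifest_url:
        raise ValueError("replacesManifestUrl requires a new manifestUrl")

    parameters = [
        {
            "name": "submitter",
            "valueIdentifier": {"system": submitter_system, "value": submitter_value},
        },
        {"name": "submissionId", "valueString": submission_id},
        {"name": "submissionStatus", "valueCoding": {"code": status}},
    ]
    if manifest_url:
        parameters.append({"name": "manifestUrl", "valueUrl": manifest_url})
    if replaces_manifest_url:
        parameters.append(
            {"name": "replacesManifestUrl", "valueUrl": replaces_manifest_url}
        )
    return {"resourceType": "Parameters", "parameter": parameters}


def plan_submission(
    submitter_system: str,
    submitter_value: str,
    submission_id: str,
    manifest_urls: List[str],
) -> List[dict]:
    """Return the ordered request bodies for one submission.

    One in-progress request per manifest, followed by a final request that
    marks the submission complete and carries no manifest.
    """
    requests = [
        build_bulk_submit_parameters(
            submitter_system, submitter_value, submission_id, "in-progress", url
        )
        for url in manifest_urls
    ]
    requests.append(
        build_bulk_submit_parameters(
            submitter_system, submitter_value, submission_id, "complete"
        )
    )
    return requests
```

Each returned body would be POSTed to the Data Recipient's $bulk-submit endpoint in order. Polling $bulk-submit-status and parsing the error section are left out of this sketch.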
Additional considerations
- Authentication — Support OAuth 2.0 client credentials for authenticating with the Data Recipient's $bulk-submit endpoint (both symmetric and asymmetric/JWT).
- Manifest hosting — The submitter needs to make its exported files available at URLs the Data Recipient can fetch. This may involve serving files via an HTTP endpoint or uploading to object storage.
- Retry and resilience — Handle transient failures, HTTP 429 rate limiting (with Retry-After), and network errors during submission and polling.
- Abort — Support aborting an in-progress submission (submissionStatus: aborted).
- Replace — Support the replacesManifestUrl parameter for correcting previously submitted manifests.
- Configuration — Define configuration for target Data Recipient endpoints, submitter identity, and OAuth credentials.
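For the retry and resilience point, one possible way to choose the wait before a retry is sketched below in Python. It honours a Retry-After header when the server sends one (either delta-seconds or an HTTP-date) and otherwise falls back to capped exponential backoff with jitter. The function name, defaults and backoff policy are assumptions for illustration, not a decided design.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 60.0,
    now: Optional[datetime] = None,
) -> float:
    """Return how long to wait before retry number `attempt` (0-based).

    `retry_after` is the raw Retry-After header value, if the response had one.
    """
    if retry_after is not None:
        value = retry_after.strip()
        if value.isdigit():
            # Delta-seconds form, e.g. "120".
            return float(int(value))
        try:
            # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
            target = parsedate_to_datetime(value)
            if target.tzinfo is None:
                target = target.replace(tzinfo=timezone.utc)
            current = now or datetime.now(timezone.utc)
            return max(0.0, (target - current).total_seconds())
        except (TypeError, ValueError):
            # Unparseable header: fall through to backoff.
            pass

    # Capped exponential backoff with full jitter.
    upper = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, upper)
```

The same helper could serve both submission requests and $bulk-submit-status polling, with the caller deciding which HTTP statuses and network errors count as retryable.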