veryfi is a Python SDK for communicating with the Veryfi OCR API.
Extract structured data from receipts, invoices, bank statements, checks, W-2s, W-8s, W-9s, business cards, and more — with a single function call.
Full API reference: veryfi.github.io/veryfi-python
Veryfi API docs: docs.veryfi.com
- Installation
- Getting Started
- Supported APIs
- Error Handling
- Command-line interface
- Contributing
- Need Help?
- Changelog
- License
Install from PyPI using pip:
pip install -U veryfi
Requires Python 3.9 or later.
If you don't have a Veryfi account, register at app.veryfi.com/signup/api/.
from veryfi import Client
client = Client(
    client_id="your_client_id",
    client_secret="your_client_secret",
    username="your_username",
    api_key="your_api_key",
)
Optional constructor parameters:
| Parameter | Default | Description |
|---|---|---|
| base_url | https://api.veryfi.com/api/ | Override the API base URL |
| api_version | v8 | API version string |
| timeout | 30 | Request timeout in seconds |
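One way to keep credentials out of source code is to build the constructor arguments from environment variables. The sketch below is an illustration only and is not part of the SDK: client_kwargs_from_env is a hypothetical helper, it reuses the VERYFI_* variable names documented for the CLI later in this README, and it assumes that client_secret may be omitted and that timeout is accepted as an integer.

```python
import os
from typing import Mapping, Optional

# Hypothetical helper, not part of the veryfi package.
# Maps VERYFI_* environment variables to keyword arguments for Client(...).
REQUIRED = {
    "VERYFI_CLIENT_ID": "client_id",
    "VERYFI_USERNAME": "username",
    "VERYFI_API_KEY": "api_key",
}
OPTIONAL = {
    "VERYFI_CLIENT_SECRET": "client_secret",
    "VERYFI_BASE_URL": "base_url",
    "VERYFI_API_VERSION": "api_version",
}


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return constructor kwargs built from the given mapping (default: os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("Missing required variables: " + ", ".join(sorted(missing)))
    kwargs = {arg: env[name] for name, arg in REQUIRED.items()}
    # Optional settings are passed only when set, so the SDK defaults still apply.
    for name, arg in OPTIONAL.items():
        if env.get(name):
            kwargs[arg] = env[name]
    if env.get("VERYFI_TIMEOUT"):
        kwargs["timeout"] = int(env["VERYFI_TIMEOUT"])  # assumed to be whole seconds
    return kwargs
```

With the variables exported, the client could then be created with Client(**client_kwargs_from_env()).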
Process a receipt or invoice from a local file:
response = client.process_document(
    file_path="/tmp/receipt.jpg",
    categories=["Meals & Entertainment", "Travel"],
)
Process from a URL:
response = client.process_document_url(
    file_url="https://cdn.example.com/invoice.pdf",
    categories=["Office Supplies"],
    boost_mode=True,
    external_id="my-ref-001",
    max_pages_to_process=5,
)
The response contains the extracted fields. A typical result looks like:
{
    "id": 933760836,
    "created_date": "2024-08-15 15:56:56",
    "date": "2022-05-24 13:10:00",
    "vendor": {"name": "Walgreens", "address": "191 E 3rd Ave, San Mateo, CA 94401, US"},
    "total": 29.53,
    "subtotal": 27.60,
    "tax": 1.93,
    "currency_code": "USD",
    "category": "Personal Care",
    "payment": {"type": "visa", "card_number": "1850", "display_name": "Visa ***1850"},
    "line_items": [
        {"description": "RED BULL ENRGY DRNK CNS 8.4OZ 6PK", "total": 8.79, "quantity": 1.0},
        {"description": "COCA COLA MINICAN 7.5Z 6PK", "total": 4.99, "quantity": 1.0},
        # ...
    ],
    "status": "processed",
}
Other document operations:
# List / search documents
documents = client.get_documents(q="Walgreens", created_date__gt="2024-01-01+00:00:00")
# Get a single document by ID
document = client.get_document(document_id=933760836)
# Update fields on a document
client.update_document(
    document_id=933760836,
    vendor={"name": "Starbucks", "address": "123 Easy St, San Francisco, CA 94158"},
    category="Meals & Entertainment",
    total=11.23,
)
# Delete a document
client.delete_document(document_id=933760836)

items = client.get_line_items(document_id=933760836)
client.add_line_item(document_id=933760836, payload={"description": "Extra item", "total": 5.00})
client.update_line_item(document_id=933760836, line_item_id=101, payload={"total": 6.00})
client.delete_line_item(document_id=933760836, line_item_id=101)

client.add_tag(document_id=933760836, tag_name="reimbursable")
client.add_tags(document_id=933760836, tags=["q1", "travel"])
client.get_tags(document_id=933760836)
client.delete_tags(document_id=933760836)

response = client.split_and_process_pdf(file_path="/tmp/multi.pdf")
response = client.split_and_process_pdf_url(file_url="https://cdn.example.com/multi.pdf")
Process a bank statement and extract transactions, balances, and account details:
# From a local file
response = client.process_bank_statement_document(
    file_path="/tmp/statement.pdf",
    categories=["Transfer", "Credit Card Payments", "Restaurants / Dining / Meals"],
)
# From a URL
response = client.process_bank_statement_document_url(
    file_url="https://cdn.example.com/statement.pdf",
    categories=["ATM Deposit", "Interest / Dividends", "Mortgage Payments"],
)
The categories parameter is an optional list of strings used to classify transactions. When provided, the API maps each transaction to the closest matching category.
# List statements
statements = client.get_bank_statements(
    created_date__gt="2024-01-01+00:00:00",
    created_date__lte="2024-12-31+23:59:59",
)
# Get a single statement
statement = client.get_bank_statement(document_id=4559568)
# Delete
client.delete_bank_statement(document_id=4559568)

# Process from file
response = client.process_check(file_path="/tmp/check.jpg")
# Process from URL
response = client.process_check_url(file_url="https://cdn.example.com/check.jpg")
# Check with remittance
response = client.process_check_with_remittance(file_path="/tmp/check_remittance.pdf")
response = client.process_check_with_remittance_url(file_url="https://cdn.example.com/check.pdf")
# List, get, update, delete
checks = client.get_checks(created_date__gt="2024-01-01+00:00:00")
check = client.get_check(document_id=12345)
client.update_check(document_id=12345, status="cleared")
client.delete_check(document_id=12345)

response = client.process_bussines_card_document(file_path="/tmp/card.jpg")
response = client.process_bussines_card_document_url(file_url="https://cdn.example.com/card.jpg")
cards = client.get_business_cards()
card = client.get_business_card(document_id=67890)
client.delete_business_card(document_id=67890)

response = client.process_w2_document(file_path="/tmp/w2.pdf")
response = client.process_w2_document_url(file_url="https://cdn.example.com/w2.pdf")
w2s = client.get_w2s(created_date_gt="2024-01-01+00:00:00")
w2 = client.get_w2(document_id=11111)
client.delete_w2(document_id=11111)
# Split & process a multi-W-2 PDF
response = client.split_and_process_w2(file_path="/tmp/multi_w2.pdf")
response = client.split_and_process_w2_url(file_url="https://cdn.example.com/multi_w2.pdf")

response = client.process_w8_document(file_path="/tmp/w8.pdf")
response = client.process_w8_document_url(file_url="https://cdn.example.com/w8.pdf")
w8s = client.get_w8s()
w8 = client.get_w8(document_id=22222)
client.delete_w8(document_id=22222)

response = client.process_w9_document(file_path="/tmp/w9.pdf")
response = client.process_w9_document_url(file_url="https://cdn.example.com/w9.pdf")
w9s = client.get_w9s()
w9 = client.get_w9(document_id=33333)
client.delete_w9(document_id=33333)
Use a custom blueprint to extract fields from any document type:
response = client.process_any_document(
    blueprint_name="my_custom_blueprint",
    file_path="/tmp/custom_doc.pdf",
)
response = client.process_any_document_url(
    blueprint_name="my_custom_blueprint",
    file_url="https://cdn.example.com/custom_doc.pdf",
)
docs = client.get_any_documents(created_date__gt="2024-01-01+00:00:00")
doc = client.get_any_document(document_id=44444)
client.delete_any_document(document_id=44444)
Classify a document to determine its type before processing:
response = client.classify_document(
    file_path="/tmp/unknown.pdf",
    document_types=["receipt", "invoice", "bank_statement"],
)
response = client.classify_document_url(
    file_url="https://cdn.example.com/unknown.pdf",
    document_types=["w2", "w9"],
)
All API errors raise a VeryfiClientError (or a more specific subclass). Import the exceptions you need:
from veryfi.errors import (
    VeryfiClientError,
    UnauthorizedAccessToken,
    BadRequest,
    ResourceNotFound,
    AccessLimitReached,
)
try:
    response = client.process_document(file_path="/tmp/receipt.jpg")
except UnauthorizedAccessToken:
    print("Check your client_id, username, and api_key.")
except ResourceNotFound:
    print("The requested document does not exist.")
except AccessLimitReached:
    print("API rate limit reached. Please wait before retrying.")
except BadRequest as e:
    print(f"Bad request: {e}")
except VeryfiClientError as e:
    print(f"Unexpected error (HTTP {e.status}): {e}")
| Exception | HTTP status | Cause |
|---|---|---|
| UnauthorizedAccessToken | 401 | Invalid or missing credentials |
| BadRequest | 400 | Malformed request or missing required fields |
| ResourceNotFound | 404 | Document ID does not exist |
| UnexpectedHTTPMethod | 405 | Wrong HTTP method used |
| AccessLimitReached | 409 | Rate limit exceeded |
| InternalError | 500 | Server-side error |
| ServiceUnavailable | 503 | Veryfi service is temporarily down |
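Transient failures such as a rate limit or a temporary outage are often worth retrying after a pause. The following is a minimal sketch of exponential backoff, not something the SDK provides: call_with_retry and its parameters are hypothetical names, and the helper is deliberately generic so that the caller decides which exception types count as retryable.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

With the SDK this might be used as call_with_retry(lambda: client.process_document(file_path="/tmp/receipt.jpg"), retry_on=(AccessLimitReached,)). Adding ServiceUnavailable to retry_on assumes it can be imported from veryfi.errors in the same way as the exceptions shown above.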
Installing veryfi also installs a veryfi console script (and the equivalent python -m veryfi). The CLI is a thin wrapper around the Python Client and exposes every supported resource as a sub-command — designed for shell users and AI agents that drive the SDK from a terminal.
Verify the install:
veryfi --help
# or, equivalently:
python -m veryfi --help
Credentials are read from environment variables (preferred for agents) or equivalent flags:
| Env var | Flag | Description |
|---|---|---|
| VERYFI_CLIENT_ID | --client-id | Required |
| VERYFI_CLIENT_SECRET | --client-secret | Optional; enables HMAC request signing |
| VERYFI_USERNAME | --username | Required |
| VERYFI_API_KEY | --api-key | Required |
| VERYFI_BASE_URL | --base-url | Optional, defaults to https://api.veryfi.com/api/ |
| VERYFI_API_VERSION | --api-version | Optional, defaults to v8 |
| VERYFI_TIMEOUT | --timeout | Optional, defaults to 30 seconds |
If any required credential is missing the CLI exits with code 2 and a JSON error on stderr.
export VERYFI_CLIENT_ID=... VERYFI_USERNAME=... VERYFI_API_KEY=...
# Optional:
export VERYFI_CLIENT_SECRET=...
# Documents
veryfi documents process --file /tmp/receipt.jpg --category Travel --category Meals
veryfi documents process-url --file-url https://cdn.example.com/x.pdf --boost-mode --external-id ref-1
veryfi documents list --q Walgreens --created-gt 2024-01-01+00:00:00
veryfi documents get 933760836
veryfi documents update 933760836 --field category="Meals & Entertainment" --field total=11.23
veryfi documents delete 933760836
# Nested line-items / tags
veryfi documents line-items add 933760836 --field description="Extra item" --field total=5.0
veryfi documents tags add-many 933760836 --tag q1 --tag travel
# Multi-page PDF splitting
veryfi documents set split --file /tmp/multi.pdf
veryfi documents set split-url --file-url https://cdn.example.com/multi.pdf --max-pages 5
# Other resources
veryfi bank-statements process --file /tmp/stmt.pdf --category Transfer
veryfi checks process-with-remittance --file /tmp/check.pdf
veryfi business-cards process-url --file-url https://cdn.example.com/card.jpg
veryfi w2s process --file /tmp/w2.pdf
veryfi w2s set split --file /tmp/multi_w2.pdf
veryfi w8s list --created-gt 2024-01-01+00:00:00
veryfi w9s get 33333
veryfi any-docs process --blueprint my_blueprint --file /tmp/custom.pdf
veryfi classify file --file /tmp/unknown.pdf --document-type receipt --document-type invoice
You can also pipe binary file data via stdin by passing --file -:
curl -s https://cdn.example.com/r.jpg | veryfi documents process --file -
Every command emits a JSON response on stdout. Use --output raw for single-line JSON (handy for piping into jq) or --output pretty for sorted keys. Errors are emitted as JSON on stderr and the process exits with a non-zero status:
| Exit code | Meaning |
|---|---|
| 0 | Success |
| 2 | Missing credentials or invalid CLI arguments |
| 1-255 | Veryfi API error; exit code is the HTTP status (clipped to 255) |
| 70 | Unexpected error (treat as a bug) |
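For scripts that branch on the exit code, the table can be read as the small sketch below. It is an interpretation rather than the CLI's actual implementation: both function names are hypothetical, "clipped to 255" is assumed to mean min(status, 255), and the named codes 0, 2, and 70 are assumed to take precedence over the general 1-255 range.

```python
def expected_exit_code(http_status: int) -> int:
    """Exit code implied by the table for an API error (assumes clipping means min)."""
    return min(http_status, 255)


def describe_exit_code(code: int) -> str:
    """Map a CLI exit code to the meaning listed in the table."""
    if code == 0:
        return "success"
    if code == 2:
        return "missing credentials or invalid CLI arguments"
    if code == 70:
        return "unexpected error"
    if 1 <= code <= 255:
        return "Veryfi API error"
    return "unknown"
```

Under this reading, a 404 and a 503 both produce exit code 255, so the exit code alone does not distinguish between API errors.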
The exact HTTP status is always included in the stderr payload, e.g.:
{
  "error": "Document not found",
  "status": 404,
  "exception": "ResourceNotFound"
}
For endpoints that accept **kwargs (e.g. update_document, add_line_item, update_check), use repeatable --field KEY=VALUE flags or --json-body '<json>'. --field values are JSON-decoded when possible (so total=11.23 becomes a number, enabled=true becomes a boolean, data='{"a":1}' becomes an object) and fall back to plain strings.
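The decoding rule described above can be sketched in a few lines of Python. This illustrates the documented behaviour and is not the CLI's own code: parse_field and fields_to_kwargs are hypothetical names, and splitting on the first = is an assumption.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def parse_field(raw: str) -> Tuple[str, Any]:
    """Split KEY=VALUE on the first '=' and JSON-decode VALUE when possible."""
    key, sep, value = raw.partition("=")
    if not sep or not key:
        raise ValueError(f"expected KEY=VALUE, got {raw!r}")
    try:
        return key, json.loads(value)
    except json.JSONDecodeError:
        return key, value  # not valid JSON: keep the plain string


def fields_to_kwargs(raw_fields: Iterable[str]) -> Dict[str, Any]:
    """Collect repeated KEY=VALUE strings into one keyword-argument dict."""
    return dict(parse_field(raw) for raw in raw_fields)
```

A value such as Meals & Entertainment is not valid JSON, so it stays a string, while 11.23 is decoded to a float.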
Every command at every level supports --help, which lists subcommands or options with their descriptions:
veryfi --help # top-level: lists all resource groups
veryfi documents --help # group: lists process, list, get, tags, line-items, set, …
veryfi documents process --help # leaf: lists every flag with its description
For AI agents and tooling that prefer a machine-readable contract, veryfi schema emits a JSON manifest of every command, its description, and every parameter (name, type, required, repeatable). Agents can ingest this once to register Veryfi as a tool surface without parsing --help text:
veryfi schema | jq '.commands[] | {name, help}'
Contributions are welcome! To get started:
- Fork the repository and create your branch from master.
- Install development dependencies:
  pip install -r requirements.txt
  pip install black pytest responses tox
  requirements.txt already includes typer, which is required for the veryfi CLI and its tests.
- Make your changes, then run the test suite:
  # Run all tests
  pytest
  # Run tests across all supported Python versions (3.9–3.12)
  tox
  # Check code formatting
  black --check .
  # Auto-format
  black .
- Open a pull request against master.
All pull requests must pass the CI checks (tests + black formatting) before merging.
- API documentation: docs.veryfi.com
- SDK reference: veryfi.github.io/veryfi-python
- Support: support@veryfi.com
- Bug reports / feature requests: open an issue
To learn more about Veryfi visit veryfi.com.
See NEWS.md for a history of changes, or browse the GitHub Releases page.
MIT © Veryfi, Inc.
