This guide covers configuration, usage, and troubleshooting of CDB64-based root transaction indexes for AR.IO Gateway operators.
- Overview
- Configuration
- Source Types
- Local Directory Setup
- Remote Sources
- Partitioned Indexes
- Performance Tuning
- Troubleshooting
- Migration Guide
CDB64 indexes provide O(1) lookups for mapping data item IDs to their root transaction IDs. This enables efficient retrieval of nested bundle data items without querying external services.
Benefits:
- Fast lookups: Constant-time key lookups via hash tables
- Offline operation: No external API dependencies for indexed data items
- Multiple sources: Combine local files, HTTP endpoints, and Arweave-stored indexes
- Hot reloading: Add/remove index files without gateway restart
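To make the constant-time claim concrete, here is a toy in-memory sketch of a cdb-style lookup. It assumes the classic cdb layout widened to 64-bit fields (a header of 256 position/slot-count pairs, followed by records and hash tables) and a DJB-style hash; the names `cdb_hash`, `build_index`, and `lookup` are illustrative. The CDB64 File Format Specification is the authoritative description of the real on-disk format, which may differ in detail.

```python
import struct

HEADER_SIZE = 256 * 16          # 256 (table position, slot count) pairs, 8 bytes each
MASK64 = (1 << 64) - 1


def cdb_hash(key: bytes) -> int:
    """DJB-style hash as used by classic cdb, kept to 64 bits (assumption)."""
    h = 5381
    for byte in key:
        h = ((h * 33) ^ byte) & MASK64
    return h


def build_index(records: dict) -> bytes:
    """Serialize key/value pairs into header + records + hash tables."""
    body = bytearray()
    buckets = [[] for _ in range(256)]
    for key, value in records.items():
        pos = HEADER_SIZE + len(body)
        body += struct.pack("<QQ", len(key), len(value)) + key + value
        h = cdb_hash(key)
        buckets[h & 0xFF].append((h, pos))

    header = bytearray()
    tables = bytearray()
    for bucket in buckets:
        nslots = 2 * len(bucket)  # half-empty tables keep probe chains short
        table_pos = HEADER_SIZE + len(body) + len(tables)
        header += struct.pack("<QQ", table_pos, nslots)
        slots = [(0, 0)] * nslots
        for h, pos in bucket:
            i = (h >> 8) % nslots
            while slots[i][1] != 0:  # linear probing on collision
                i = (i + 1) % nslots
            slots[i] = (h, pos)
        for h, pos in slots:
            tables += struct.pack("<QQ", h, pos)
    return bytes(header) + bytes(body) + bytes(tables)


def lookup(data: bytes, key: bytes):
    """Return the value for `key`, or None. Touches one header entry and a few slots."""
    h = cdb_hash(key)
    table_pos, nslots = struct.unpack_from("<QQ", data, (h & 0xFF) * 16)
    if nslots == 0:
        return None
    start = (h >> 8) % nslots
    for step in range(nslots):
        slot = table_pos + ((start + step) % nslots) * 16
        slot_hash, rec_pos = struct.unpack_from("<QQ", data, slot)
        if rec_pos == 0:  # empty slot: the key is not in this table
            return None
        if slot_hash == h:
            klen, vlen = struct.unpack_from("<QQ", data, rec_pos)
            key_start = rec_pos + 16
            if data[key_start:key_start + klen] == key:
                return data[key_start + klen:key_start + klen + vlen]
    return None


index = build_index({b"item-1": b"root-A", b"item-2": b"root-B"})
print(lookup(index, b"item-1"))   # b'root-A'
print(lookup(index, b"missing"))  # None
```

The lookup cost depends on the hash table load factor rather than on the number of records, which is why index size has little effect on lookup latency.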
| Variable | Default | Description |
|---|---|---|
| CDB64_ROOT_TX_INDEX_SOURCES | data/cdb64-root-tx-index | Comma-separated list of index sources |
| CDB64_ROOT_TX_INDEX_WATCH | true | Enable file watching for local directories |
| ROOT_TX_LOOKUP_ORDER | db,gateways,cdb,graphql | Order of lookup sources (CDB64 enabled by default) |
| CDB64_REMOTE_RETRIEVAL_ORDER | gateways,chunks | Data sources for fetching remote CDB64 files |
| CDB64_REMOTE_CACHE_MAX_REGIONS | 100 | Max cached byte-range regions per remote source |
| CDB64_REMOTE_CACHE_TTL_MS | 300000 | TTL for cached regions (5 minutes) |
| CDB64_REMOTE_REQUEST_TIMEOUT_MS | 30000 | Request timeout for remote sources (30 seconds) |
| CDB64_REMOTE_MAX_CONCURRENT_REQUESTS | 4 | Max concurrent HTTP requests per remote source |
To use CDB64 indexes, make sure cdb appears in your ROOT_TX_LOOKUP_ORDER (it is included in the default). Its position in the list sets its priority:
# Prefer CDB64, fall back to database, then external services
ROOT_TX_LOOKUP_ORDER=cdb,db,gateways,graphql
# CDB64 and the local database only (no external lookups)
ROOT_TX_LOOKUP_ORDER=cdb,db

The CDB64_ROOT_TX_INDEX_SOURCES variable accepts multiple source types:

A single CDB64 file:
CDB64_ROOT_TX_INDEX_SOURCES=/path/to/index.cdb

A directory containing multiple .cdb files (all loaded automatically):
CDB64_ROOT_TX_INDEX_SOURCES=/path/to/indexes/

A directory containing manifest.json and partitioned .cdb files:
CDB64_ROOT_TX_INDEX_SOURCES=/path/to/partitioned-index/

A CDB64 file served over HTTP/HTTPS:
CDB64_ROOT_TX_INDEX_SOURCES=https://example.com/indexes/root-tx.cdb

A partitioned index with manifest served over HTTP:
CDB64_ROOT_TX_INDEX_SOURCES=https://example.com/indexes/manifest.json

A CDB64 file stored as an Arweave transaction (43-character base64url ID):
CDB64_ROOT_TX_INDEX_SOURCES=ABC123def456ghi789jkl012mno345pqr678stu90v

A CDB64 file stored within a bundle, accessed via byte-range:
# Format: rootTxId:offset:size
CDB64_ROOT_TX_INDEX_SOURCES=ABC123...:1024:500000

Combine multiple sources (comma-separated, searched in order):
CDB64_ROOT_TX_INDEX_SOURCES=/local/indexes/,https://cdn.example.com/index.cdb,ABC123...
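The source formats can be told apart by their shape alone. The sketch below shows one way to classify each comma-separated entry; it is not the gateway's actual parser. The function name `classify_source` and the returned labels are hypothetical, and the `:manifest` suffix is the one used for Arweave-hosted partitioned indexes. Telling a plain directory from a partitioned one needs a filesystem check for manifest.json, so both are reported as a local path here.

```python
import re

# An Arweave ID is 43 characters of the base64url alphabet.
TX_ID = re.compile(r"^[A-Za-z0-9_-]{43}$")


def classify_source(spec: str) -> str:
    """Guess the source type of one CDB64_ROOT_TX_INDEX_SOURCES entry."""
    spec = spec.strip()
    if spec.startswith(("http://", "https://")):
        return "http-manifest" if spec.endswith("/manifest.json") else "http-file"

    parts = spec.split(":")
    is_manifest = len(parts) > 1 and parts[-1] == "manifest"
    if is_manifest:
        parts = parts[:-1]

    if TX_ID.match(parts[0]):
        if len(parts) == 1:
            return "arweave-manifest" if is_manifest else "arweave-tx"
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            # rootTxId:offset:size
            return "arweave-byte-range-manifest" if is_manifest else "arweave-byte-range"

    # Anything else is treated as a local file or directory path.
    return "local-path"


sources = "/local/indexes/,https://cdn.example.com/index.cdb"
print([classify_source(s) for s in sources.split(",")])
# ['local-path', 'http-file']
```

Because sources are searched in order, putting local paths ahead of remote ones keeps the common case off the network.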
- Create the index directory:
  mkdir -p data/cdb64-root-tx-index
- Add CDB64 files to the directory:
  cp my-index.cdb data/cdb64-root-tx-index/
- The gateway automatically loads all .cdb and .cdb64 files from the directory.
When CDB64_ROOT_TX_INDEX_WATCH=true (default), the gateway monitors the directory for changes:
- Adding files: New .cdb files are automatically loaded
- Removing files: Deleted files are automatically unloaded
- No restart required: Changes take effect within seconds
Note: Only one directory can be watched at a time. If multiple directory sources are configured, only the first is watched.
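The effect of hot reloading can be pictured as comparing two snapshots of the directory. The sketch below polls rather than using filesystem events, so it illustrates the behavior and not the gateway's watcher. The helper names `snapshot` and `diff_snapshots` are hypothetical.

```python
from pathlib import Path

INDEX_EXTENSIONS = {".cdb", ".cdb64"}


def snapshot(directory) -> set:
    """Names of index files currently present in the directory."""
    return {
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.suffix in INDEX_EXTENSIONS
    }


def diff_snapshots(before: set, after: set):
    """Return (added, removed) file names between two snapshots."""
    return sorted(after - before), sorted(before - after)
```

Calling `snapshot` on a timer and passing consecutive results to `diff_snapshots` yields the files to load and the files to unload. Files with other extensions never show up in either list.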
For production environments with static indexes:
CDB64_ROOT_TX_INDEX_WATCH=false

This reduces filesystem overhead when indexes don't change.
HTTP sources support byte-range requests for efficient random access:
CDB64_ROOT_TX_INDEX_SOURCES=https://s3.amazonaws.com/bucket/index.cdb

Requirements:
- Server must support HTTP Range requests
- Server should return the Accept-Ranges: bytes header
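A quick way to confirm that a server honours Range requests is to ask for a small slice and inspect the reply. The sketch below uses only the Python standard library. The function names are hypothetical, and the 16-byte probe size is an arbitrary choice for illustration.

```python
import urllib.request


def range_header(offset: int, length: int) -> dict:
    """Build a Range header for `length` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def is_partial_response(status: int, content_range: str, offset: int, length: int) -> bool:
    """True if the server answered 206 with exactly the requested slice."""
    expected = f"bytes {offset}-{offset + length - 1}/"
    return status == 206 and content_range.startswith(expected)


def probe(url: str, timeout: float = 30.0) -> bool:
    """Request the first 16 bytes of `url` and check that a range came back."""
    request = urllib.request.Request(url, headers=range_header(0, 16))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_range = response.headers.get("Content-Range", "")
        return is_partial_response(response.status, content_range, 0, 16)
```

A server that ignores the Range header typically replies with status 200 and the whole file, in which case `probe` returns False. Error statuses such as 404 surface as `urllib.error.HTTPError` exceptions.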
For Arweave-stored indexes, the gateway uses its configured data retrieval pipeline:
# Single transaction containing the CDB64 file
CDB64_ROOT_TX_INDEX_SOURCES=ABC123...
# Data item within a bundle (requires byte-range support)
CDB64_ROOT_TX_INDEX_SOURCES=RootTxId:1024:500000

Configure the retrieval order for remote CDB64 files:
# Try gateways first, then reconstruct from L1 chunks
CDB64_REMOTE_RETRIEVAL_ORDER=gateways,chunks
# Use Arweave node tx-data endpoint (slower but works for all data)
CDB64_REMOTE_RETRIEVAL_ORDER=gateways,chunks,tx-data

For very large indexes, partitioning splits data across 256 files by key prefix. This enables:
- Manageable file sizes
- Parallel I/O
- Lazy loading (only accessed partitions are opened)
- Flexible storage (mix local and remote partitions)
index/
  manifest.json   # Index manifest with partition metadata
  00.cdb          # Records with keys starting 0x00
  01.cdb          # Records with keys starting 0x01
  ...
  ff.cdb          # Records with keys starting 0xff
CDB64_ROOT_TX_INDEX_SOURCES=/path/to/partitioned-index/

The gateway detects partitioned indexes by the presence of manifest.json.
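The mapping from a key to its partition file follows from the directory layout: the first byte of the key, written as two lowercase hex digits. The sketch below assumes the key is the raw bytes of the base64url-decoded data item ID. That assumption, and the function names, are illustrative rather than taken from the gateway's code.

```python
import base64


def decode_id(data_item_id: str) -> bytes:
    """Decode an unpadded base64url ID into raw bytes."""
    padding = "=" * (-len(data_item_id) % 4)
    return base64.urlsafe_b64decode(data_item_id + padding)


def partition_file(key: bytes) -> str:
    """Name of the partition file that would hold this key."""
    return f"{key[0]:02x}.cdb"


print(partition_file(bytes([0x00]) + bytes(31)))  # 00.cdb
print(partition_file(bytes([0xFF]) * 32))         # ff.cdb
```

Since only one partition can contain a given key, a lookup opens at most one file, which is what makes lazy loading of partitions possible.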
# HTTP
CDB64_ROOT_TX_INDEX_SOURCES=https://cdn.example.com/index/manifest.json
# Arweave (manifest stored as transaction)
CDB64_ROOT_TX_INDEX_SOURCES=ManifestTxId:manifest
# Arweave byte-range (manifest within a bundle)
CDB64_ROOT_TX_INDEX_SOURCES=RootTxId:1024:5000:manifest

For local partitioned indexes, the gateway watches manifest.json for changes. When the manifest is updated (e.g., via atomic rename), the index is automatically reloaded.
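An atomic rename means the watcher never sees a half-written manifest: the new content is written to a temporary file in the same directory and then moved over manifest.json in one step. A minimal sketch follows, with a hypothetical helper name. The manifest content is treated as opaque, since its schema is defined by the tooling that generates the index.

```python
import json
import os
import tempfile


def write_manifest_atomically(directory: str, manifest: dict) -> str:
    """Replace <directory>/manifest.json without exposing partial writes."""
    final_path = os.path.join(directory, "manifest.json")
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".manifest-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(manifest, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_path, final_path)  # atomic swap
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

Partition files referenced by the new manifest should already be in place before the swap, so that a reload triggered by the rename finds everything it needs.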
For remote sources, tune the byte-range cache:
# More cached regions for high-traffic gateways
CDB64_REMOTE_CACHE_MAX_REGIONS=500
# Longer TTL for stable indexes (10 minutes)
CDB64_REMOTE_CACHE_TTL_MS=600000

Prevent request pile-up when reading from slow remote sources:
# Increase for fast CDNs
CDB64_REMOTE_MAX_CONCURRENT_REQUESTS=8
# Decrease for rate-limited endpoints
CDB64_REMOTE_MAX_CONCURRENT_REQUESTS=2

Adjust timeouts based on your network conditions:
# Longer timeout for high-latency connections (1 minute)
CDB64_REMOTE_REQUEST_TIMEOUT_MS=60000

Place faster sources first in the lookup order:
# Local CDB64 first (fastest), then database, then remote services
ROOT_TX_LOOKUP_ORDER=cdb,db,gateways,graphql

Symptoms: Logs show "Failed to initialize CDB64 source"
Check:
- File exists and is readable by the gateway process
- File has correct extension (.cdb or .cdb64)
- File is a valid CDB64 file (not corrupted)
# Check file permissions
ls -la data/cdb64-root-tx-index/
# Verify file is valid CDB64 (should show header info)
xxd data/cdb64-root-tx-index/index.cdb | head -20

Symptoms: New files not detected, removed files still queried
Check:
- CDB64_ROOT_TX_INDEX_WATCH=true is set
- Only one directory source is configured (first is watched)
- Files have correct extensions
Logs to look for:
CDB64 file watcher started
CDB64 file added
CDB64 source removed
Symptoms: Timeouts, connection errors for HTTP/Arweave sources
Check:
- Network connectivity to the source
- HTTP Range request support (for HTTP sources)
- Arweave data availability (for Arweave sources)
Adjust configuration:
# Increase timeout
CDB64_REMOTE_REQUEST_TIMEOUT_MS=60000
# Use more reliable retrieval sources
CDB64_REMOTE_RETRIEVAL_ORDER=gateways,chunks

Symptoms: "Manifest contains file locations" error for Arweave sources
Cause: Arweave-hosted manifests cannot reference local files.
Solution: Ensure the manifest uses arweave-id, arweave-byte-range, or http location types for all partitions.
Symptoms: High memory usage, OOM errors
Note: CDB64 readers keep only the 4 KB header of each open index in memory, so the index files themselves are rarely the cause. If you see high memory usage, check:
- Number of open index files
- Cache configuration for remote sources
- Other gateway memory consumers
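For a sense of scale on the first item, the header cost grows linearly with the number of open index files. The short calculation below uses the 4 KB figure from the note. The function name is illustrative, and the estimate leaves out remote cache regions and any other per-reader overhead.

```python
HEADER_BYTES = 4 * 1024  # per-reader header size from the note above


def resident_header_bytes(open_index_files: int) -> int:
    """Memory held by CDB64 headers alone, in bytes."""
    return open_index_files * HEADER_BYTES


# A partitioned index with all 256 partitions opened:
print(resident_header_bytes(256))  # 1048576 bytes, i.e. 1 MiB
```

If memory use is far above this kind of figure, the remote cache settings or other gateway components are the more likely place to look.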
- Generate CDB64 indexes from your database or external sources
- Add cdb to ROOT_TX_LOOKUP_ORDER
- Configure CDB64_ROOT_TX_INDEX_SOURCES
- Restart gateway
- Generate partitioned index with manifest
- Replace source path with partitioned directory
- Gateway detects partitioned format automatically
Combine local and remote sources for redundancy:
# Local first, remote backup
CDB64_ROOT_TX_INDEX_SOURCES=/local/indexes/,https://backup.example.com/index.cdb

- CDB64 File Format Specification - Technical format details
- CDB64 Tools Reference - CLI tools for creating and managing indexes
- Environment Variables - Complete environment variable reference