Copycat L3 indexer and GraphQL #903

Draft
speeddragon wants to merge 58 commits into edge from feat/new-copycat
@speeddragon
Collaborator

This PR continues #837 and adds the GraphQL and pending-TX work.

The summary below still reflects the previous PR and will be updated.

Summary

  • Add L1 TX filtering with owner/tag support, offset loading, and block depth indexing (Rani)
  • Add tests, refactor internals, improve logging, and fix operational issues (James)
  • Add per-block item index with depth tracking and inventory mode
  • Add parallel block processing with shared memory budget (configurable workers + byte-level throttling)
  • Add parent containment index: track which block or bundle contains each item
  • Add ~arweave@2.9/parent=<id> endpoint for parent lookups

How to use

Index blocks

Index a range of blocks at depth 3 (L1 TXs → L2 bundle items → L3 nested items):

curl "http://localhost:8005/~copycat@1.0/arweave?from=1890000&to=1889000&depth=3"

For long-running indexing, use the cron wrapper so that an HTTP timeout does not kill the job:

curl "http://localhost:8005/~cron@1.0/once?cron-path=~copycat@1.0/arweave&from=-1&to=1862995&depth=3"
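When scripting these calls, the two URL shapes above can be generated from one helper. This is only a sketch: `index_url` and `via_cron` are hypothetical names, and the host, port, paths, and query parameters are copied from the curl examples rather than from any documented client API.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8005"  # host and port taken from the curl examples


def index_url(from_height, to_height, depth, via_cron=False):
    """Build an indexing URL; via_cron=True wraps the call in ~cron@1.0/once."""
    params = {"from": from_height, "to": to_height, "depth": depth}
    if via_cron:
        # The cron wrapper receives the target path as cron-path, followed by
        # the same query parameters as the direct call.
        params = {"cron-path": "~copycat@1.0/arweave", **params}
        path = "/~cron@1.0/once"
    else:
        path = "/~copycat@1.0/arweave"
    # Keep "~", "@" and "/" literal so the result matches the curl examples.
    return f"{BASE_URL}{path}?{urlencode(params, safe='~@/')}"


if __name__ == "__main__":
    print(index_url(1890000, 1889000, 3))
    print(index_url(-1, 1862995, 3, via_cron=True))
```

The printed URLs can be passed to curl or any HTTP client unchanged.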

Query the inventory

See what was indexed per block, grouped by depth level:

curl "http://localhost:8005/~copycat@1.0/arweave?from=1890000&to=1889990&mode=inventory"

Example response:

{
  "1890000": {
    "depth": 3,
    "items": {
      "1": ["txid1", "txid2"],
      "2": ["bundleitem1", "bundleitem2"],
      "3": ["nesteditem1"]
    }
  }
}
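A response of this shape can be reduced to per-depth item counts with a few lines of Python. The sketch below assumes the response always has the structure of the example (block height, then `items` keyed by depth level); `summarize_inventory` is a hypothetical helper, not part of the PR.

```python
import json

# Example response body copied from the PR description.
EXAMPLE = """
{
  "1890000": {
    "depth": 3,
    "items": {
      "1": ["txid1", "txid2"],
      "2": ["bundleitem1", "bundleitem2"],
      "3": ["nesteditem1"]
    }
  }
}
"""


def summarize_inventory(inventory):
    """Return {block_height: {depth_level: item_count}} with integer keys."""
    summary = {}
    for height, entry in inventory.items():
        summary[int(height)] = {
            int(level): len(item_ids)
            for level, item_ids in entry["items"].items()
        }
    return summary


summary = summarize_inventory(json.loads(EXAMPLE))
print(summary)
```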

Look up an item's parent

Find which block or bundle contains a given item:

curl "http://localhost:8005/~arweave@2.9/parent=CwxY--7bsqjtw2lneMUjkmYT9CWAYZvsmsO06dY232g"

Response:

{"parents": [{"type": "bundle", "id": "Rve4-grgOw8jXLVw3f6nUhQvlVn6AYm7wA4I5HeKW74"}]}
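A client might unpack this response as sketched below. Only the `bundle` shape is shown in the example above, so the handling of other parent types is an assumption: the sketch reads `id` with `.get` and returns `None` when the field is absent. `parse_parents` is a hypothetical helper.

```python
import json


def parse_parents(body):
    """Return a list of (type, id) tuples from a parent-lookup response body."""
    doc = json.loads(body)
    # "id" is read defensively: the field name for block parents is not
    # shown in the PR description, so it may be missing for that type.
    return [(p["type"], p.get("id")) for p in doc.get("parents", [])]


body = '{"parents": [{"type": "bundle", "id": "Rve4-grgOw8jXLVw3f6nUhQvlVn6AYm7wA4I5HeKW74"}]}'
parents = parse_parents(body)
print(parents)
```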

What gets indexed

The copycat indexer writes these entries to the index store:

  • Offset index (<item-id> → codec + offset + length): maps each item to its location in the Arweave weave
  • Block marker (block/<height>/depth → integer): records that a block was indexed and to what depth
  • Block item index (block/<height>/items/<depth> → item IDs): lists items found at each depth level
  • Parent index (parent/<item-id> → type + parent ref): maps each item to its containing block (height) or bundle (ID)
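To make the key layout concrete, the following sketch lists the keys that the four entry types above would produce for one block. It only formats key strings from the patterns in the list; the stored values, the write order, and the `index_keys` helper itself are illustrative assumptions, not the indexer's actual code.

```python
def index_keys(height, items_by_depth):
    """List the index-store keys for one block, following the patterns above.

    items_by_depth maps a depth level (1, 2, 3, ...) to the item IDs found
    at that level.
    """
    keys = [f"block/{height}/depth"]  # block marker
    for depth, item_ids in sorted(items_by_depth.items()):
        keys.append(f"block/{height}/items/{depth}")  # block item index
        for item_id in item_ids:
            keys.append(item_id)  # offset index entry, keyed by the item ID
            keys.append(f"parent/{item_id}")  # parent index entry
    return keys


if __name__ == "__main__":
    for key in index_keys(1890000, {1: ["txid1"], 2: ["bundleitem1"]}):
        print(key)
```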

Configuration

| Key | Default | Description |
| --- | --- | --- |
| arweave_block_workers | 3 | Max concurrent blocks being processed |
| arweave_index_workers | 1 | Max concurrent TXs within a block |
| copycat_memory_budget | 6 GB | Global memory pool for concurrent downloads |
| copycat_memory_cap | 6 GB | Per-TX hard ceiling (skip if larger) |
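The interaction between the shared budget and the per-TX cap can be illustrated with a small byte-counting gate. This is a Python sketch of the general technique, not the PR's implementation: the `MemoryBudget` class, its method names, and the timeout parameter are all hypothetical, and reading "6 GB" as 6 GiB is an assumption.

```python
import threading

GIB = 1024 ** 3  # assumption: "6 GB" is interpreted as 6 GiB here


class MemoryBudget:
    """Shared byte budget for concurrent downloads, with a per-item cap."""

    def __init__(self, budget_bytes=6 * GIB, cap_bytes=6 * GIB):
        self.budget = budget_bytes
        self.cap = cap_bytes
        self.in_use = 0
        self._cond = threading.Condition()

    def acquire(self, size, timeout=None):
        """Reserve size bytes. Returns False if the item should be skipped
        (larger than the cap or the whole budget) or the wait timed out."""
        if size > min(self.cap, self.budget):
            return False
        with self._cond:
            # Block until enough of the budget has been released.
            fits = self._cond.wait_for(
                lambda: self.in_use + size <= self.budget, timeout
            )
            if fits:
                self.in_use += size
            return fits

    def release(self, size):
        """Return size bytes to the pool and wake any waiting workers."""
        with self._cond:
            self.in_use -= size
            self._cond.notify_all()
```

A worker would call `acquire` with the TX size before downloading and `release` once the data has been processed, so the number of in-flight bytes stays under the budget regardless of how many workers are configured.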

speeddragon and others added 30 commits May 8, 2026 19:34
- Return tagged tuples from latest_height and normalize_height
- Propagate errors through parse_range using maybe block
- Return {error, unavailable} (HTTP 503) on upstream failures
- Validate resolved heights are non-negative in parse_range
- Log original upstream error reason before collapsing to unavailable
- Add regression tests with mock server for both failure paths
charmful0x and others added 28 commits May 8, 2026 19:34
6 participants