Convex replicator module by kobiebotha · Pull Request #574 · powersync-ja/powersync-service

kobiebotha · 2026-03-19T05:11:08Z

This adds support for Convex as a replication source. Since Convex itself is open source (technically also FSL), it was quite feasible for me to implement this.

As with any datastore, there are many quirks. I've attempted to document pertinent ones in a README.md in the module root. Required reading is the section titled "Mutation Transaction Atomicity".

To get a feel for the system, run the Convex self-host-demo: https://github.com/powersync-ja/self-host-demo/tree/convex-demo. Then open two instances of the new Convex React demo app (http://localhost:3030) side by side.

When running the demo, to log into the Convex dashboard, you need to jump through some hoops:

1. Check the container logs for the convex-keygen Docker container.
2. Get the "Admin key" printed to the console.

TODO

- Test against Convex Cloud (it has been a while since I did that)
- Fix replication metrics (currently reporting per transaction "page", should report per mutation?)
- Update Test Connection logic to ensure that the powersync_checkpoints table and write mutation function exist
- Measure replication performance
- Docs: feat: Convex docs (powersync-docs#456)


stevensJourney · 2026-05-12T15:22:10Z

+    await this.assertHostAllowed();
+
+    const primaryPath = `/api/${options.endpoint}`;
+    const fallbackPath = `/api/streaming_export/${options.endpoint}`;


I could see this being used on either cloud or self-hosted. It seems like the /api/streaming_export fallback routes don't get used, so I've removed them.

stevensJourney · 2026-05-14T15:55:56Z

As part of reviewing the initial POC, I went through the implementation in a bit more detail and then addressed a few items that seemed worth tightening up before this lands.

The main thing I spent time on was the replication flow itself. I reviewed the snapshot + delta approach, where we pin one global Convex snapshot boundary, snapshot selected tables at that same boundary, and then resume document_deltas from that stored LSN. I also looked at resumable snapshots and verified the behaviour by adding integration tests.
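The snapshot + delta flow described above can be sketched roughly as follows. This is a minimal in-memory model, not the module's code: `FakeSource`, `listSnapshot`, `documentDeltas`, and the numeric cursor are hypothetical stand-ins for the Convex `list_snapshot` / `document_deltas` endpoints and the stored LSN.

```typescript
// Hypothetical in-memory stand-in for a Convex deployment. Every write gets a
// monotonically increasing timestamp (`ts`), which doubles as the LSN here.
interface Change { table: string; id: string; ts: number; deleted: boolean; }

class FakeSource {
  private log: Change[] = [];
  private clock = 0;

  write(table: string, id: string, deleted = false): void {
    this.clock += 1;
    this.log.push({ table, id, ts: this.clock, deleted });
  }

  // Current head of the change log.
  head(): number { return this.clock; }

  // Live document ids of one table as of `boundary` (loosely modelled on list_snapshot).
  listSnapshot(table: string, boundary: number): string[] {
    const live: Record<string, boolean> = {};
    for (const c of this.log) {
      if (c.table === table && c.ts <= boundary) live[c.id] = !c.deleted;
    }
    return Object.keys(live).filter((id) => live[id]).sort();
  }

  // Changes strictly after `cursor` (loosely modelled on document_deltas).
  documentDeltas(cursor: number): Change[] {
    return this.log.filter((c) => c.ts > cursor);
  }
}

// Replica state: table -> set of ids, plus the LSN to resume deltas from.
interface Replica { rows: Record<string, Record<string, true>>; lsn: number; }

function initialSnapshot(source: FakeSource, tables: string[]): Replica {
  const boundary = source.head(); // pin ONE global boundary for all tables
  const rows: Record<string, Record<string, true>> = {};
  for (const table of tables) {
    rows[table] = {};
    for (const id of source.listSnapshot(table, boundary)) rows[table][id] = true;
  }
  return { rows, lsn: boundary }; // deltas resume from the pinned boundary
}

function applyDeltas(source: FakeSource, replica: Replica): void {
  for (const c of source.documentDeltas(replica.lsn)) {
    const rows = replica.rows[c.table];
    if (rows !== undefined) {
      if (c.deleted) delete rows[c.id];
      else rows[c.id] = true;
    }
    replica.lsn = c.ts; // record progress so streaming can resume from here
  }
}

// Usage: snapshot two tables at one boundary, then stream later changes.
const demoSource = new FakeSource();
demoSource.write("todos", "a");
demoSource.write("lists", "x");
const demoReplica = initialSnapshot(demoSource, ["todos", "lists"]);
demoSource.write("todos", "b");
demoSource.write("todos", "a", true);
applyDeltas(demoSource, demoReplica);
```

Because every table is read at the same boundary, any change after it is seen exactly once through the delta stream, regardless of which table it touches.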

I verified the write checkpoint flow. The important ordering here is that we read the Convex head, create the managed PowerSync write checkpoint in the callback, and only then write the Convex marker mutation. That marker write is what gives an idle Convex deployment a later observable delta so the checkpoint can actually be acknowledged. AI-generated docs have been added to give more detail about this process. We can remove those if necessary, though they might help other AI agents in the future.
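A simplified sketch of that ordering is shown below. The names `CheckpointDeps`, `readConvexHead`, `createManagedWriteCheckpoint`, and `writeMarkerMutation` are hypothetical, and the network calls are modelled as synchronous functions for brevity; the actual module's signatures may differ.

```typescript
// Hypothetical dependencies; the real module's names and signatures may differ.
interface CheckpointDeps {
  readConvexHead(): string;                           // current Convex head (LSN)
  createManagedWriteCheckpoint(head: string): number; // returns a checkpoint id
  writeMarkerMutation(): void;                        // writes to powersync_checkpoints
}

function createWriteCheckpoint(deps: CheckpointDeps): number {
  // 1. Read the Convex head first, so the checkpoint is tied to a known position.
  const head = deps.readConvexHead();
  // 2. Create the managed PowerSync write checkpoint against that head.
  const checkpointId = deps.createManagedWriteCheckpoint(head);
  // 3. Only then write the marker. On an idle deployment this is the write that
  //    produces a later delta, letting replication advance past `head`.
  deps.writeMarkerMutation();
  return checkpointId;
}

// Usage with recording fakes to make the ordering visible.
const checkpointCalls: string[] = [];
const demoCheckpointId = createWriteCheckpoint({
  readConvexHead: () => { checkpointCalls.push("head"); return "100"; },
  createManagedWriteCheckpoint: (head) => { checkpointCalls.push("checkpoint@" + head); return 1; },
  writeMarkerMutation: () => { checkpointCalls.push("marker"); },
});
```

If the marker were written before the checkpoint existed, the delta it produces could be replicated before there is anything to acknowledge, which is why the marker comes last.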

I reviewed the route API adapter as well, especially createReplicationHead, schema/debug table handling, and connection testing. I added a connection test check for the powersync_checkpoints table and mutator so misconfigured Convex projects fail earlier with a more useful message.

On the testing side, I added real integration tests against a local Convex backend and wired them into CI. These cover the module connection path, route API adapter, streaming replication, and resumable snapshots. Some of the original storage-mocking tests were AI-generated. They are less important now that we have real local Convex integration tests, but I left them in place since they might still have some use.

I also did a pass on the Convex API client and value conversion. The current Convex API response typings have been checked with cloud and self-hosted integration tests.

For Convex -> SQLite conversion, I verified the JSON schema responses received from Convex backends and made some cleanup improvements. This process has some limitations which are listed as known issues. I'll mention more on these in an upcoming docs PR.

If you'd like to take this for a spin, feel free to try the React Convex Todolist demo from powersync-ja/powersync-js#952 - this uses a development PowerSync service image. Note that this demo will be moved to its own repository soon.

AI Usage disclaimer:
I believe most of the original implementation was AI-generated. Most of my review improvements were hand-coded. AI (Codex GPT-5.5 medium) was used to assist with writing the integration tests; these tests were thoroughly debugged, tweaked and verified. The README content and docs pages are all AI-generated. This code has been reviewed in multiple passes by Codex GPT-5.5 medium.

stevensJourney · 2026-05-15T11:38:38Z

+      // TODO! It seems like Convex might not report the schema value for values which have not
+      // been populated in the DB yet. This can cause many issues - and we need to work around this.
+      // We perform runtime checks and conversions at this point.
+      if (value == null) {


After some additional thought, I’m leaning toward disabling json_schemas for SQLite row conversion in the Convex replication path.

The issue is that using schema metadata makes row values inconsistent depending on whether Convex happened to report a field in json_schemas at the time we cached it:

- Int64 with schema metadata becomes bigint
- the same Int64 without schema metadata stays the raw JSON string
- Bytes with schema metadata becomes a Uint8Array/blob
- the same Bytes without schema metadata stays the raw base64 string

That inconsistency is probably worse than preserving the raw wire types. If we only use the types from list_snapshot / document_deltas, then behavior is stable:

- Convex Int64 is always a string
- Convex Bytes is always a base64 string
- number / float64 is always a JS number
- booleans come through as booleans, which are already accepted by the Convex-to-SQLite conversion layer

Then users can explicitly normalize ambiguous fields in Sync Streams rules, e.g. CAST(points AS INTEGER) for Int64 columns. This is predictable and avoids cases where the same column changes type depending on whether a populated value existed when json_schemas was fetched.
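As a rough illustration of conversion that depends only on the wire type, consider the sketch below. `toSqliteValue` and `toSqliteRow` are hypothetical names. Mapping booleans to 0/1 and serializing nested objects or arrays to JSON text are assumptions made for this sketch, not confirmed behaviour of the module.

```typescript
type SqliteValue = string | number | null;

// Converts one raw JSON field value to a SQLite-compatible value without
// consulting json_schemas, so the result depends only on the wire type.
function toSqliteValue(value: unknown): SqliteValue {
  if (value === null || value === undefined) return null;
  if (typeof value === "string") return value;          // includes Int64 and Bytes, left as received
  if (typeof value === "number") return value;          // float64
  if (typeof value === "boolean") return value ? 1 : 0; // assumption: stored as 0/1
  return JSON.stringify(value);                         // assumption: objects/arrays stored as JSON text
}

function toSqliteRow(doc: Record<string, unknown>): Record<string, SqliteValue> {
  const row: Record<string, SqliteValue> = {};
  for (const key of Object.keys(doc)) row[key] = toSqliteValue(doc[key]);
  return row;
}

// Usage: `points` is an Int64 on the Convex side and stays a string here.
const exampleRow = toSqliteRow({ _id: "abc", points: "42", done: true, score: 1.5, tags: ["a"] });
```

With this approach `points` has the same type in every row, and a rule that needs a numeric value can apply `CAST(points AS INTEGER)` explicitly.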

I think we can still keep json_schemas for table discovery and admin/diagnostic schema reporting, but avoid using it to coerce replicated row values.

stevensJourney · 2026-05-18T09:23:56Z

After making the above changes for JSON Schema usage, some additional improvements could be made to cater for schema changes.

After digging into how the Convex replicator actually uses source metadata, we’ve narrowed the schema-change problem down quite a bit.

For most of our other replicators, schema-change detection is important because it protects against things like replica identity changes, stale relation metadata, table renames/drops, and DDL changes that require a re-snapshot. Convex is different in a few important ways:

- _id is always the replication identity, so there is no replica-id drift to detect.
- Runtime row conversion does not use json_schemas; it uses the actual JSON document payload from list_snapshot and document_deltas.
- Field additions, removals, and type changes should therefore flow through normal document mutations/deltas.
- Convex data migrations are expected to be online document writes, so they should replicate as data changes rather than schema-triggered re-snapshots.

The one remaining question was wildcard table discovery. We added an integration test to verify that json_schemas lists schema-defined tables even when they contain no documents. That passed, which means initial wildcard expansion can discover empty tables up front. Based on that, the stream no longer needs to snapshot a table inline when it is first observed in document_deltas; if the table appears later in deltas, the delta payload is the source of truth.
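The handling of a table first seen in document_deltas could look roughly like this. `StreamState`, `DeltaRow`, and `applyDelta` are hypothetical names for this sketch, and the `_table` field name is an assumption about the delta payload; `_id` and `_deleted` are the fields mentioned in this thread.

```typescript
// One row from the delta stream. `_table` is an assumed field name.
interface DeltaRow { _table: string; _id: string; _deleted?: boolean; [field: string]: unknown; }

interface StreamState {
  selected: (table: string) => boolean;           // does any sync rule select this table?
  snapshotComplete: Record<string, boolean>;      // tables already resolved
  rows: Record<string, Record<string, DeltaRow>>; // replicated documents per table
}

function applyDelta(state: StreamState, row: DeltaRow): void {
  const table = row._table;
  if (!state.selected(table)) return; // not referenced by any sync rule
  if (!state.snapshotComplete[table]) {
    // First time this selected table shows up: no inline snapshot, just mark it
    // snapshot-complete and trust the delta payload.
    state.snapshotComplete[table] = true;
  }
  if (state.rows[table] === undefined) state.rows[table] = {};
  if (row._deleted) delete state.rows[table][row._id];
  else state.rows[table][row._id] = row;
}

// Usage: "lists" was not part of the initial snapshot; "audit" is not selected.
const streamState: StreamState = {
  selected: (t) => t === "todos" || t === "lists",
  snapshotComplete: { todos: true },
  rows: { todos: {} },
};
applyDelta(streamState, { _table: "lists", _id: "l1", name: "Groceries" });
applyDelta(streamState, { _table: "lists", _id: "l2", name: "Chores" });
applyDelta(streamState, { _table: "audit", _id: "x1" });
applyDelta(streamState, { _table: "lists", _id: "l1", _deleted: true });
```

This relies on the point made above: a table that appears only later in the delta stream had no earlier documents to snapshot, so its deltas carry everything needed.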

Code/docs changes from this:

- Removed the Convex stream’s schema cache / forced schema refresh path.
- Exact table patterns now resolve directly from Sync Streams rules.
- Wildcards still use json_schemas for initial expansion.
- Newly observed selected tables in document_deltas are resolved and marked snapshot-complete, then the delta row is applied directly. No inline snapshot.
- Added/updated tests around exact table resolution, wildcard discovery, and empty schema-defined tables.
- Added docs/convex/schema-change-handling.md with the rationale and limitations.
- Updated the Convex README to reflect the new behavior.
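The table resolution described in the list above might be sketched as follows. `resolveTables` is a hypothetical helper, and treating a trailing `%` as the wildcard marker is an assumption about the pattern syntax; later commits in this PR adjust how exact names are resolved, so this reflects only the behaviour stated here.

```typescript
// Resolves sync-rule table patterns to concrete table names.
// Assumption: a wildcard pattern is a prefix followed by a trailing "%".
function resolveTables(patterns: string[], schemaTables: string[]): string[] {
  const resolved: string[] = [];
  const add = (name: string): void => {
    if (resolved.indexOf(name) === -1) resolved.push(name); // de-duplicate
  };
  for (const pattern of patterns) {
    if (pattern.charAt(pattern.length - 1) === "%") {
      const prefix = pattern.slice(0, -1);
      // Wildcards expand against the json_schemas table list, which also
      // includes schema-defined tables that hold no documents yet.
      for (const table of schemaTables) {
        if (table.indexOf(prefix) === 0) add(table);
      }
    } else {
      add(pattern); // exact names come straight from the rules
    }
  }
  return resolved;
}

// Usage: "todos" is an exact pattern; "list_%" expands against the schema list.
const resolvedTables = resolveTables(["todos", "list_%"], ["list_items", "list_meta", "users"]);
```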

One important limitation we validated: deleting a table from the Convex dashboard does not emit per-document _deleted rows in document_deltas. That means previously replicated rows can remain synced to clients. The docs now recommend using the dashboard “Clear Table” action before deleting a table, or deleting documents through mutation paths that emit document deltas. Otherwise, dashboard/schema-only table removal needs to be treated as a sync-rule/deployment state change where affected PowerSync state may need to be cleared or re-replicated.


kobiebotha added 30 commits February 5, 2026 17:58

PoC convex adapter

83955df

built using gpt-5.3

cleaner debug log for bucket storage entries

2006bda

wip: global lsn

e113032

remove single table optimization

33ecf73

sync rules will mostly have multiple table

expirementing with agents.md

de63c81

write checkpoints

ee1e5df

tighten convex LSN format

facb78c

cleanup

a25cd1b

snake_case fix

a34c0b5

slow tests

ec9e046

fix test

950e3d7

remove streaming import cruft

1c8dd1f

simplify LSN representation

817e6b2

it's always a timestamp

comments

c688653

use BaseObserver

6d4e11a

use regular convex mutations for powersync_checkpoints

ceb3941

cleanup

d7e97ef

remove agents.md entirelyu

0c7c370

simplify LSN representation

565d4e7

simplify LSNs even further

92f477e

simplify convexLSN into oblivion

84ba475

remove configurable request timeout

78f821a

hardcode to 60

resumable initial replication

596c1aa

inline snapshotting for new tables

f0c83b6

also cleaned up handling of convex types

clean up type mapping

05945b9

pnpmlock

9b1d02f

Merge remote-tracking branch 'origin/main' into poc-convex

3ba4d1e

# Conflicts: # packages/schema/src/scripts/compile-json-schema.ts # packages/schema/tsconfig.json

upstream api change

1831687

fix regression in snapshotting

96d9be9

Merge branch 'main' into poc-convex

8007ccf

fix dev image release

7421cbc

stevensJourney reviewed May 12, 2026


stevensJourney added 14 commits May 12, 2026 17:22

cleanup ConvexAPIClient

31bca10

fix mocked tests

071406b

Merge remote-tracking branch 'origin/main' into module-convex

20443af

cleanup

b719e4e

cleanup schema parsing

1d8e9a0

revert debugging changes to storage

018d18a

Merge branch 'main' into module-convex

1b5e034

cleanup todo comments

489315d

Merge remote-tracking branch 'origin/main' into module-convex

1e4f6e7

handle sqlite conversions according to actual schema responses.

ba0daa1

cleanup

adb916b

cleanup readmes

05ff780

cleanup convex api validations

70f7a5f

Merge remote-tracking branch 'origin/main' into module-convex

78333b0

stevensJourney marked this pull request as ready for review May 14, 2026 15:56

stevensJourney requested a review from rkistner May 14, 2026 15:56

stevensJourney mentioned this pull request May 14, 2026

feat: Convex docs powersync-ja/powersync-docs#456

Draft

2 tasks

update note about int64 values

580f87a

stevensJourney reviewed May 15, 2026


benitav mentioned this pull request May 15, 2026

[Drift] Convex added as a new replication source for PowerSync Service (powersync-service #574) powersync-ja/powersync-docs#457

Open

stevensJourney added 2 commits May 15, 2026 14:28

don't use json-schemas for sqlite type conversion

38a819a

updates for schema change handling

0b8ea26

stevensJourney added 4 commits May 18, 2026 11:30

Merge remote-tracking branch 'origin/main' into module-convex

819f152

AI feedback - only return table names if they are present in the json…

30459a1

…-schemas list of tables (for non-wildcard table patterns).

Merge remote-tracking branch 'origin/main' into module-convex

7750855

update table resolve logic

9250304

kobiebotha wants to merge 89 commits into main from module-convex

kobiebotha commented Mar 19, 2026 (edited by stevensJourney)
