2 changes: 1 addition & 1 deletion README.md
@@ -100,7 +100,7 @@ primary key, btree, for table "public.table_a"

* `CHECK` constraints and `NOT NULL` constraints must be the same or more permissive on any standby node that acts only as a subscriber.

For more information about the Spock extension's advanced functionality, visit [here](docs/features.md).
For more information about the Spock extension's advanced functionality, visit the [Spock documentation](docs/index.md).


## Building the Spock Extension
16 changes: 6 additions & 10 deletions docs/configuring.md
@@ -97,16 +97,12 @@ inadvertently create orphaned foreign key records.
### `spock.conflict_resolution`

`spock.conflict_resolution` sets the resolution method for any detected
conflicts between local data and incoming changes. Possible values include:
conflicts between local data and incoming changes. The only supported value
in current Spock releases is:

- `error` - the replication will stop on error if a conflict is detected and
manual action is required to resolve the conflict.
- `apply_remote` - always apply the change that's conflicting with local
data.
- `keep_local` - keep the local version of the data and ignore the
conflicting change that is coming from the remote node.
- `last_update_wins` - the version of data with newest commit timestamp
will be kept (this can be either local or remote version).
- `last_update_wins` (the default) - the version of data with the newest
commit timestamp will be kept (this can be either the local or the remote
version).

To enable conflict resolution, the `track_commit_timestamp` setting must be
enabled.
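
A minimal sketch of enabling the prerequisite (the parameter names are standard PostgreSQL/Spock GUCs; the exact session shown is illustrative):

```sql
-- track_commit_timestamp is a postmaster-level GUC: a server restart is
-- required after changing it.
ALTER SYSTEM SET track_commit_timestamp = on;
-- After restarting the server, confirm both settings:
SHOW track_commit_timestamp;       -- on
SHOW spock.conflict_resolution;    -- last_update_wins (the default)
```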
@@ -142,7 +138,7 @@ an ERROR within a transaction:
that is written to the WAL log file; when the subscription is enabled,
replication will resume with the transaction that caused the exception,
followed by the other queued transactions; you can use the
`spock.alter_sub_skip_lsn` function to skip the transaction that caused
`spock.sub_alter_skiplsn` function to skip the transaction that caused
the exception and resume processing with the next transaction in the
queue.
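
As an illustrative sketch (the subscription name and LSN here are hypothetical; take the real LSN from the apply worker's error message or the exception log):

```sql
-- Record the LSN of the failing transaction so the apply worker skips it,
-- then re-enable the subscription to resume with the next queued transaction.
SELECT spock.sub_alter_skiplsn('sub_n2_n1', '0/1C9F400'::pg_lsn);
SELECT spock.sub_enable(subscription_name := 'sub_n2_n1', immediate := true);
```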

10 changes: 3 additions & 7 deletions docs/conflict_types.md
@@ -160,18 +160,14 @@ kept (`skip` / `keep_local`). The event is recorded in the
### Conflict Resolution Strategies

The `spock.conflict_resolution` GUC controls how resolvable conflicts
(all types except `update_missing` and `update_exists`) are decided:
(all types except `update_missing` and `update_exists`) are decided. In
current Spock releases the only supported value is:

| Strategy | Behavior |
|-----------------------|-----------------------------------------------------------|
| `last_update_wins` | The row with the most recent commit timestamp wins (default). |
| `first_update_wins` | The row with the earliest commit timestamp wins. |
| `apply_remote` | Always apply the incoming remote change. |
| `keep_local` | Always keep the local row. |
| `error` | Raise an ERROR on any conflict. |

The timestamp-based strategies (`last_update_wins` and
`first_update_wins`) require `track_commit_timestamp = on` in
`last_update_wins` requires `track_commit_timestamp = on` in
`postgresql.conf`.

**Tiebreaker:** When two rows have identical commit timestamps, Spock
2 changes: 1 addition & 1 deletion docs/limitations.md
@@ -50,7 +50,7 @@ Partial secondary unique indexes are permitted, but will be ignored for
conflict resolution purposes.

`spock.check_all_uc_indexes` is an experimental
[GUC](https://github.com/pgEdge/spock/blob/main/docs/guc_settings.md) that
[GUC](configuring.md) that
adds `INSERT` conflict resolution by allowing Spock to consider all unique
constraints, not just the primary key or replica identity.

12 changes: 9 additions & 3 deletions docs/managing/read_only.md
@@ -72,9 +72,15 @@ COMMIT;

## Behavior of `all` mode

In `all` mode, apply workers detect the setting and stop consuming inbound WAL.
When the mode is switched back to `off` or `local`, replication resumes from
where it left off — no data is lost.
In `all` mode, apply workers detect the setting and stop consuming inbound
WAL — internally each worker raises a FATAL and exits, and the manager
restarts it once read-only is cleared. When the mode is switched back to
`off` or `local`, replication resumes from where it left off, **provided
the upstream replication slot still retains the necessary WAL**. Plan for
slot retention (`max_slot_wal_keep_size`, disk capacity) before placing a
node in `all` mode for an extended period; if the upstream slot is
recycled while the node is read-only, replication cannot resume from the
prior position and the node will need to be re-synchronized.
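
A sketch of toggling the mode (assuming `spock.readonly` accepts the values named above and is changed by a superuser at runtime):

```sql
-- Enter cluster-wide read-only mode; apply workers exit until it is cleared.
ALTER SYSTEM SET spock.readonly = 'all';
SELECT pg_reload_conf();
-- ... maintenance window ...
ALTER SYSTEM SET spock.readonly = 'off';
SELECT pg_reload_conf();
```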

Notes:
- Only superusers can set and unset the `spock.readonly` parameter.
9 changes: 9 additions & 0 deletions docs/managing/spock_autoddl.md
@@ -29,6 +29,15 @@ automatic DDL replication, set the following parameters to `on`:
will be added into the `default` replication set; alternatively, they will
be added to the `default_insert_only` replication set.

To completely block all DDL across the cluster (including DDL that would
otherwise be replicated automatically), use:

* `spock.deny_all_ddl` - a boolean value, default `false`. When set to `true`,
Spock rejects any DDL statement executed on the node. This is useful as a
guard during sensitive maintenance windows or while a node is being added
or repaired. The setting can be changed by a superuser at runtime
(`PGC_SUSET`); a server reload is not required.
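
For example, a superuser might guard a maintenance window like this (a sketch; the table name is hypothetical and the rejection message will vary):

```sql
SET spock.deny_all_ddl = true;    -- PGC_SUSET: superuser, no reload needed
CREATE TABLE scratch (id int);    -- now rejected with an ERROR
SET spock.deny_all_ddl = false;   -- lift the guard when maintenance is done
```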

It's best to set these parameters to `on` only when the database schema
matches exactly on all nodes - either when all databases have no objects, or
when all databases have exactly the same objects and all tables are added to
46 changes: 23 additions & 23 deletions docs/modify/zodan/zodan_tutorial.md
@@ -154,15 +154,15 @@ FROM dblink(
'host=127.0.0.1 dbname=inventory port=5432 user=pgedge password=1safepassword',
'SELECT extversion FROM pg_extension WHERE extname = ''spock'''
) AS t(version text);
-- Expected: 5.0.4
-- Expected: matches the version installed on every node in the cluster

-- Check new node version
SELECT version
FROM dblink(
'host=127.0.0.1 dbname=inventory port=5435 user=pgedge password=1safepassword',
'SELECT extversion FROM pg_extension WHERE extname = ''spock'''
) AS t(version text);
-- Expected: 5.0.4
-- Expected: matches the version installed on every node in the cluster

-- Check all existing cluster nodes (n2, n3)
SELECT node_name, version
@@ -173,12 +173,12 @@ FROM dblink(
FROM spock.node n'
) AS t(node_name text, version text);

-- Expected output:
-- node_name | version
-- -----------+--------------
-- n1 | 5.0.4
-- n2 | 5.0.4
-- n3 | 5.0.4
-- Expected output: every node reports the same Spock version, e.g.
-- node_name | version
-- -----------+----------
-- n1 | 5.0.6
-- n2 | 5.0.6
-- n3 | 5.0.6
```

### Validate Prerequisites
@@ -597,8 +597,8 @@ FROM dblink(
SELECT *
FROM dblink(
'host=127.0.0.1 dbname=inventory port=5432 user=alice password=1safepassword',
'CALL spock.wait_for_sync_event(true, ''n2'', ''0/1C9F400''::pg_lsn, 1200000)'
) AS t(result text);
'CALL spock.wait_for_sync_event(true, ''n2'', ''0/1C9F400''::pg_lsn, 1200)'
) AS t(result bool);
```

#### Sync n3 to n1
@@ -623,8 +623,8 @@ FROM dblink(
SELECT *
FROM dblink(
'host=127.0.0.1 dbname=inventory port=5432 user=alice password=1safepassword',
'CALL spock.wait_for_sync_event(true, ''n3'', ''0/1D0E510''::pg_lsn, 1200000)'
) AS t(result text);
'CALL spock.wait_for_sync_event(true, ''n3'', ''0/1D0E510''::pg_lsn, 1200)'
) AS t(result bool);
```

### Copy the Source to New Subscription
@@ -752,7 +752,7 @@ FROM dblink(
On n4, wait for the sync marker that matches the returned LSN (0/1E1F620)
to arrive and be processed. This is a blocking call; the call will not
return until the n4 subscription from n1 has replicated up to this LSN.
The timeout (1200000 milliseconds = 20 minutes) prevents waiting forever
The timeout (1200 seconds = 20 minutes) prevents waiting forever
if something goes wrong.
Comment on lines +755 to 756

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Timeout value is inconsistent with the earlier prerequisite note.

This section says 1200 seconds, but the earlier warning still says ZODAN uses an internal 180-second timeout. Please reconcile or explicitly explain the difference to avoid operator confusion.



@@ -761,8 +761,8 @@ In the following example, the command waits for the sync event on n4:
SELECT *
FROM dblink(
'host=127.0.0.1 dbname=inventory port=5435 user=alice password=1safepassword',
'CALL spock.wait_for_sync_event(true, ''n1'', ''0/1E1F620''::pg_lsn, 1200000)'
) AS t(result text);
'CALL spock.wait_for_sync_event(true, ''n1'', ''0/1E1F620''::pg_lsn, 1200)'
) AS t(result bool);
```


@@ -844,7 +844,7 @@ FROM dblink(
)
SELECT pg_replication_slot_advance(''spk_inventory_n2_sub_n2_n4'', lsn)
FROM lsn_cte'
) AS t(result text);
) AS t(result bool);
```
Comment on lines +847 to 848

⚠️ Potential issue | 🟠 Major | ⚡ Quick win


pg_replication_slot_advance result type is documented incorrectly in both examples.

AS t(result bool) is incorrect. The function returns (slot_name name, end_lsn pg_lsn) per PostgreSQL v17/v18 documentation, not a single boolean field. This will cause copy/paste failures when used in actual PostgreSQL code.

Proposed doc fix
-) AS t(result bool);
+) AS t(slot_name text, end_lsn pg_lsn);

Also applies to: lines 878-879



#### Repeat for n3 to n4
@@ -875,7 +875,7 @@ FROM dblink(
)
SELECT pg_replication_slot_advance(''spk_inventory_n3_sub_n3_n4'', lsn)
FROM lsn_cte'
) AS t(result text);
) AS t(result bool);
```


@@ -925,7 +925,7 @@ FROM dblink(
subscription_name := ''sub_n2_n4'',
immediate := true
)'
) AS t(result text);
) AS t(result bool);
```

#### Wait for Stored Sync Event from n2
@@ -948,8 +948,8 @@ SELECT sync_lsn FROM temp_sync_lsns WHERE origin_node = 'n2';
SELECT *
FROM dblink(
'host=127.0.0.1 dbname=inventory port=5435 user=alice password=1safepassword',
'CALL spock.wait_for_sync_event(true, ''n2'', ''0/1A7D1E0''::pg_lsn, 1200000)'
) AS t(result text);
'CALL spock.wait_for_sync_event(true, ''n2'', ''0/1A7D1E0''::pg_lsn, 1200)'
) AS t(result bool);
```

#### Verify the Subscription is Replicating
@@ -991,7 +991,7 @@ FROM dblink(
subscription_name := ''sub_n3_n4'',
immediate := true
)'
) AS t(result text);
) AS t(result bool);

-- Retrieve stored LSN
SELECT sync_lsn FROM temp_sync_lsns WHERE origin_node = 'n3';
@@ -1001,8 +1001,8 @@ SELECT sync_lsn FROM temp_sync_lsns WHERE origin_node = 'n3';
SELECT *
FROM dblink(
'host=127.0.0.1 dbname=inventory port=5435 user=alice password=1safepassword',
'CALL spock.wait_for_sync_event(true, ''n3'', ''0/1B8E2F0''::pg_lsn, 1200000)'
) AS t(result text);
'CALL spock.wait_for_sync_event(true, ''n3'', ''0/1B8E2F0''::pg_lsn, 1200)'
) AS t(result bool);

-- Verify status
SELECT subscription_name, status, provider_node
6 changes: 3 additions & 3 deletions docs/monitoring/spock_info.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,8 @@ The following table describes informational tables in the `spock` schema:

| Table Name | Description |
|---------------------|----------------------------|
| `channel_summary_stats` | This table tracks per-table statistics for a given subscription, including total inserts, updates, deletes, conflicts, and delta apply column changes. The table includes the following columns: `subid`, `sub_name`, `n_tup_ins`, `n_tup_upd`, `n_tup_del`, `n_conflict`, `n_dca` |
| `channel_table_stats` | This table is similar to `channel_summary_stats`, but aggregates statistics across subscriptions, showing overall metrics grouped by subscription. The table includes the following columns: `subid`, `relid`, `sub_name`, `table_name`, `n_tup_ins`, `n_tup_upd`, `n_tup_del`, `n_conflict`, `n_dca` |
| `channel_table_stats` | This view tracks per-table statistics for a given subscription, including total inserts, updates, deletes, conflicts, and delta apply column changes. The view includes the following columns: `subid`, `relid`, `sub_name`, `table_name`, `n_tup_ins`, `n_tup_upd`, `n_tup_del`, `n_conflict`, `n_dca` |
| `channel_summary_stats` | This view aggregates statistics from `channel_table_stats` across tables, showing overall metrics grouped by subscription. The view includes the following columns: `subid`, `sub_name`, `n_tup_ins`, `n_tup_upd`, `n_tup_del`, `n_conflict`, `n_dca` |
| `depend` | This is an internal-use table that tracks dependent objects (e.g., tables added for replication or row filters). If such objects are dropped, they are also removed from Spock’s tracking. The table includes the following columns: `classid`, `objid`, `objsubid`, `refclassid`, `refobjid`, `refobjsubid`, `deptype` |
| `exception_log` | This table logs unrecoverable errors or conflicts encountered by Spock during the replication process. The table includes the following columns: `remote_origin`, `remote_commit_ts`, `command_counter`, `retry_errored_at`, `remote_xid`, `local_origin`, `local_commit_ts`, `table_schema`, `table_name`, `operation` (contains one of the following: `BEGIN`, `COMMIT`, `INSERT`, `UPDATE`, `DELETE`, or `DDL`), `local_tup`, `remote_old_tup`, `remote_new_tup`, `ddl_statement`, `ddl_user`, `error_message` |
| `exception_status` | This table is not used internally by Spock. This table exists to support ACE by tracking specific status details. The table includes the following information columns: `remote_origin`, `remote_commit_ts`, `retry_errored_at`, `remote_xid`, `status`, `resolved_at`, `resolution_details` |
@@ -22,7 +22,7 @@ The following table describes informational tables in the `spock` schema:
| `replication_set_table` | This table contains one row per table that is in any replication set. It contains the `set_id`, `set_reloid` (the table name), `set_att_list`, and `set_row_filter`. The last two columns contain the row and column filters on that table for replication. |
| `resolutions` | This table contains one row per resolution made on this node. It contains the `id`, `node_name`, `log_time`, `relname`, `idxname`, `conflict_type` (`update_update`), `conflict_resolution` (`keep_local`), `local_origin`, `local_tuple`, `local_xid`, `local_timestamp`, `remote_origin`, `remote_tuple`, `remote_xid`, `remote_timestamp`, `remote_lsn`. |
| `sequence_state` | This is an internal-use table that stores state information for native sequence synchronization. The table includes the following columns: `seqoid`, `cache_size`, `last_value` |
| `subscription` | This table contains one row per subscription. It contains the following columns: `sub_id`, `sub_name`, `sub_origin`, `sub_target`, `sub_origin_if`, `sub_target_if`, `sub_enabled`, `sub_slot_name`, `sub_replication_sets` (an array of replication sets that have been added to the subscription), `sub_forward_origins`, `sub_apply_delay`, and `sub_force_text_transfer`. |
| `subscription` | This table contains one row per subscription. It contains the following columns: `sub_id`, `sub_name`, `sub_origin`, `sub_target`, `sub_origin_if`, `sub_target_if`, `sub_enabled`, `sub_slot_name`, `sub_replication_sets` (an array of replication sets that have been added to the subscription), `sub_forward_origins`, `sub_apply_delay`, `sub_force_text_transfer`, `sub_skip_lsn` (a `pg_lsn` set by `spock.sub_alter_skiplsn`), `sub_skip_schema` (an array of schemas to skip during apply), and `sub_created_at` (timestamp of subscription creation). |
| `tables` | This table contains one row per table in the database. It contains the following columns: `relid`, `nspname` (the schema), `relname` (the tablename), and `set_name` (`null` if that table is not added to any replication set). |

## Examples
17 changes: 14 additions & 3 deletions docs/monitoring/spock_sync_event.md
@@ -24,7 +24,13 @@ Invoked on the provider node, this function returns the current `pg_lsn`
value, representing a point-in-time value for your replication scenario. The
syntax of `spock.sync_event` is:

`spock.sync_event() RETURNS pg_lsn`
`spock.sync_event(transactional boolean DEFAULT false) RETURNS pg_lsn`

When `transactional` is `false` (the default), the sync event marker is
emitted into the WAL stream immediately, independent of the calling
transaction. When `transactional` is `true`, the marker is bound to the
calling transaction and is only visible to subscribers if the transaction
commits.
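
The two modes can be contrasted in a short sketch (using the signature shown above):

```sql
-- Non-transactional (default): the marker is written to WAL immediately.
SELECT spock.sync_event();

-- Transactional: the marker travels with the transaction and reaches
-- subscribers only if the transaction commits.
BEGIN;
SELECT spock.sync_event(true);
COMMIT;
```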

Invoked on a subscriber node, `spock.wait_for_sync_event` is available in two
flavors - the first uses the origin_id (an `oid`) as an identifier for the
@@ -62,6 +68,11 @@ On a provider node:

On a subscriber node:

`CALL spock.wait_for_sync_event(OUT result, 'provider_node', '0/16342B0', 10);`
`-- result: true (if applied within 10s), false otherwise`
```sql
CALL spock.wait_for_sync_event(NULL, 'provider_node', '0/16342B0', 10);
-- result: true (if applied within 10s), false otherwise
```

The first parameter is the OUT `result` placeholder; pass `NULL` for it in
the `CALL` statement and read the OUT value from the procedure result.

Comment on lines +71 to 78

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify the earlier wait_for_sync_event invocation style to avoid a misleading zero-arg call.

This updated example is correct, but the earlier inline spock.wait_for_sync_event() (Line 19) reads like a literal invocation and conflicts with the required parameters shown here. Please make the intro reference explicitly symbolic (for example, spock.wait_for_sync_event(...)) for consistency.


25 changes: 15 additions & 10 deletions docs/recovery/catastrophic_node_failure.md
@@ -307,15 +307,17 @@ when it should be `replicating` indicates that the node is behind. For
example:

```
sub_name | status | provider_node | replication_sets | lag
-----------+-------------+---------------+---------------------------------------+-----------
sub_n2_n1 | down | n1 | {default,default_insert_only,ddl_sql} | 00:05:12
sub_n2_n3 | replicating | n3 | {default,default_insert_only,ddl_sql} | 00:00:00
subscription_name | status | provider_node | replication_sets
-------------------+-------------+---------------+---------------------------------------
sub_n2_n1 | down | n1 | {default,default_insert_only,ddl_sql}
sub_n2_n3 | replicating | n3 | {default,default_insert_only,ddl_sql}
```

In this output, `sub_n2_n1` is down and lagging — that confirms n2 did not
receive all of n1's transactions before n1 failed. The subscriptions to n3,
n4, and n5 are still healthy.
In this output, `sub_n2_n1` is `down` — that confirms n2 lost its
connection to n1 (which has failed). The subscriptions to n3, n4, and n5
are still healthy. To gauge how far behind n2 is, query
[`spock.lag_tracker`](../monitoring/lag_tracking.md) for replication lag
in bytes and time.
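
For example (the column names follow the lag-tracking documentation referenced above; adjust to your release):

```sql
SELECT origin_name, receiver_name, commit_timestamp, replication_lag
FROM spock.lag_tracker;
```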

Determine the approximate time when the failed node (or nodes) failed.
You'll need this timestamp for the ACE commands in Phase 3 and 4. If you
@@ -377,14 +379,17 @@ leave others out of sync.
### Step 1: Get a List of All Tables to Check

First, get the list of tables that participate in replication. You can get
this from Spock replication sets. Connect to any surviving node (for
this from the `spock.tables` view, which lists every table along with the
replication set (if any) it belongs to. Connect to any surviving node (for
example, n3) and run:

```sql
SELECT * FROM spock.repset_list_tables('default');
SELECT nspname, relname, set_name
FROM spock.tables
WHERE set_name IS NOT NULL
ORDER BY set_name, nspname, relname;
```

If you use multiple replication sets, run this for each set.
Alternatively, you can list all tables in the schema you replicate:

```sql
9 changes: 5 additions & 4 deletions docs/spock_functions/functions/spock_max_proto_version.md
@@ -1,10 +1,10 @@
## NAME

spock.spock_max_proto_version()
spock_max_proto_version()

### SYNOPSIS

spock.spock_max_proto_version()
spock_max_proto_version()

### RETURNS

@@ -29,7 +29,8 @@ different Spock releases.
The protocol version is returned as an integer value. Higher numbers
indicate newer protocol versions with additional features.

This is a read-only query function that does not modify data.
This is a read-only query function that does not modify data. This function is
defined in the public schema.

### ARGUMENTS

@@ -40,7 +41,7 @@ This function takes no arguments.
The following command shows that the current version of Spock uses protocol
version 4:

postgres=# SELECT spock.spock_max_proto_version();
postgres=# SELECT spock_max_proto_version();
spock_max_proto_version
-------------------------
4