From db794d980bb6b275defdcfaf14ee5162bdb7bfa5 Mon Sep 17 00:00:00 2001 From: Rich Loveland Date: Mon, 23 Feb 2026 15:48:50 -0500 Subject: [PATCH 1/2] Clarify uncertainty error keys and document randomized anchor key tuning Fixes: - DOC-14735 - DOC-15172 Summary of changes: - v25.4 and below, and v26.1 and later: - Update ReadWithinUncertaintyIntervalError docs to: - Extend the example to include a larger `meta={key=/Table/...}` fragment - Add randomized anchor key tuning guidance to the shared `performance/reduce-contention.md` include: - Describe when to consider `kv.transaction.randomized_anchor_key.enabled` for workloads with large concurrent UPDATE/INSERT batches that create transaction record (anchor) hotspots - Emphasize that this setting randomizes anchor placement (not user data) to spread txn records across ranges - In v26.1, tie this guidance to the improved observability: contention events and logs already report the actual contention key, so anchor randomization is a secondary knob once true conflict locations are understood - v25.4 and below only: - Update ReadWithinUncertaintyIntervalError docs to: - Add an "Interpreting log messages" callout explaining that the logged key is the transaction record (anchor) key, not necessarily the actual conflict key - Note that SERIALIZATION_CONFLICT contention events recorded when `sql.contention.record_serialization_conflicts.enabled` is true also use this anchor key - v26.1 and later only: - Update ReadWithinUncertaintyIntervalError docs to: - Add an "Interpreting log messages" callout clarifying that in 26.1+ the logged key represents the actual contention key, and that earlier versions reported the anchor key - Document that contention events now use this contention key when the `sql.contention.record_serialization_conflicts.enabled` setting is enabled - Update 'Troubleshoot lock contention' page to include a new section on using randomized anchor keys --- .../v25.4/performance/reduce-contention.md | 4 +++- 
.../v26.1/performance/reduce-contention.md | 4 +++- .../transaction-retry-error-reference.md | 6 +++++- .../transaction-retry-error-reference.md | 6 +++++- .../v26.1/troubleshoot-lock-contention.md | 20 +++++++++++++++++++ 5 files changed, 36 insertions(+), 4 deletions(-) diff --git a/src/current/_includes/v25.4/performance/reduce-contention.md b/src/current/_includes/v25.4/performance/reduce-contention.md index 0f52e1f212a..bbf296f16cf 100644 --- a/src/current/_includes/v25.4/performance/reduce-contention.md +++ b/src/current/_includes/v25.4/performance/reduce-contention.md @@ -10,8 +10,10 @@ - If applicable to your workload, assign [column families]({% link {{ page.version.version }}/column-families.md %}#default-behavior), placing columns that are frequently read and columns that are frequently written into separate column families. Transactions that operate on disjoint column families are less likely to conflict. +- For workloads where large [`UPDATE`]({% link {{ page.version.version }}/update.md %}) or [`INSERT`]({% link {{ page.version.version }}/insert.md %}) transactions run concurrently over similar key ranges, watch for [transaction record]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records) anchor hotspots (for example, many concurrent transactions with [records]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records) on the same [range]({% link {{ page.version.version }}/architecture/glossary.md %}#range)). In these cases, consider enabling the [`kv.transaction.randomized_anchor_key.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-kv-transaction-randomized-anchor-key-enabled) cluster setting to randomize the location of transaction anchor keys. This can spread transaction records across ranges and reduce hotspotting. Only use this setting after confirming anchor hotspots via contention and range-level observability. 
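+
+  For example, after confirming an anchor hotspot, you might enable the setting as follows (a sketch; revert it by setting the value back to `false`):
+
+  {% include_cached copy-clipboard.html %}
+  ~~~ sql
+  SET CLUSTER SETTING kv.transaction.randomized_anchor_key.enabled = true;
+  ~~~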
+ - As a last resort, consider adjusting the [closed timestamp interval]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#closed-timestamps) using the `kv.closed_timestamp.target_duration` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) to reduce the likelihood of long-running write transactions having their [timestamps pushed]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#timestamp-cache). Adjust this setting carefully, and only if **no other mitigations are available**, because there can be downstream implications (for example, for historical reads, change data capture feeds, statistics collection, and zone configuration handling). For example, a transaction _A_ is forced to refresh (i.e., change its timestamp) due to hitting the maximum [_closed timestamp_]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#closed-timestamps) interval (closed timestamps enable [Follower Reads](follower-reads.html#how-stale-follower-reads-work) and [Change Data Capture (CDC)](change-data-capture-overview.html)). This can happen when transaction _A_ is a long-running transaction, and there is a write by another transaction to data that _A_ has already read. {{site.data.alerts.callout_info}} Increasing the `kv.closed_timestamp.target_duration` setting increases the amount of time by which the data available in [Follower Reads]({% link {{ page.version.version }}/follower-reads.md %}) and [CDC changefeeds]({% link {{ page.version.version }}/change-data-capture-overview.md %}) lags behind the current state of the cluster. In other words, there is a trade-off: if you absolutely must run long-running transactions concurrently with other transactions that write to the same data, you may have to settle for longer delays on Follower Reads and/or CDC to avoid frequent serialization errors. 
The anomaly that would be exhibited if these transactions were not retried is called [write skew](https://www.cockroachlabs.com/blog/what-write-skew-looks-like/). -{{site.data.alerts.end}} \ No newline at end of file +{{site.data.alerts.end}} diff --git a/src/current/_includes/v26.1/performance/reduce-contention.md b/src/current/_includes/v26.1/performance/reduce-contention.md index 0f52e1f212a..bbf296f16cf 100644 --- a/src/current/_includes/v26.1/performance/reduce-contention.md +++ b/src/current/_includes/v26.1/performance/reduce-contention.md @@ -10,8 +10,10 @@ - If applicable to your workload, assign [column families]({% link {{ page.version.version }}/column-families.md %}#default-behavior), placing columns that are frequently read and columns that are frequently written into separate column families. Transactions that operate on disjoint column families are less likely to conflict. +- For workloads where large [`UPDATE`]({% link {{ page.version.version }}/update.md %}) or [`INSERT`]({% link {{ page.version.version }}/insert.md %}) transactions run concurrently over similar key ranges, watch for [transaction record]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records) anchor hotspots (for example, many concurrent transactions with [records]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records) on the same [range]({% link {{ page.version.version }}/architecture/glossary.md %}#range)). In these cases, consider enabling the [`kv.transaction.randomized_anchor_key.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-kv-transaction-randomized-anchor-key-enabled) cluster setting to randomize the location of transaction anchor keys. This can spread transaction records across ranges and reduce hotspotting. Only use this setting after confirming anchor hotspots via contention and range-level observability. 
+ - As a last resort, consider adjusting the [closed timestamp interval]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#closed-timestamps) using the `kv.closed_timestamp.target_duration` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) to reduce the likelihood of long-running write transactions having their [timestamps pushed]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#timestamp-cache). Adjust this setting carefully, and only if **no other mitigations are available**, because there can be downstream implications (for example, for historical reads, change data capture feeds, statistics collection, and zone configuration handling). For example, a transaction _A_ is forced to refresh (i.e., change its timestamp) due to hitting the maximum [_closed timestamp_]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#closed-timestamps) interval (closed timestamps enable [Follower Reads](follower-reads.html#how-stale-follower-reads-work) and [Change Data Capture (CDC)](change-data-capture-overview.html)). This can happen when transaction _A_ is a long-running transaction, and there is a write by another transaction to data that _A_ has already read. {{site.data.alerts.callout_info}} Increasing the `kv.closed_timestamp.target_duration` setting increases the amount of time by which the data available in [Follower Reads]({% link {{ page.version.version }}/follower-reads.md %}) and [CDC changefeeds]({% link {{ page.version.version }}/change-data-capture-overview.md %}) lags behind the current state of the cluster. In other words, there is a trade-off: if you absolutely must run long-running transactions concurrently with other transactions that write to the same data, you may have to settle for longer delays on Follower Reads and/or CDC to avoid frequent serialization errors. 
The anomaly that would be exhibited if these transactions were not retried is called [write skew](https://www.cockroachlabs.com/blog/what-write-skew-looks-like/). -{{site.data.alerts.end}} \ No newline at end of file +{{site.data.alerts.end}} diff --git a/src/current/v25.4/transaction-retry-error-reference.md b/src/current/v25.4/transaction-retry-error-reference.md index f3ca35fae76..88082174364 100644 --- a/src/current/v25.4/transaction-retry-error-reference.md +++ b/src/current/v25.4/transaction-retry-error-reference.md @@ -192,7 +192,7 @@ See [Minimize transaction retry errors](#minimize-transaction-retry-errors) for ``` TransactionRetryWithProtoRefreshError: ReadWithinUncertaintyIntervalError: read at time 1591009232.376925064,0 encountered previous write with future timestamp 1591009232.493830170,0 within uncertainty interval `t <= 1591009232.587671686,0`; - observed timestamps: [{1 1591009232.587671686,0} {5 1591009232.376925064,0}] + observed timestamps: [{1 1591009232.587671686,0} {5 1591009232.376925064,0}] meta={key=/Table/9373/10/5293921467191001339/0 ...} ``` **Error type:** Serialization error @@ -221,6 +221,10 @@ Under [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %} 1. `ReadWithinUncertaintyIntervalError` errors are only returned in rare cases that can be avoided by adjusting the [result buffer size](#result-buffer-size). +**Interpreting log messages:** + +In CockroachDB {{ page.version.version }}, the `meta={... key=/Table/...}` field that appears in log output for `ReadWithinUncertaintyIntervalError` and related serialization conflicts identifies the transaction's [transaction record (anchor) key]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records), not necessarily the key where the conflict occurred. This anchor key is the first key written by the transaction and is where its record is stored. 
Contention events that are recorded when [`sql.contention.record_serialization_conflicts.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-sql-contention-record-serialization-conflicts-enabled) is `true` use this anchor key when populating the recorded conflict. + {{site.data.alerts.callout_info}} Uncertainty errors are a sign of transaction conflict. For more information about transaction conflicts, see [Transaction conflicts]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-conflicts). {{site.data.alerts.end}} diff --git a/src/current/v26.1/transaction-retry-error-reference.md b/src/current/v26.1/transaction-retry-error-reference.md index f3ca35fae76..0279318a6de 100644 --- a/src/current/v26.1/transaction-retry-error-reference.md +++ b/src/current/v26.1/transaction-retry-error-reference.md @@ -192,7 +192,7 @@ See [Minimize transaction retry errors](#minimize-transaction-retry-errors) for ``` TransactionRetryWithProtoRefreshError: ReadWithinUncertaintyIntervalError: read at time 1591009232.376925064,0 encountered previous write with future timestamp 1591009232.493830170,0 within uncertainty interval `t <= 1591009232.587671686,0`; - observed timestamps: [{1 1591009232.587671686,0} {5 1591009232.376925064,0}] + observed timestamps: [{1 1591009232.587671686,0} {5 1591009232.376925064,0}] meta={id=a3458962 key=/Table/9373/10/5293921467191001339/0 ...} ``` **Error type:** Serialization error @@ -221,6 +221,10 @@ Under [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %} 1. `ReadWithinUncertaintyIntervalError` errors are only returned in rare cases that can be avoided by adjusting the [result buffer size](#result-buffer-size). +**Interpreting log messages:** + +In CockroachDB {{ page.version.version }}, the `meta={... 
key=/Table/...}` field in log output for `ReadWithinUncertaintyIntervalError` and related serialization conflicts identifies the **actual contention key** (the key where the conflicting read or write occurred). Earlier versions could instead report the transaction's [anchor key]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records), which made it harder to locate the true point of conflict. Contention events that are recorded when [`sql.contention.record_serialization_conflicts.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-sql-contention-record-serialization-conflicts-enabled) is `true` use this contention key when populating the recorded conflict. + {{site.data.alerts.callout_info}} Uncertainty errors are a sign of transaction conflict. For more information about transaction conflicts, see [Transaction conflicts]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-conflicts). {{site.data.alerts.end}} diff --git a/src/current/v26.1/troubleshoot-lock-contention.md b/src/current/v26.1/troubleshoot-lock-contention.md index 8a398623980..6c4f4346465 100644 --- a/src/current/v26.1/troubleshoot-lock-contention.md +++ b/src/current/v26.1/troubleshoot-lock-contention.md @@ -271,6 +271,26 @@ Consider the following when using [historical queries]({% link {{ page.version.v - Historical queries operate below [closed timestamps]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#closed-timestamps) and therefore have perfect concurrency characteristics - they never wait on anything and never block anything. - Historical queries have the best possible performance, since they are served by the nearest [replica]({% link {{ page.version.version }}/architecture/glossary.md %}#replica). 
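For example, a historical read served at the follower read timestamp can look like the following (a sketch that assumes a hypothetical `orders` table):

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT * FROM orders AS OF SYSTEM TIME follower_read_timestamp() WHERE id = 1;
~~~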
+### Randomize transaction anchor keys for large batched updates or inserts + +In some workloads with large batched [`UPDATE`]({% link {{ page.version.version }}/update.md %}) or [`INSERT`]({% link {{ page.version.version }}/insert.md %}) transactions, many concurrent transactions can end up with their [transaction records]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records) colocated on the same [range]({% link {{ page.version.version }}/architecture/glossary.md %}#range). The [leaseholder]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-leaseholder) for that range must coordinate [intent resolution]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#write-intents) for all of those transactions, and can become a [hotspot]({% link {{ page.version.version }}/understand-hotspots.md %}) even if the actual user data being modified is well-distributed. + +When troubleshooting contention or hotspots that you have confirmed are due to transaction record placement (for example, using the guidance in [Monitor and analyze transaction contention]({% link {{ page.version.version }}/monitor-and-analyze-transaction-contention.md %})), you can experiment with enabling the cluster setting [`kv.transaction.randomized_anchor_key.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-kv-transaction-randomized-anchor-key-enabled). + +When set to `true`, this setting randomizes a transaction's *anchor key* (the key where its transaction record is stored). This can spread transaction records across ranges and reduce hotspots for large batched update or insert workloads. + +Consider the following when using this setting: + +- It is primarily useful for workloads that issue large batched updates or inserts and show clear evidence of transaction-record hotspotting. 
+- It does **not** change which user data rows are read or written; it only affects where the transaction record (metadata) is stored. +- Treat this as a tuning and troubleshooting knob: enable it only after identifying transaction-record hotspots, and compare contention and latency metrics before and after the change. +- If enabling the setting does not improve the hotspot symptoms, or if it has unintended side effects, you can disable it again with: + + {% include_cached copy-clipboard.html %} + ~~~ sql + SET CLUSTER SETTING kv.transaction.randomized_anchor_key.enabled = false; + ~~~ + ### "Fail fast" method One way to reduce lock contention on writes is to "fail fast" by issuing [SELECT FOR UPDATE ... NOWAIT]({% link {{ page.version.version }}/select-for-update.md %}#wait-policies) before the write. This can reduce or prevent failures late in a transaction's life (for example, at `COMMIT` time) by returning an error early in a contention situation if a row cannot be locked immediately. 
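A minimal sketch of this pattern, assuming a hypothetical `accounts` table:

{% include_cached copy-clipboard.html %}
~~~ sql
BEGIN;
-- Errors immediately if another transaction already holds a lock
-- on the row, instead of blocking (and possibly failing at COMMIT).
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
COMMIT;
~~~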
An example of this method is *Transaction 6* in [Example 2](#example-2): From ee10154cd64b0fd1662a0fafc4ec192835ed549c Mon Sep 17 00:00:00 2001 From: Rich Loveland Date: Tue, 24 Feb 2026 15:23:08 -0500 Subject: [PATCH 2/2] Update with angles-n-daemons feedback (1) --- .../v25.4/transaction-retry-error-reference.md | 15 ++++++++++++--- .../v26.1/transaction-retry-error-reference.md | 16 ++++++++++++---- 2 files changed, 24 insertions(+), 7 deletions(-) diff --git a/src/current/v25.4/transaction-retry-error-reference.md b/src/current/v25.4/transaction-retry-error-reference.md index 88082174364..e3b5bbfa6e0 100644 --- a/src/current/v25.4/transaction-retry-error-reference.md +++ b/src/current/v25.4/transaction-retry-error-reference.md @@ -75,6 +75,18 @@ Increase the chance that CockroachDB can [automatically retry]({% link {{ page.v {% include {{ page.version.version }}/performance/increase-server-side-retries.md %} +### Interpreting log messages + +In CockroachDB {{ page.version.version }}, the `meta={... key=/Table/...}` field that appears in log output for serialization conflicts identifies the transaction's [transaction record key]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records) (also known as the _anchor key_), not necessarily the key where the conflict occurred. This anchor key is the first key written by the transaction and is where its record is stored. + +[Contention events]({% link {{ page.version.version }}/crdb-internal.md %}#view-all-contention-events) that are recorded when [`sql.contention.record_serialization_conflicts.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-sql-contention-record-serialization-conflicts-enabled) is `true` use this anchor key when populating the recorded conflict. 
+ +Only the following error types may add a conflicting key to a contention event: + +- `TransactionRetryError` +- `WriteTooOld` +- `ExclusionViolationError` + ## Transaction retry error reference Note that your application's retry logic does not need to distinguish between the different types of serialization errors. They are listed here for reference during [advanced troubleshooting]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). @@ -221,9 +233,6 @@ Under [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %} 1. `ReadWithinUncertaintyIntervalError` errors are only returned in rare cases that can be avoided by adjusting the [result buffer size](#result-buffer-size). -**Interpreting log messages:** - -In CockroachDB {{ page.version.version }}, the `meta={... key=/Table/...}` field that appears in log output for `ReadWithinUncertaintyIntervalError` and related serialization conflicts identifies the transaction's [transaction record (anchor) key]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records), not necessarily the key where the conflict occurred. This anchor key is the first key written by the transaction and is where its record is stored. Contention events that are recorded when [`sql.contention.record_serialization_conflicts.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-sql-contention-record-serialization-conflicts-enabled) is `true` use this anchor key when populating the recorded conflict. {{site.data.alerts.callout_info}} Uncertainty errors are a sign of transaction conflict. For more information about transaction conflicts, see [Transaction conflicts]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-conflicts). 
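As a usage note for the contention events described above: they can be inspected with a query along these lines (a sketch; verify column names such as `contention_type` and `contending_pretty_key` against the `crdb_internal` schema of your version):

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT collection_ts, contention_type, contending_pretty_key
FROM crdb_internal.transaction_contention_events
WHERE contention_type = 'SERIALIZATION_CONFLICT'
ORDER BY collection_ts DESC;
~~~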
diff --git a/src/current/v26.1/transaction-retry-error-reference.md b/src/current/v26.1/transaction-retry-error-reference.md index 0279318a6de..52c8177fc44 100644 --- a/src/current/v26.1/transaction-retry-error-reference.md +++ b/src/current/v26.1/transaction-retry-error-reference.md @@ -75,6 +75,18 @@ Increase the chance that CockroachDB can [automatically retry]({% link {{ page.v {% include {{ page.version.version }}/performance/increase-server-side-retries.md %} +### Interpreting log messages + +In CockroachDB {{ page.version.version }}, the `meta={... key=/Table/...}` field that appears in log output for serialization conflicts identifies the transaction's [transaction record key]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records) (also known as the _anchor key_), not necessarily the key where the conflict occurred. This anchor key is the first key written by the transaction and is where its record is stored. + +{% include_cached new-in.html version="v26.1" %} [Contention events]({% link {{ page.version.version }}/monitor-and-analyze-transaction-contention.md %}) that are recorded when [`sql.contention.record_serialization_conflicts.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-sql-contention-record-serialization-conflicts-enabled) is `true` use the actual key where contention occurred (not the anchor key, as prior versions did) when populating the recorded conflict. + +Only the following error types may add a conflicting key to a contention event: + +- `TransactionRetryError` +- `WriteTooOld` +- `ExclusionViolationError` + ## Transaction retry error reference Note that your application's retry logic does not need to distinguish between the different types of serialization errors. They are listed here for reference during [advanced troubleshooting]({% link {{ page.version.version }}/performance-recipes.md %}#transaction-contention). 
@@ -221,10 +233,6 @@ Under [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %} 1. `ReadWithinUncertaintyIntervalError` errors are only returned in rare cases that can be avoided by adjusting the [result buffer size](#result-buffer-size). -**Interpreting log messages:** - -In CockroachDB {{ page.version.version }}, the `meta={... key=/Table/...}` field in log output for `ReadWithinUncertaintyIntervalError` and related serialization conflicts identifies the **actual contention key** (the key where the conflicting read or write occurred). Earlier versions could instead report the transaction's [anchor key]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-records), which made it harder to locate the true point of conflict. Contention events that are recorded when [`sql.contention.record_serialization_conflicts.enabled`]({% link {{ page.version.version }}/cluster-settings.md %}#setting-sql-contention-record-serialization-conflicts-enabled) is `true` use this contention key when populating the recorded conflict. - {{site.data.alerts.callout_info}} Uncertainty errors are a sign of transaction conflict. For more information about transaction conflicts, see [Transaction conflicts]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#transaction-conflicts). {{site.data.alerts.end}}