[26.1] Ordered rack/region preference for leader pinning #1598
base: v-WIP/26.1
Changes from all commits
3bc048c
8d05e38
8420053
c7d849a
83b4457
ad9aeda
99c36ef
364c1d8
1a1ec82
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,10 +1,21 @@ | ||||||||||||||
| = Leader Pinning | ||||||||||||||
| :description: Learn about leader pinning and how to configure a preferred partition leader location based on cloud availability zones or regions. | ||||||||||||||
| = Configure Leader Pinning | ||||||||||||||
| :description: Learn about Leader Pinning and how to configure a preferred partition leader location based on cloud availability zones or regions. | ||||||||||||||
| :page-topic-type: how-to | ||||||||||||||
| :personas: streaming_developer, platform_admin | ||||||||||||||
| :learning-objective-1: Configure preferred partition leader placement using rack labels | ||||||||||||||
| :learning-objective-2: Configure ordered rack preference for priority-based leader failover | ||||||||||||||
| :learning-objective-3: Identify conditions where Leader Pinning cannot place leaders in preferred racks | ||||||||||||||
| // tag::single-source[] | ||||||||||||||
|
|
||||||||||||||
| Produce requests that write data to Redpanda topics go through the topic partition leader, which syncs messages across its follower replicas. For a Redpanda cluster deployed across multiple availability zones (AZs), leader pinning ensures that a topic's partition leaders are geographically closer to clients, which helps decrease networking costs and guarantees lower latency. | ||||||||||||||
| Produce requests that write data to Redpanda topics route through the topic partition leader, which syncs messages across its follower replicas. For a Redpanda cluster deployed across multiple availability zones (AZs), Leader Pinning ensures that a topic's partition leaders are geographically closer to clients, which helps decrease networking costs and guarantees lower latency. | ||||||||||||||
|
Contributor
Suggested change
Contributor
Also, @kbatuigas can you add a glossterm for Leader Pinning and link to it from here? thx |
||||||||||||||
|
|
||||||||||||||
| If consumers are located in the same preferred region or AZ for leader pinning, and you have not set up xref:develop:consume-data/follower-fetching.adoc[follower fetching], leader pinning can also help reduce networking costs on consume requests. | ||||||||||||||
| If consumers are located in the same preferred region or AZ for Leader Pinning, and you have not set up xref:develop:consume-data/follower-fetching.adoc[follower fetching], Leader Pinning can also help reduce networking costs on consume requests. | ||||||||||||||
|
Contributor
Thinking maybe you should also mention Cloud Topics, as this is another way to reduce networking costs (at a slight expense to latency, but great for logs, data not reliant on speed). |
||||||||||||||
|
|
||||||||||||||
| After reading this page, you will be able to: | ||||||||||||||
|
|
||||||||||||||
| * [ ] {learning-objective-1} | ||||||||||||||
| * [ ] {learning-objective-2} | ||||||||||||||
| * [ ] {learning-objective-3} | ||||||||||||||
|
|
||||||||||||||
| ifndef::env-cloud[] | ||||||||||||||
| == Prerequisites | ||||||||||||||
|
|
@@ -14,61 +25,128 @@ ifndef::env-cloud[] | |||||||||||||
| include::shared:partial$enterprise-license.adoc[] | ||||||||||||||
| ==== | ||||||||||||||
|
|
||||||||||||||
| Before you can enable leader pinning, you must xref:manage:rack-awareness.adoc#configure-rack-awareness[configure rack awareness] on the cluster. If the config_ref:enable_rack_awareness,true,properties/cluster-properties[] cluster configuration property is set to `false`, leader pinning is disabled across the cluster. | ||||||||||||||
| Before you can enable Leader Pinning, you must xref:manage:rack-awareness.adoc#configure-rack-awareness[configure rack awareness] on the cluster. If the config_ref:enable_rack_awareness,true,properties/cluster-properties[] cluster configuration property is set to `false`, Leader Pinning is disabled across the cluster. | ||||||||||||||
|
|
||||||||||||||
| endif::[] | ||||||||||||||
|
|
||||||||||||||
| ifndef::env-cloud[] | ||||||||||||||
| == Configure leader pinning | ||||||||||||||
| == Set leader rack preferences | ||||||||||||||
|
|
||||||||||||||
| You can use both a topic configuration property and a cluster configuration property to configure leader pinning. | ||||||||||||||
| You can use both a topic configuration property and a cluster configuration property to configure Leader Pinning. | ||||||||||||||
|
Contributor
Is this an either/or? You can configure Leader Pinning using either a topic configuration property or cluster configuration property? Or, do you have to configure both the topic and cluster configs?
Contributor
Why is the explanation for this sentence in a separate paragraph? Maybe keep together so people won't ask the same questions I asked above? |
||||||||||||||
|
|
||||||||||||||
| You can set the topic configuration property for individual topics only, or set the cluster-wide configuration property that will enable leader pinning by default for all topics. You can also use a combination in which a default setting applies across the cluster, and you toggle the setting on or off for specific topics. | ||||||||||||||
| You can set the topic configuration property for individual topics only, or set the cluster-wide configuration property that enables Leader Pinning by default for all topics. You can also use a combination in which a default setting applies across the cluster, and you toggle the setting on or off for specific topics. | ||||||||||||||
|
|
||||||||||||||
| This configuration is based on the following scenario: you have Redpanda deployed in a multi-AZ or multi-region cluster, and you have configured each broker so that the config_ref:rack,true,properties/broker-properties[] configuration property contains racks corresponding to the AZs: | ||||||||||||||
|
|
||||||||||||||
| * Set the topic configuration property xref:reference:properties/topic-properties.adoc#redpandaleaderspreference[`redpanda.leaders.preference`]. The property accepts the following string values: | ||||||||||||||
| * Set the topic configuration property xref:reference:properties/topic-properties.adoc#redpanda-leaders-preference[`redpanda.leaders.preference`]. The property accepts the following string values: | ||||||||||||||
|
Contributor
Suggested change
|
||||||||||||||
| + | ||||||||||||||
| -- | ||||||||||||||
| ** `none`: Opt out the topic from leader pinning. | ||||||||||||||
| ** `none`: Disable Leader Pinning for the topic. | ||||||||||||||
| ** `racks:<rack1>[,<rack2>,...]`: Specify the preferred location (rack) of all topic partition leaders. The list can contain one or more racks, and you can list the racks in any order. Spaces in the list are ignored, for example: `racks:rack1,rack2` and `racks: rack1, rack2` are equivalent. You cannot specify empty racks, for example: `racks: rack1,,rack2`. If you specify multiple racks, Redpanda tries to distribute the partition leader locations equally across brokers in these racks. | ||||||||||||||
| ** `ordered_racks:<rack1>[,<rack2>,...]`: (Supported in Redpanda version 26.1 or later) Specify the preferred racks in priority order. Redpanda places leaders in the first listed rack when available, failing over to each subsequent rack when higher-priority racks are unavailable. If all listed racks are unavailable, leaders fall back to any other available brokers. Brokers with no rack assignment are treated as lowest priority. | ||||||||||||||
| + | ||||||||||||||
| To find the rack identifier, run `rpk cluster info`. | ||||||||||||||
| Use `ordered_racks` for multi-region deployments with a primary region for leaders and explicit failover to a disaster recovery site. | ||||||||||||||
| -- | ||||||||||||||
| + | ||||||||||||||
| This property inherits the default value from the cluster property `default_leaders_preference`. | ||||||||||||||
| + | ||||||||||||||
| To find the rack identifiers of all brokers, run: | ||||||||||||||
| + | ||||||||||||||
| [,bash] | ||||||||||||||
| ---- | ||||||||||||||
| rpk cluster info | ||||||||||||||
| ---- | ||||||||||||||
| + | ||||||||||||||
| To set the topic property: | ||||||||||||||
| + | ||||||||||||||
| [,bash] | ||||||||||||||
| ---- | ||||||||||||||
| rpk topic alter-config <topic-name> --set redpanda.leaders.preference=ordered_racks:<rack1>,<rack2> | ||||||||||||||
| ---- | ||||||||||||||
|
Comment on lines
51
to
+65
Contributor
Contributor
|
||||||||||||||
|
|
||||||||||||||
| * Set the cluster configuration property config_ref:default_leaders_preference,true,properties/cluster-properties[], which specifies the default leader pinning configuration for all topics that don’t have `redpanda.leaders.preference` explicitly set. It accepts values in the same format as `redpanda.leaders.preference`. Default: `none` | ||||||||||||||
| * Set the cluster configuration property config_ref:default_leaders_preference,true,properties/cluster-properties[], which specifies the default Leader Pinning configuration for all topics that don’t have `redpanda.leaders.preference` explicitly set. It accepts values in the same format as `redpanda.leaders.preference`. Default: `none` | ||||||||||||||
| + | ||||||||||||||
| This property also affects internal topics, such as `__consumer_offsets` and transaction coordinators. All offset tracking and transaction coordination requests get placed within the preferred regions or AZs for all clients, so you see end-to-end latency and networking cost benefits. | ||||||||||||||
| + | ||||||||||||||
| To set the cluster property: | ||||||||||||||
| + | ||||||||||||||
| [,bash] | ||||||||||||||
| ---- | ||||||||||||||
| rpk cluster config set default_leaders_preference ordered_racks:<rack1>,<rack2> | ||||||||||||||
|
Contributor
hmmm, why is this set/formatted differently from |
||||||||||||||
| ---- | ||||||||||||||
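The value format and the topic/cluster inheritance described above can be sketched in Python. This is only an illustration under stated assumptions, not Redpanda's implementation: `LeadersPreference`, `parse_leaders_preference`, and `effective_preference` are hypothetical names, and the sketch assumes that "spaces are ignored" means whitespace around each rack name is trimmed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LeadersPreference:
    kind: str               # "none", "racks", or "ordered_racks"
    racks: Tuple[str, ...]  # empty when kind is "none"


def parse_leaders_preference(value: str) -> LeadersPreference:
    """Parse a preference string (hypothetical helper, not Redpanda code)."""
    value = value.strip()
    if value == "none":
        return LeadersPreference("none", ())
    kind, sep, rest = value.partition(":")
    kind = kind.strip()
    if not sep or kind not in ("racks", "ordered_racks"):
        raise ValueError(f"unsupported preference: {value!r}")
    # Assumption: spaces around rack names are ignored, so trim each entry.
    racks = tuple(r.strip() for r in rest.split(","))
    # Empty entries such as "rack1,,rack2" are rejected.
    if "" in racks:
        raise ValueError(f"empty rack in preference: {value!r}")
    return LeadersPreference(kind, racks)


def effective_preference(topic_value: Optional[str],
                         cluster_default: str = "none") -> LeadersPreference:
    """Use the topic value if explicitly set, else the cluster default."""
    raw = cluster_default if topic_value is None else topic_value
    return parse_leaders_preference(raw)
```

In this sketch, a topic with no explicit value picks up the cluster default, while a topic explicitly set to `none` opts out even when the cluster default names racks.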
|
|
||||||||||||||
| If there is more than one broker in the preferred AZ (or AZs), leader pinning distributes partition leaders uniformly across brokers in the AZ. | ||||||||||||||
| If there is more than one broker in the preferred AZ (or AZs), Leader Pinning distributes partition leaders uniformly across brokers in the AZ. | ||||||||||||||
|
|
||||||||||||||
| endif::[] | ||||||||||||||
|
|
||||||||||||||
| ifdef::env-cloud[] | ||||||||||||||
| == Configure leader pinning | ||||||||||||||
| == Set leader rack preferences | ||||||||||||||
|
Contributor
|
||||||||||||||
|
|
||||||||||||||
| Configure leader pinning if you have Redpanda deployed in a multi-AZ or multi-region cluster and your ingress is concentrated in a particular AZ or region. | ||||||||||||||
| Configure Leader Pinning if you have Redpanda deployed in a multi-AZ or multi-region cluster and your ingress is concentrated in a particular AZ or region. | ||||||||||||||
|
|
||||||||||||||
| Use the topic configuration property `redpanda.leaders.preference` to configure leader pinning for individual topics. The property accepts the following string values: | ||||||||||||||
| Use the topic configuration property `redpanda.leaders.preference` to configure Leader Pinning for individual topics. The property accepts the following string values: | ||||||||||||||
|
|
||||||||||||||
| ** `none`: Opt out the topic from leader pinning. | ||||||||||||||
| ** `none`: Disable Leader Pinning for the topic. | ||||||||||||||
| ** `racks:<rack1>[,<rack2>,...]`: Specify the preferred location (rack) of all topic partition leaders. The list can contain one or more racks, and you can list the racks in any order. Spaces in the list are ignored, for example: `racks:rack1,rack2` and `racks: rack1, rack2` are equivalent. You cannot specify empty racks, for example: `racks: rack1,,rack2`. If you specify multiple racks, Redpanda tries to distribute the partition leader locations equally across brokers in these racks. | ||||||||||||||
| ** `ordered_racks:<rack1>[,<rack2>,...]`: (Supported in Redpanda version 26.1 or later) Specify the preferred racks in priority order. Redpanda places leaders in the first listed rack when available, failing over to each subsequent rack when higher-priority racks are unavailable. If all listed racks are unavailable, leaders fall back to any other available brokers. Brokers with no rack assignment are treated as lowest priority. | ||||||||||||||
| + | ||||||||||||||
| To find the rack identifiers of all brokers, run: | ||||||||||||||
| + | ||||||||||||||
| [,bash] | ||||||||||||||
| ---- | ||||||||||||||
| rpk cluster info | ||||||||||||||
| ---- | ||||||||||||||
| + | ||||||||||||||
| To set the topic property: | ||||||||||||||
| + | ||||||||||||||
| To find the rack identifier, run `rpk cluster info`. | ||||||||||||||
| [,bash] | ||||||||||||||
| ---- | ||||||||||||||
| rpk topic alter-config <topic-name> --set redpanda.leaders.preference=ordered_racks:<rack1>,<rack2> | ||||||||||||||
| ---- | ||||||||||||||
|
|
||||||||||||||
| If there is more than one broker in the preferred AZ (or AZs), leader pinning distributes partition leaders uniformly across brokers in the AZ. | ||||||||||||||
| If there is more than one broker in the preferred AZ (or AZs), Leader Pinning distributes partition leaders uniformly across brokers in the AZ. | ||||||||||||||
|
|
||||||||||||||
| endif::[] | ||||||||||||||
|
|
||||||||||||||
| == Leader pinning failover across availability zones | ||||||||||||||
| == Limitations | ||||||||||||||
|
|
||||||||||||||
| Leader Pinning controls which replica is elected as leader, and does not move replicas to different brokers. If all of a topic's replicas are on brokers in non-preferred racks, no replica exists in the preferred racks to elect as leader, and Redpanda may elect a non-preferred leader indefinitely. | ||||||||||||||
|
|
||||||||||||||
| For example, consider a cluster deployed across four racks (A, B, C, D) with Leader Pinning configured as `ordered_racks:A,B,C,D`. With a replication factor of 3, rack awareness can only place replicas in three of the four racks. If the highest-priority rack (A) does not receive a replica, no replica exists there to elect as leader, and Redpanda may elect a non-preferred leader indefinitely. | ||||||||||||||
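The four-rack example can be modeled with a small Python sketch. It only illustrates the constraint that a leader must be elected from an existing replica; `leader_candidate_racks` is a hypothetical helper, and the replica placement used in the usage note is an assumed outcome of rack-aware placement, not something Redpanda guarantees.

```python
from typing import List, Set


def leader_candidate_racks(ordered_racks: List[str],
                           replica_racks: Set[str]) -> List[str]:
    """Return the preferred racks, in priority order, that hold a replica.

    A rack without a replica cannot host the leader, no matter how high
    it ranks in the preference list.
    """
    return [rack for rack in ordered_racks if rack in replica_racks]
```

For `ordered_racks:A,B,C,D` with replicas assumed to be in racks B, C, and D, the function returns `["B", "C", "D"]`: rack A is absent because it has no replica to elect. For a preference of only rack A with the same replicas, the result is an empty list, which corresponds to the case where no preferred rack can host the leader.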
|
|
||||||||||||||
| ifndef::env-cloud[] | ||||||||||||||
| To prevent this scenario: | ||||||||||||||
|
|
||||||||||||||
| * Enable config_ref:enable_rack_awareness,true,properties/cluster-properties[`enable_rack_awareness`] to distribute replicas across racks automatically. | ||||||||||||||
| * Ensure the topic's replication factor at least equals the total number of racks in the cluster, so every rack, including the highest-priority rack, receives a replica. | ||||||||||||||
| * If needed, manually reassign replicas to ensure the highest-priority rack receives one. Note that the partition balancer may move replicas again after manual reassignment. | ||||||||||||||
|
|
||||||||||||||
| endif::[] | ||||||||||||||
| ifdef::env-cloud[] | ||||||||||||||
| To prevent this scenario, ensure the topic's replication factor at least equals the total number of racks in the cluster, so every rack, including the highest-priority rack, receives a replica. | ||||||||||||||
kbatuigas marked this conversation as resolved.
|
||||||||||||||
|
|
||||||||||||||
| endif::[] | ||||||||||||||
|
|
||||||||||||||
| == Leader Pinning failover across availability zones | ||||||||||||||
|
|
||||||||||||||
| If there are three AZs: A, B, and C, and A becomes unavailable, the failover behavior with `racks` is as follows: | ||||||||||||||
|
|
||||||||||||||
| * A topic with `A` as the preferred leader AZ will have its partition leaders uniformly distributed across B and C. | ||||||||||||||
| * A topic with `A,B` as the preferred leader AZs will have its partition leaders in B. | ||||||||||||||
| * A topic with `B` as the preferred leader AZ will have its partition leaders in B as well. | ||||||||||||||
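The three cases above follow one rule, which can be sketched in Python: leaders go to preferred racks that are still available, and if none are available they spread across whatever racks remain. `leader_racks_unordered` is a hypothetical name, and the sketch models only rack selection, not how leaders are balanced across brokers within a rack.

```python
from typing import Iterable, List


def leader_racks_unordered(preferred: Iterable[str],
                           available: Iterable[str]) -> List[str]:
    """Return the racks that may host leaders for a `racks:` preference."""
    preferred_set = set(preferred)
    available_sorted = sorted(set(available))
    eligible = [rack for rack in available_sorted if rack in preferred_set]
    # If no preferred rack is available, fall back to all available racks.
    return eligible if eligible else available_sorted
```

With AZ A unavailable (so only B and C are available), a preference of A yields B and C, a preference of A,B yields B, and a preference of B yields B.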
|
Comment on lines
+134
to
+136
Contributor
Suggested change
Contributor
At first it was confusing to see the init "A"...especially since it is also the name of one of the AZs. |
||||||||||||||
|
|
||||||||||||||
| === Failover with ordered rack preference | ||||||||||||||
|
|
||||||||||||||
| With `ordered_racks`, the failover order follows the configured priority list. Leaders move to the next available rack in the list when higher-priority racks become unavailable. | ||||||||||||||
|
|
||||||||||||||
| For a topic configured with `ordered_racks:A,B,C`: | ||||||||||||||
|
|
||||||||||||||
| If there are three AZs: A, B, and C, and A becomes unavailable, the failover behavior is as follows: | ||||||||||||||
| * A topic with `A` as the first-priority rack will have its partition leaders in A. | ||||||||||||||
|
Contributor
Suggested change
|
||||||||||||||
| * If A becomes unavailable, leaders move to B. | ||||||||||||||
| * If A and B become unavailable, leaders move to C. | ||||||||||||||
| * If A, B, and C all become unavailable, leaders fall back to any available brokers. | ||||||||||||||
kbatuigas marked this conversation as resolved.
|
||||||||||||||
|
|
||||||||||||||
| * A topic with "A" as the preferred leader AZ will have its partition leaders uniformly distributed across B and C. | ||||||||||||||
| * A topic with "A,B" as the preferred leader AZs will have its partition leaders in B. | ||||||||||||||
| * A topic with “B” as the preferred leader AZ will have its partition leaders in B as well. | ||||||||||||||
| If a higher-priority rack recovers and the topic's replication factor ensures that rack receives a replica, Redpanda automatically moves leaders back to the highest available preferred rack. | ||||||||||||||
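The ordered failover and recovery behavior can be sketched as a first-match search over the priority list. This is a minimal Python illustration, not Redpanda's election logic: `ordered_leader_rack` is a hypothetical helper, and "available" is assumed to mean that the rack has a live broker holding a replica of the partition.

```python
from typing import List, Optional, Set


def ordered_leader_rack(ordered_racks: List[str],
                        available_racks: Set[str]) -> Optional[str]:
    """Return the highest-priority available rack, or None.

    None means every listed rack is unavailable, in which case leaders
    fall back to any other available broker.
    """
    for rack in ordered_racks:
        if rack in available_racks:
            return rack
    return None
```

For `ordered_racks:A,B,C`, the sketch selects A while A is available, B once A is lost, C once A and B are lost, and A again as soon as A returns to the available set.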
|
|
||||||||||||||
| == Suggested reading | ||||||||||||||
|
Contributor
Add Cloud Topics? |
||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||



Note that we'll also need to add these attributes to https://github.com/redpanda-data/cloud-docs/blob/main/modules/develop/pages/produce-data/leader-pinning.adoc cc @micheleRP