DOC-1600 Document feature Fix cross region RRRs on AWS#1595
📝 Walkthrough
This change updates documentation for Remote Read Replica (RRR) topics in the Redpanda platform. It expands AWS cross-region scenario support by introducing new prerequisites requiring
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
@mattschumpert please confirm that we're making edit |
Actionable comments posted: 2
🧹 Nitpick comments (2)
modules/manage/partials/remote-read-replicas.adoc (2)
482-482: Clarify "unlimited" buckets statement.
The phrase "unlimited number of buckets" could be misleading. While it's true that the bucket name isn't part of the upstream key for `path` style, readers might overlook that you're still limited to 10 distinct region/endpoint combinations total. Consider rephrasing for clarity.
📝 Suggested rewording
-- `path`: Each unique combination of region and endpoint counts as one upstream (the bucket name is not part of the key). You can create cross-region Remote Read Replica topics for an unlimited number of buckets, as long as those buckets are spread across no more than 10 distinct region/endpoint combinations.
+- `path`: Each unique combination of region and endpoint counts as one upstream (the bucket name is not part of the key). You can create cross-region Remote Read Replica topics for multiple buckets using the same region/endpoint combination, with a maximum of 10 distinct region/endpoint combinations per cluster.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/partials/remote-read-replicas.adoc` at line 482, Reword the sentence about `path` style to remove the misleading "unlimited number of buckets" phrasing and clearly state the real constraint: while bucket names do not count toward the upstream key for `path`, you are limited to at most 10 distinct region/endpoint combinations total; update the sentence that begins with "`path`: Each unique combination of region and endpoint counts as one upstream..." to make this explicit and unambiguous.
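For illustration only, the `path`-style rule quoted in this nitpick could be sketched as below. This is a minimal sketch, not Redpanda's implementation: the function names, the dictionary layout, and the sample bucket names are hypothetical. It assumes only what the quoted text states, namely that the key is the region/endpoint pair, that the bucket name is not part of it, and that the limit is 10.

```python
# Hypothetical sketch: counting distinct region/endpoint combinations
# for `path` URL style. Not taken from the Redpanda code base.

MAX_COMBINATIONS = 10  # limit quoted in the docs under review


def path_style_keys(buckets):
    """Return the distinct (region, endpoint) pairs; bucket names are ignored."""
    return {(b["region"], b["endpoint"]) for b in buckets}


def within_limit(buckets, limit=MAX_COMBINATIONS):
    """True if the buckets span no more than `limit` region/endpoint pairs."""
    return len(path_style_keys(buckets)) <= limit


# Sample data (made-up bucket names): two buckets share one region/endpoint.
buckets = [
    {"bucket": "orders-a", "region": "us-east-1", "endpoint": "s3.us-east-1.amazonaws.com"},
    {"bucket": "orders-b", "region": "us-east-1", "endpoint": "s3.us-east-1.amazonaws.com"},
    {"bucket": "audit", "region": "eu-west-1", "endpoint": "s3.eu-west-1.amazonaws.com"},
]

print(len(path_style_keys(buckets)))  # three buckets, two combinations
print(within_limit(buckets))
```

The sketch deliberately leaves out `virtual_host` style, because this thread does not state how its key is formed.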
473-473: Clarify the note to cover both URL styles.
The note only explains `virtual_host` behavior, but line 447 indicates that `cloud_storage_url_style` can be set to either `virtual_host` or `path`. Since both styles are supported, the note should clarify behavior for both or be more generic to avoid confusion.
📋 Suggested clarification
-NOTE: The `endpoint` value must not include the bucket name. Redpanda automatically prepends the bucket name when using `virtual_host` URL style.
+NOTE: The `endpoint` value must not include the bucket name. When using `virtual_host` URL style, Redpanda automatically prepends the bucket name to the endpoint. When using `path` URL style, Redpanda appends the bucket name as a path segment.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/partials/remote-read-replicas.adoc` at line 473, Update the NOTE text about `endpoint` to cover both `cloud_storage_url_style` options: explain that when `cloud_storage_url_style` is set to `virtual_host` Redpanda will prepend the bucket name to the `endpoint`, whereas when set to `path` the `endpoint` must include the bucket name (or state that the bucket is part of the path), and make the wording generic so it references the `endpoint` and the `cloud_storage_url_style` values (`virtual_host` and `path`) to avoid ambiguity; modify the sentence around the existing NOTE to explicitly mention both styles and their expected `endpoint` formats.
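As a rough illustration of the two URL styles described in the suggested NOTE, the sketch below builds an object URL from an `endpoint` that does not contain the bucket name. It follows the general S3 addressing conventions (virtual-hosted style versus path style) and is not Redpanda's internal logic. The function name `object_url`, the `https` scheme, and the sample bucket and key are assumptions made for the example.

```python
# Hypothetical sketch of S3-style URL construction for the two values of
# cloud_storage_url_style. Names and sample values are illustrative only.


def object_url(endpoint, bucket, key, url_style):
    """Build an object URL from an endpoint that does NOT include the bucket."""
    if url_style == "virtual_host":
        # Bucket name is prepended to the endpoint host.
        return f"https://{bucket}.{endpoint}/{key}"
    if url_style == "path":
        # Bucket name is appended as the first path segment.
        return f"https://{endpoint}/{bucket}/{key}"
    raise ValueError(f"unsupported url_style: {url_style!r}")


endpoint = "s3.us-east-1.amazonaws.com"
print(object_url(endpoint, "my-bucket", "some/object", "virtual_host"))
print(object_url(endpoint, "my-bucket", "some/object", "path"))
```

If the `endpoint` already contained the bucket name, the `virtual_host` branch would repeat it in the host, which is consistent with the requirement that the `endpoint` value must not include the bucket name.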
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modules/manage/partials/remote-read-replicas.adoc`:
- Around line 39-40: Replace the awkward phrase "in the same region or a
different region as" in both AWS and GCP sentences with a grammatically correct
construction such as "in the same region as or a different region than" (or
"...or a different region from"); update the two lines that currently read "**
AWS: The remote cluster can be in the same region or a different region as the
origin cluster's S3 bucket. For cross-region Remote Read Replica topics, see
<<create-cross-region-rrr-topic>>." and "** GCP: The remote cluster can be in
the same region or a different region as the bucket/container." to use "in the
same region as or a different region than" (or "...from") for clarity.
- Line 59: Edit the sentence starting "Create a remote cluster for the Remote
Read Replica topic..." to fix the grammatical construction by changing "in the
same or a different region as the bucket/container" and "in the same or a
different region" to a consistent form such as "in the same region as or a
different region from the bucket/container" (and similarly for AWS: "in the same
region as or a different region from, but cross-region Remote Read Replica
topics require additional configuration"). Update both the GCP and AWS clauses
to match the phrasing used on lines 39–40 for consistency.
📒 Files selected for processing (1)
modules/manage/partials/remote-read-replicas.adoc
@micheleRP @mattschumpert for cloud we should probably hard code |
nvartolomei left a comment
Looks good. Minor comments.
Co-authored-by: Nicolae Vartolomei <nv@redpanda.com>
==== Prerequisites

The xref:reference:properties/object-storage-properties.adoc#cloud_storage_url_style[`cloud_storage_url_style`] cluster property must be set explicitly to `virtual_host` or `path` on the remote cluster. The default value does not support cross-region Remote Read Replicas.
@nvartolomei Quite surprised to see this. If both styles work, I wonder why we force the user to set this explicitly? Very importantly, I don't think this cluster property is settable in Redpanda Cloud (which is where we mostly care about this feature), because the format of the underlying S3 URLs is not meant to be fiddled with in Cloud. Please follow up with the Cloud team to make sure Redpanda Cloud sets this to a style with which this WILL work. In that case, the Cloud docs could omit this step. Otherwise, we have an issue for Cloud.
@mattschumpert
> ... why we force the user to set this explicitly? ...
To cut weeks of effort on core side.
@nvartolomei, @mattschumpert Please confirm if we're going to make cloud_storage_url_style configurable in Redpanda Cloud, or if we can conditionalize this Prereqs section out of Cloud docs
==== Limits

Each unique combination of region and endpoint creates a dynamic object storage upstream on the remote cluster. A cluster supports a maximum of 10 active dynamic upstreams.
@micheleRP needs rewording
Not sure what an 'upstream' means. @nvartolomei does this mean 'creates unique configuration data on the remote read replica cluster'? (It is the RRR cluster, right?)
Yeah. :/ Upstreams might be a confusing term for users. They probably shouldn't know about it. @micheleRP can we rephrase this to avoid introducing new technical terms/jargon?
Or, maybe introduce what upstream is.
sure, replaced "upstream" with "connection": please review!
Unfortunately it is not connection. Under the hood we create many more connections actually. ChatGPT suggested target :/
sounds good, switched to "target"
Description
This pull request updates the documentation for Remote Read Replica topics to clarify support for cross-region deployments on AWS and provide instructions for their setup.
- Create a cross-region Remote Read Replica topic on AWS, with step-by-step instructions for creating RRR topics when the remote cluster is in a different AWS region than the origin cluster's S3 bucket. This includes details on required cluster properties, topic creation commands, and placeholder replacements.
- `cloud_storage_url_style` property, including differences between `virtual_host` and `path` styles.

Resolves https://redpandadata.atlassian.net/browse/DOC-1600
Review deadline: March 9
Page previews
Remote Read Replicas