diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc index 5eb34695dc..7312f19fce 100644 --- a/modules/ROOT/nav.adoc +++ b/modules/ROOT/nav.adoc @@ -163,6 +163,7 @@ *** xref:manage:audit-logging.adoc[Audit Logging] **** xref:manage:audit-logging/audit-log-samples.adoc[Sample Audit Log Messages] *** xref:manage:cluster-maintenance/disk-utilization.adoc[] +*** xref:manage:cluster-maintenance/about-throughput-quotas.adoc[] *** xref:manage:cluster-maintenance/manage-throughput.adoc[Manage Throughput] *** xref:manage:cluster-maintenance/compaction-settings.adoc[Compaction Settings] *** xref:manage:cluster-maintenance/configure-client-connections.adoc[] diff --git a/modules/get-started/pages/release-notes/redpanda.adoc b/modules/get-started/pages/release-notes/redpanda.adoc index 34765f5ded..d2f378a46d 100644 --- a/modules/get-started/pages/release-notes/redpanda.adoc +++ b/modules/get-started/pages/release-notes/redpanda.adoc @@ -7,171 +7,8 @@ This topic includes new content added in version {page-component-version}. For a * xref:redpanda-cloud:get-started:whats-new-cloud.adoc[] * xref:redpanda-cloud:get-started:cloud-overview.adoc#redpanda-cloud-vs-self-managed-feature-compatibility[Redpanda Cloud vs Self-Managed feature compatibility] -NOTE: Redpanda v25.3 introduces breaking schema changes for Iceberg topics. If you are using Iceberg topics and want to retain the data in the corresponding Iceberg tables, review xref:upgrade:iceberg-schema-changes-and-migration-guide.adoc[] before upgrading your cluster, and follow the required migration steps to avoid sending new records to a dead-letter queue table. +== User-based throughput quotas -== Iceberg topics with GCP BigLake +Redpanda now supports throughput quotas based on authenticated user principals. Unlike client-based quotas (which rely on self-declared `client-id` values), user-based quotas enforce limits using verified identities from SASL, mTLS, or OIDC authentication. 
-A new xref:manage:iceberg/iceberg-topics-gcp-biglake.adoc[REST catalog integration] with Google Cloud BigLake allows you to add Redpanda topics as Iceberg tables in your data lakehouse. - -See xref:manage:iceberg/use-iceberg-catalogs.adoc[] for details on configuring Iceberg REST catalog integrations with Redpanda. - -== Shadowing - -Redpanda v25.3 introduces xref:deploy:redpanda/manual/disaster-recovery/shadowing/index.adoc[], an enterprise-licensed disaster recovery solution that provides asynchronous, offset-preserving replication between distinct Redpanda clusters. Shadowing enables cross-region data protection by replicating topic data, configurations, consumer group offsets, ACLs, and Schema Registry data with byte-level fidelity. - -The shadow cluster operates in read-only mode while continuously receiving updates from the source cluster. During a disaster, you can failover individual topics or an entire shadow link to make resources fully writable for production traffic. See xref:deploy:redpanda/manual/disaster-recovery/shadowing/failover-runbook.adoc[] for emergency procedures. - -Shadowing includes comprehensive metrics for monitoring replication health. See xref:manage:disaster-recovery/shadowing/monitor.adoc[] and xref:reference:public-metrics-reference.adoc#shadow-link-metrics[Shadow Link metrics reference]. - -== Connected client monitoring - -You can view details about Kafka client connections using `rpk` or the Admin API ListKafkaConnections endpoint. This allows you to view detailed information about active client connections on a cluster, and identify and troubleshoot problematic clients. For more information, see the xref:manage:cluster-maintenance/manage-throughput.adoc#view-connected-client-details[connected client details] example in the Manage Throughput guide. - -== New Admin API style - -Redpanda v25.3 introduces a new API style for the Admin API, powered by https://connectrpc.com/docs/introduction[ConnectRPC]. 
New Redpanda features and operations in v25.3 are available as ConnectRPC services, allowing you to use autogenerated Protobuf clients in addition to using HTTP clients such as `curl`. - -Use the new ConnectRPC endpoints with the following v25.3 features: - -* Shadowing -* Connected client monitoring - -Existing Admin API endpoints from versions earlier than 25.3 remain supported, and you can continue to use them as usual. See xref:manage:use-admin-api.adoc[Manage Redpanda with the Admin API] to learn more about Admin API, and the link:/api/doc/admin/v2/[Admin API reference] to view the new endpoints. - -== Schema Registry import mode - -Redpanda Schema Registry now supports an import mode that allows you to import existing schemas and retain their current IDs and version numbers. Import mode is useful when migrating from another schema registry. - -Starting with this release, import mode must be used when importing schemas. Read-write mode no longer allows specifying a schema ID and version when registering a schema. -See xref:manage:schema-reg/schema-reg-api.adoc#set-schema-registry-mode[Use the Schema Registry API]. - -== Security report - -You can now generate a security report for your Redpanda cluster using the link:/api/doc/admin/operation/operation-get_security_report[`/v1/security/report`] Admin API endpoint. The report provides detailed information about TLS configuration, authentication methods, authorization status, and security alerts across all Redpanda interfaces (Kafka, RPC, Admin, Schema Registry, HTTP Proxy). - -== Topic identifiers - -Redpanda v25.3 implements topic identifiers using 16 byte UUIDs as proposed in https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers[KIP-516^]. 
- -== Shadowing metrics - -Redpanda v25.3 introduces comprehensive xref:reference:public-metrics-reference.adoc#shadow-link-metrics[Shadowing metrics] for monitoring disaster recovery replication: - -* xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_client_errors[`redpanda_shadow_link_client_errors`] - Track Kafka client errors during shadow link operations -* xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_shadow_lag[`redpanda_shadow_link_shadow_lag`] - Monitor replication lag between source and shadow partitions -* xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_shadow_topic_state[`redpanda_shadow_link_shadow_topic_state`] - Track shadow topic state distribution across links -* xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_bytes_fetched[`redpanda_shadow_link_total_bytes_fetched`] - Monitor data transfer volume from source cluster -* xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_bytes_written[`redpanda_shadow_link_total_bytes_written`] - Track data written to shadow cluster -* xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_records_fetched[`redpanda_shadow_link_total_records_fetched`] - Monitor total records fetched from source cluster -* xref:reference:public-metrics-reference.adoc#redpanda_shadow_link_total_records_written[`redpanda_shadow_link_total_records_written`] - Track total messages written to shadow cluster - -For monitoring guidance and alert recommendations, see xref:manage:disaster-recovery/shadowing/monitor.adoc[]. 
- -== New commands - -Redpanda v25.3 introduces the following xref:reference:rpk/rpk-shadow/rpk-shadow.adoc[`rpk shadow`] commands for managing Redpanda shadow links: - -* xref:reference:rpk/rpk-shadow/rpk-shadow-config-generate.adoc[`rpk shadow config generate`] - Generate configuration files for shadow links -* xref:reference:rpk/rpk-shadow/rpk-shadow-create.adoc[`rpk shadow create`] - Create new shadow links -* xref:reference:rpk/rpk-shadow/rpk-shadow-update.adoc[`rpk shadow update`] - Update existing shadow link configurations -* xref:reference:rpk/rpk-shadow/rpk-shadow-list.adoc[`rpk shadow list`] - List all shadow links -* xref:reference:rpk/rpk-shadow/rpk-shadow-describe.adoc[`rpk shadow describe`] - View shadow link configuration details -* xref:reference:rpk/rpk-shadow/rpk-shadow-status.adoc[`rpk shadow status`] - Monitor shadow link replication status -* xref:reference:rpk/rpk-shadow/rpk-shadow-failover.adoc[`rpk shadow failover`] - Perform emergency failover operations -* xref:reference:rpk/rpk-shadow/rpk-shadow-delete.adoc[`rpk shadow delete`] - Delete shadow links - -In addition, the following commands have been added: - -* xref:reference:rpk/rpk-cluster/rpk-cluster-connections.adoc[`rpk cluster connections`] - Monitor cluster connections and client statistics. -* xref:reference:rpk/rpk-redpanda/rpk-redpanda-config-print.adoc[`rpk redpanda config print`] - Display node configuration. 
- -== New configuration properties - -Redpanda 25.3 introduces the following configuration properties: - -**Shadowing:** - -* xref:reference:properties/cluster-properties.adoc#enable_shadow_linking[`enable_shadow_linking`]: Enable shadow links (Enterprise license required) - -**Timestamp validation:** - -* xref:reference:properties/cluster-properties.adoc#log_message_timestamp_after_max_ms[`log_message_timestamp_after_max_ms`]: Maximum timestamp difference for future records -* xref:reference:properties/cluster-properties.adoc#log_message_timestamp_before_max_ms[`log_message_timestamp_before_max_ms`]: Maximum timestamp difference for past records -* xref:reference:properties/topic-properties.adoc#messagetimestampaftermaxms[`message.timestamp.after.max.ms`]: Topic-level timestamp validation (future) -* xref:reference:properties/topic-properties.adoc#messagetimestampbeforemaxms[`message.timestamp.before.max.ms`]: Topic-level timestamp validation (past) - -**Audit logging:** - -* xref:reference:properties/cluster-properties.adoc#audit_use_rpc[`audit_use_rpc`]: Use internal RPCs for audit logging - -**Object storage:** - -* xref:reference:properties/object-storage-properties.adoc#cloud_storage_client_lease_timeout_ms[`cloud_storage_client_lease_timeout_ms`]: Object storage connection timeout - -**Iceberg:** - -* xref:reference:properties/cluster-properties.adoc#iceberg_default_catalog_namespace[`iceberg_default_catalog_namespace`]: Default Iceberg catalog namespace for tables -* xref:reference:properties/cluster-properties.adoc#iceberg_dlq_table_suffix[`iceberg_dlq_table_suffix`]: Iceberg DLQ table name suffix -* xref:reference:properties/cluster-properties.adoc#iceberg_rest_catalog_gcp_user_project[`iceberg_rest_catalog_gcp_user_project`]: GCP project for Iceberg REST catalog billing -* xref:reference:properties/cluster-properties.adoc#iceberg_topic_name_dot_replacement[`iceberg_topic_name_dot_replacement`]: Dot replacement in Iceberg table names - -**TLS:** - -* 
xref:reference:properties/cluster-properties.adoc#tls_v1_2_cipher_suites[`tls_v1_2_cipher_suites`]: TLS 1.2 cipher suites for client connections -* xref:reference:properties/cluster-properties.adoc#tls_v1_3_cipher_suites[`tls_v1_3_cipher_suites`]: TLS 1.3 cipher suites for client connections - -**Tiered Storage:** - -* xref:reference:properties/cluster-properties.adoc#cloud_topics_epoch_service_epoch_increment_interval[`cloud_topics_epoch_service_epoch_increment_interval`]: Cluster epoch increment interval -* xref:reference:properties/cluster-properties.adoc#cloud_topics_epoch_service_local_epoch_cache_duration[`cloud_topics_epoch_service_local_epoch_cache_duration`]: Local epoch cache duration -* xref:reference:properties/cluster-properties.adoc#cloud_topics_short_term_gc_backoff_interval[`cloud_topics_short_term_gc_backoff_interval`]: Short-term garbage collection backoff interval -* xref:reference:properties/cluster-properties.adoc#cloud_topics_short_term_gc_interval[`cloud_topics_short_term_gc_interval`]: Short-term garbage collection interval -* xref:reference:properties/cluster-properties.adoc#cloud_topics_short_term_gc_minimum_object_age[`cloud_topics_short_term_gc_minimum_object_age`]: Minimum object age for garbage collection - -**Other configuration:** - -* xref:reference:properties/cluster-properties.adoc#controller_backend_reconciliation_concurrency[`controller_backend_reconciliation_concurrency`]: Maximum concurrent controller reconciliation operations -* xref:reference:properties/cluster-properties.adoc#fetch_max_read_concurrency[`fetch_max_read_concurrency`]: Maximum concurrent partition reads per fetch request -* xref:reference:properties/cluster-properties.adoc#kafka_max_message_size_upper_limit_bytes[`kafka_max_message_size_upper_limit_bytes`]: Maximum allowed `max.message.size` topic property value -* xref:reference:properties/cluster-properties.adoc#kafka_produce_batch_validation[`kafka_produce_batch_validation`]: Validation level for produced 
batches -* xref:reference:properties/cluster-properties.adoc#log_compaction_tx_batch_removal_enabled[`log_compaction_tx_batch_removal_enabled`]: Enable transactional batch removal during compaction -* xref:reference:properties/cluster-properties.adoc#sasl_mechanisms_overrides[`sasl_mechanisms_overrides`]: SASL authentication mechanisms per listener - -=== Changes to default values - -The following configuration properties have new default values in v25.3: - -* xref:reference:properties/cluster-properties.adoc#core_balancing_continuous[`core_balancing_continuous`]: Changed from `false` to `true` (Enterprise license required). -* xref:reference:properties/cluster-properties.adoc#partition_autobalancing_mode[`partition_autobalancing_mode`]: Changed from `node_add` to `continuous` (Enterprise license required). -* xref:reference:properties/cluster-properties.adoc#iceberg_throttle_backlog_size_ratio[`iceberg_throttle_backlog_size_ratio`]: Changed from `0.3` to `null`. - -[[behavior-changes]] -=== Behavior changes - -The following topic properties now support enhanced tristate behavior: - -* xref:reference:properties/topic-properties.adoc#segment-ms[`segment.ms`] -* xref:reference:properties/topic-properties.adoc#retention-bytes[`retention.bytes`] -* xref:reference:properties/topic-properties.adoc#retention-ms[`retention.ms`] -* xref:reference:properties/topic-properties.adoc#retention-local-target-bytes[`retention.local.target.bytes`] -* xref:reference:properties/topic-properties.adoc#retention-local-target-ms[`retention.local.target.ms`] -* xref:reference:properties/topic-properties.adoc#initial-retention-local-target-bytes[`initial.retention.local.target.bytes`] -* xref:reference:properties/topic-properties.adoc#initial-retention-local-target-ms[`initial.retention.local.target.ms`] -* xref:reference:properties/topic-properties.adoc#delete-retention-ms[`delete.retention.ms`] -* 
xref:reference:properties/topic-properties.adoc#min-cleanable-dirty-ratio[`min.cleanable.dirty.ratio`] - -Previously, these properties treated zero and negative values the same way. Now they support three distinct states: positive values set specific limits, zero provides immediate eligibility for cleanup/compaction, and negative values disable the feature entirely. Review your topic configurations if you currently use zero values for these properties. - -=== Deprecations - -The following configuration properties have been deprecated in v25.3 and will be removed in a future release: - -* `kafka_memory_batch_size_estimate_for_fetch`: No replacement. Remove from configuration. -* `log_compaction_disable_tx_batch_removal`: Use xref:reference:properties/cluster-properties.adoc#log_compaction_tx_batch_removal_enabled[`log_compaction_tx_batch_removal_enabled`] instead. Note the inverted logic: the new property enables the behavior when set to `true`. -* `log_message_timestamp_alert_after_ms`: Use xref:reference:properties/cluster-properties.adoc#log_message_timestamp_after_max_ms[`log_message_timestamp_after_max_ms`] instead. -* `log_message_timestamp_alert_before_ms`: Use xref:reference:properties/cluster-properties.adoc#log_message_timestamp_before_max_ms[`log_message_timestamp_before_max_ms`] instead. -* `raft_recovery_default_read_size`: No replacement. Remove from configuration. - -== Deprecated features - -Redpanda has deprecated support for specific TLSv1.2 and TLSv1.3 cipher suites and now uses more secure defaults. See xref:upgrade:deprecated/index.adoc[Deprecated Features] for the complete list. +You can set quotas for individual users, default users, or fine-grained user/client combinations. See xref:manage:cluster-maintenance/about-throughput-quotas.adoc[] for conceptual details, and xref:manage:cluster-maintenance/manage-throughput.adoc#set-user-based-quotas[Set user-based quotas] to get started. 
diff --git a/modules/manage/pages/cluster-maintenance/about-throughput-quotas.adoc b/modules/manage/pages/cluster-maintenance/about-throughput-quotas.adoc new file mode 100644 index 0000000000..ebb13c63c7 --- /dev/null +++ b/modules/manage/pages/cluster-maintenance/about-throughput-quotas.adoc @@ -0,0 +1,281 @@ += About Client Throughput Quotas +:description: Understand how Redpanda's user-based and client ID-based throughput quotas work, including entity hierarchy, precedence rules, and quota tracking behavior. +:page-topic-type: concepts +:page-aliases: +:personas: platform_admin, developer +:learning-objective-1: Describe the difference between user-based and client ID-based quotas +:learning-objective-2: Determine which quota type to use for your use case +:learning-objective-3: Explain quota precedence rules and how Redpanda tracks quota usage + +// tag::single-source[] +ifdef::env-cloud[] +:authentication-doc: security:cloud-authentication.adoc +endif::[] +ifndef::env-cloud[] +:authentication-doc: manage:security/authentication.adoc +endif::[] + +Redpanda uses throughput quotas to limit the rate of produce and consume requests from clients. Understanding how quotas work helps you prevent individual clients from disproportionately consuming resources and causing performance degradation for other clients (also known as the "noisy-neighbor" problem), and ensure fair resource sharing across users and applications. + +After reading this page, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + +To configure and manage throughput quotas, see xref:manage:cluster-maintenance/manage-throughput.adoc[]. + +== Throughput control overview + +Redpanda provides two ways to control throughput: + +* Broker-wide limits: Configured using cluster properties. For details, see xref:manage:cluster-maintenance/manage-throughput.adoc#broker-wide-throughput-limits[Broker-wide throughput limits]. 
+* Client throughput quotas: Configured using the Kafka API. Client quotas enable per-user and per-client rate limiting with fine-grained control through entity hierarchy and precedence rules. This page focuses on client quotas. + +== Supported quota types + +Redpanda supports three Kafka API-based quota types: + +|=== +| Quota type | Description + +| `producer_byte_rate` +| Limit throughput of produce requests (bytes per second) + +| `consumer_byte_rate` +| Limit throughput of fetch requests (bytes per second) + +| `controller_mutation_rate` +| Limit rate of topic mutation requests (partitions created or deleted per second) +|=== + +All quota types can be applied to groups of client connections based on user principals, client IDs, or combinations of both. + +== Quota entities + +Redpanda uses two pieces of identifying information from each client connection to determine which quota applies: + +* Client ID: An ID that clients self-declare. Quotas can target an exact client ID (`client-id`) or a prefix (`client-id-prefix`). Multiple client connections that share a client ID or ID prefix are grouped into a single quota entity. +* User glossterm:principal[]: An authenticated identity verified through SASL, mTLS, or OIDC. Connections that share the same user are considered one entity. + +You can configure quotas that target either entity type, or combine both for fine-grained control. + +=== Client ID-based quotas + +Client ID-based quotas apply to clients identified by their `client-id` field, which is set by the client application. The client ID is typically a configurable property when you create a client with Kafka libraries. When using client ID-based quotas, multiple clients using the same client ID share the same quota tracking. + +Client ID-based quotas rely on clients honestly reporting their identity and correctly setting the `client-id` property. This makes client ID-based quotas unsuitable for guaranteeing isolation between tenants. 
Use client ID-based quotas when:

* Authentication is not enabled.
* Grouping by application or service name is sufficient.
* You operate a single-tenant environment where all clients are trusted.
* You need simple rate limiting without user-level isolation.

=== User-based quotas

IMPORTANT: User-based quotas require xref:{authentication-doc}[authentication] to be enabled on your cluster.

User-based quotas apply to authenticated user principals. Each user has a separate quota, providing a way to limit the impact of individual users on the cluster.

User-based quotas rely on Redpanda's authentication system to verify user identity. The user principal is extracted from SASL credentials, mTLS certificates, or OIDC tokens and cannot be forged by clients.

Use user-based quotas when:

* You operate a multi-tenant environment, such as SaaS platforms or enterprises with departments.
* You require isolation between users or tenants, to avoid noisy-neighbor issues.
* You need per-user billing or metering.

=== Combined user and client quotas

You can combine user and client identities for fine-grained control over specific (user, client) combinations.

Use combined quotas when:

* You need fine-grained control, for example, limits for user `alice` when using a specific application.
* Different rate limits apply to different apps used by the same user. For example, `alice`+'s+ `payment-processor` gets 10 MB/s, but `alice`+'s+ `analytics-consumer` gets 50 MB/s. For worked examples, see the precedence examples later on this page.

== Quota precedence and tracking

When a request arrives, Redpanda resolves which quota to apply by matching the request's authenticated user principal and client ID against configured quotas. Redpanda applies the most specific match, using the precedence order in the following table (highest priority first).

The precedence level that matches also determines how quota usage is tracked.
Redpanda tracks quota usage using a tracker key that determines which connections share the same quota bucket. How connections are grouped into buckets depends on the type of entity the quota targets.

To get independent quota tracking per user and client ID combination, configure quotas that include both dimensions, such as `/config/users/<user>/clients/<client-id>` or `/config/users/<default>/clients/<default>`.

.Quota precedence, tracking, and isolation by configuration level
[cols="1,2,3,2,3", options="header"]
|===
| Level | Match type | Config path | Tracker key | Isolation behavior

| 1
| Exact user + exact client
| `/config/users/<user>/clients/<client-id>`
| `(user, client-id)`
| Each unique (user, client-id) pair tracked independently

| 2
| Exact user + client prefix
| `/config/users/<user>/client-id-prefix/<prefix>`
| `(user, client-id-prefix)`
| Clients matching the prefix share tracking within that user

| 3
| Exact user + default client
| `/config/users/<user>/clients/<default>`
| `(user, client-id)`
| Each unique (user, client-id) pair tracked independently

| 4
| Exact user only
| `/config/users/<user>`
| `user`
| All clients for that user share a single tracking bucket

| 5
| Default user + exact client
| `/config/users/<default>/clients/<client-id>`
| `(user, client-id)`
| Each unique (user, client-id) pair tracked independently

| 6
| Default user + client prefix
| `/config/users/<default>/client-id-prefix/<prefix>`
| `(user, client-id-prefix)`
| Clients matching the prefix share tracking within each user

| 7
| Default user + default client
| `/config/users/<default>/clients/<default>`
| `(user, client-id)`
| Each unique (user, client-id) pair tracked independently

| 8
| Default user only
| `/config/users/<default>`
| `user`
| All clients for each user share a single tracking bucket (per user)

| 9
| Exact client only
| `/config/clients/<client-id>`
| `client-id`
| All users with that client ID share a single tracking bucket

| 10
| Client prefix only
| `/config/client-id-prefix/<prefix>`
| `client-id-prefix`
| All clients matching the prefix
share a single bucket across all users

| 11
| Default client only
| `/config/clients/<default>`
| `client-id`
| Each unique client ID tracked independently

| 12
| No quota configured
| N/A
| N/A
| No tracking / unlimited throughput
|===

IMPORTANT: The `<default>` entity matches any user or client that doesn't have a more specific quota configured. This is different from an empty or unauthenticated user (`user=""`) or an undeclared client ID (`client-id=""`), which are treated as specific entities.

=== Unauthenticated connections

Unauthenticated connections have an empty user principal (`user=""`) and are not treated as `user=<default>`.

Unauthenticated connections:

* Fall back to client-only quotas.
* Have unlimited throughput only if no client-only quota matches.

=== Example: Precedence resolution

Given these configured quotas:

[,bash]
----
rpk cluster quotas alter --add consumer_byte_rate=5000000 --name user=alice --name client-id=app-1
rpk cluster quotas alter --add consumer_byte_rate=10000000 --name user=alice
rpk cluster quotas alter --add consumer_byte_rate=20000000 --name client-id=app-1
----

[options="header"]
|===
| User + Client ID | Precedence match

| `user=alice`, `client-id=app-1`
| Level 1: Exact user + exact client

| `user=alice`, `client-id=app-2`
| Level 4: Exact user only

| `user=bob`, `client-id=app-1`
| Level 9: Exact client only

| `user=bob`, `client-id=app-2`
| Level 12: No quota configured
|===

When no quota matches (level 12), the connection is not throttled.

=== Example: User-only quota

If you configure a 10 MB/s produce quota for user `alice`:

[,bash]
----
rpk cluster quotas alter --add producer_byte_rate=10000000 --name user=alice
----

Then `alice` connecting with client ID `app-1` and `alice` connecting with client ID `app-2` share the same 10 MB/s produce limit.
+ +To give each of `alice`+'s+ clients an independent 10 MB/s limit, configure: + +[,bash] +---- +rpk cluster quotas alter --add producer_byte_rate=10000000 --name user=alice --default client-id +---- + +=== Example: User default quota + +If you configure a default 10 MB/s produce quota for all users: + +[,bash] +---- +rpk cluster quotas alter --add producer_byte_rate=10000000 --default user +---- + +This quota applies to all users who don't have a more specific quota configured. Each user is tracked independently: `alice` gets her own 10 MB/s bucket, `bob` gets his own 10 MB/s bucket, and so on. + +Within each user, all client ID values share that user's bucket. `alice` connecting with client ID `app-1` and `alice` connecting with client ID `app-2` share the same 10 MB/s produce limit, while `bob`+'s+ connections have a separate 10 MB/s limit. + +[[throttling-enforcement]] +== Throughput throttling enforcement + +NOTE: As of v24.2, Redpanda enforces all throughput limits per broker, including client throughput. + +Redpanda enforces throughput limits by applying backpressure to clients. When a connection exceeds its throughput limit, Redpanda throttles the connection to bring the rate back within the allowed level: + +. Redpanda adds a `throttle_time_ms` field to responses, indicating how long the client should wait. +. If the client doesn't honor the throttle time, Redpanda inserts delays on the connection's next read operation. + +ifndef::env-cloud[] +The throttling delay may not exceed the limit set by the `max_kafka_throttle_delay_ms` tunable property. +endif::[] + +ifdef::env-cloud[] +In Redpanda Cloud, the throttling delay is set to 30 seconds. +endif::[] + +== Default behavior + +Quotas are opt-in restrictions and not enforced by default. When no quotas are configured, clients have unlimited throughput. 
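To confirm this default on your own cluster, you can list the configured quotas with `rpk` (a sketch that assumes a reachable cluster; on a fresh cluster it returns no entries):

```shell
# List all configured client quotas. An empty result confirms that
# no quotas apply and client throughput is unlimited by default.
rpk cluster quotas describe
```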
+ +== Next steps + +* xref:manage:cluster-maintenance/manage-throughput.adoc[Configure throughput quotas] +* xref:{authentication-doc}[Enable authentication for user-based quotas] diff --git a/modules/manage/pages/cluster-maintenance/manage-throughput.adoc b/modules/manage/pages/cluster-maintenance/manage-throughput.adoc index dab8e18374..ce1e6da001 100644 --- a/modules/manage/pages/cluster-maintenance/manage-throughput.adoc +++ b/modules/manage/pages/cluster-maintenance/manage-throughput.adoc @@ -1,32 +1,47 @@ = Manage Throughput -:description: Learn how to manage the throughput of Kafka traffic. +:description: Configure broker-wide and client-specific throughput quotas to prevent resource exhaustion and noisy-neighbor issues. :page-categories: Management, Networking +:page-topic-type: how-to +:personas: platform_admin, developer +:learning-objective-1: Set user-based throughput quotas +:learning-objective-2: Set client ID-based quotas +:learning-objective-3: Monitor quota usage and throttling behavior // tag::single-source[] ifdef::env-cloud[] :monitor-doc: manage:monitor-cloud.adoc#throughput :connected-clients-api-doc-ref: link:/api/doc/cloud-dataplane/operation/operation-monitoringservice_listkafkaconnections +:authentication-doc: security:cloud-authentication.adoc endif::[] ifndef::env-cloud[] :monitor-doc: manage:monitoring.adoc#throughput :connected-clients-api-doc-ref: link:/api/doc/admin/v2/operation/operation-redpanda-core-admin-v2-clusterservice-listkafkaconnections +:authentication-doc: manage:security/authentication.adoc endif::[] -Redpanda supports throughput throttling on both ingress and egress independently, and allows configuration at the broker and client levels. This helps prevent clients from causing unbounded network and disk usage on brokers. You can configure limits at two levels: +Redpanda throttles throughput on ingress and egress independently, and you can configure limits at the broker and client levels. 
This prevents clients from causing unbounded network and disk usage on brokers. -* *Broker limits*: These apply to all clients connected to the broker and restrict total traffic on the broker. See <>. +You can configure limits at two levels: + +* Broker limits: These apply to all clients connected to the broker and restrict total traffic on the broker. See <>. ifndef::env-cloud[] -* *Client limits*: These apply to a set of clients defined by their `client_id` and help prevent a set of clients from starving other clients using the same broker. You can manage client quotas with xref:reference:rpk/rpk-cluster/rpk-cluster-quotas.adoc[`rpk cluster quotas`], with {ui}, or with the Kafka API. When no quotas apply, the client has unlimited throughput. +* Client limits: These apply to authenticated users or clients defined by their client ID. You can manage client quotas with xref:reference:rpk/rpk-cluster/rpk-cluster-quotas.adoc[`rpk cluster quotas`], with {ui}, or with the Kafka API. When no quotas apply, the client has unlimited throughput. endif::[] ifdef::env-cloud[] -* *Client limits*: These apply to a set of clients defined by their `client_id` and help prevent a set of clients from starving other clients using the same broker. You can manage client quotas with xref:reference:rpk/rpk-cluster/rpk-cluster-quotas.adoc[`rpk cluster quotas`], with the {ui} UI, with the link:https://docs.redpanda.com/api/doc/cloud-dataplane/operation/operation-quotaservice_listquotas[Redpanda Cloud Data Plane API], or with the Kafka API. When no quotas apply, the client has unlimited throughput. +* Client limits: These apply to authenticated users or clients defined by their client ID. You can manage client quotas with xref:reference:rpk/rpk-cluster/rpk-cluster-quotas.adoc[`rpk cluster quotas`], with the {ui} UI, with the link:https://docs.redpanda.com/api/doc/cloud-dataplane/operation/operation-quotaservice_listquotas[Redpanda Cloud Data Plane API], or with the Kafka API. 
When no quotas apply, the client has unlimited throughput. NOTE: Throughput throttling is supported for BYOC and Dedicated clusters only. endif::[] +After reading this page, you will be able to: + +* [ ] {learning-objective-1} +* [ ] {learning-objective-2} +* [ ] {learning-objective-3} + == View connected client details -You may find it helpful to check the xref:{monitor-doc}[current produce and consume throughput] of a client before you configure throughput quotas. +Before configuring throughput quotas, check the xref:{monitor-doc}[current produce and consume throughput] of a client. ifndef::env-cloud[] Use the xref:reference:rpk/rpk-cluster/rpk-cluster-connections-list.adoc[`rpk cluster connections list`] command or the {connected-clients-api-doc-ref}[ListKafkaConnections] Admin API endpoint to view detailed information about active Kafka client connections. @@ -258,6 +273,8 @@ UID STATE USER CLIENT-ID b41584f3-2662-4185-a4b8-0d8510f5c780 OPEN UNAUTHENTICATED perf-producer-client 127.0.0.1:55002 0 0 8s 7.743592270s 0B 0B 1 b20601a3-624c-4a8c-ab88-717643f01d56 OPEN UNAUTHENTICATED perf-producer-client 127.0.0.1:55012 0 0 9s 0s 78.9MB 0B 292 ---- + +The `USER` field in the connection list shows the authenticated principal. Unauthenticated connections show `UNAUTHENTICATED`, which corresponds to an empty user principal (`user=""`) in quota configurations, not `user=`. -- ifndef::env-cloud[] @@ -417,24 +434,19 @@ curl \ } ---- ==== +The user principal field in the connection list shows the authenticated principal. Unauthenticated connections show `AUTHENTICATION_STATE_UNAUTHENTICATED`, which corresponds to an empty user principal (`user=""`) in quota configurations, not `user=`. -- endif::[] ====== +To view connections for a specific authenticated user: -== Throughput throttling enforcement - -NOTE: As of v24.2, Redpanda enforces all throughput limits per broker, including client throughput. - -Throughput limits are enforced by applying backpressure to clients. 
When a connection is in breach of the throughput limit, the throttler advises the client about the delay (throttle time) that would bring the rate back to the allowed level. Redpanda starts by adding a `throttle_time_ms` field to responses. If that isn't honored, delays are inserted on the connection's next read operation. +[,bash] +---- +rpk cluster connections list --user alice +---- -ifdef::env-cloud[] -In Redpanda Cloud, the throttling delay is set to 30 seconds. -endif::[] - -ifndef::env-cloud[] -The throttling delay may not exceed the limit set by xref:reference:tunable-properties.adoc#max_kafka_throttle_delay_ms[`max_kafka_throttle_delay_ms`]. -endif::[] +This shows all connections from user `alice`, useful for monitoring clients that are subject to user-based quotas. == Broker-wide throughput limits @@ -469,47 +481,96 @@ The properties for broker-wide throughput quota balancing are configured at the ==== By default, both `kafka_throughput_limit_node_in_bps` and `kafka_throughput_limit_node_out_bps` are disabled, and no throughput limits are applied. You must manually set them to enable throughput throttling. ==== + +To set broker-wide throughput limits, use xref:reference:rpk/rpk-cluster/rpk-cluster-config-set.adoc[`rpk cluster config set`] to configure the cluster properties: + +[,bash] +---- +# Set ingress limit to 100 MB/s per broker +rpk cluster config set kafka_throughput_limit_node_in_bps 100000000 + +# Set egress limit to 200 MB/s per broker +rpk cluster config set kafka_throughput_limit_node_out_bps 200000000 +---- endif::[] == Client throughput limits -Redpanda provides configurable throughput quotas that apply to an individual client or a group of clients. You can apply a quota for an individual client based on an exact match with its `client_id`, or a group of clients based on IDs that start with a given prefix. +Redpanda provides configurable throughput quotas for individual clients or authenticated users. 
Quotas are managed through the Kafka-compatible AlterClientQuotas and DescribeClientQuotas APIs, accessible with `rpk`, Redpanda Console, or Kafka client libraries. -As of v24.2, client throughput quotas are compatible with the https://cwiki.apache.org/confluence/display/KAFKA/KIP-546%3A+Add+Client+Quota+APIs+to+the+Admin+Client[AlterClientQuotas and DescribeClientQuotas^] Kafka APIs, and are separate from quotas configured through cluster configuration in earlier Redpanda versions. The client throughput quotas no longer apply on a per-shard basis, and now limit the rates across a Redpanda broker's node. The quotas are neither shared nor balanced between brokers. +Redpanda supports two types of client throughput quotas: -Redpanda supports the following Kafka API-based quota types on clients: +* Client ID-based quotas: Limit throughput based on the self-declared `client-id` field. +* User-based quotas: Limit throughput based on authenticated user glossterm:principal[]. Requires xref:{authentication-doc}[authentication]. -|=== -| Quota type | Description +You can also combine both types for fine-grained control (for example, limiting a specific user when using a specific client application). -| `producer_byte_rate` -| Limit throughput of produce requests +For conceptual information about quota types, entity hierarchy, precedence rules, and how Redpanda tracks and enforces quotas through throttling, see xref:manage:cluster-maintenance/about-throughput-quotas.adoc[]. -| `consumer_byte_rate` -| Limit throughput of fetch requests +=== Set user-based quotas -| `controller_mutation_rate` -| Limit rate of topic mutation requests, including create, add, and delete partition, in number of partitions per second +IMPORTANT: User-based quotas require authentication to be enabled. To set up authentication, see xref:{authentication-doc}[]. 
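+As a minimal sketch of this prerequisite (assuming SASL/SCRAM is enabled as described in the linked authentication guide; the username, password, and mechanism shown are illustrative), create the principal that a user-based quota will match:

```shell
# Illustrative: create a SASL/SCRAM user so that a user-based quota
# (for example, --name user=alice) has an authenticated principal to match.
rpk security user create alice --password 'changeme' --mechanism SCRAM-SHA-256
```

+Clients must then authenticate as `alice` for the quota to apply; unauthenticated connections match only the empty principal (`user=""`).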
-|=== +==== Quota for a specific user + +To limit throughput for a specific authenticated user across all clients: + +[,bash] +---- +rpk cluster quotas alter --add producer_byte_rate=2000000 --name user=alice +---- -You can also apply a default quota for all other client requests that don't have a specific quota based on an exact match or `client_id` prefix. +This limits user `alice` to 2 MB/s for produce requests regardless of the client ID used. -It is possible to create conflicting quotas if you configure the same quotas through both the Kafka API and a cluster configuration. Redpanda resolves these conflicts by following an order of preference in finding a matching quota for a request: +To view quotas for a user: -. Quota configured through the Kafka API for an exact match on `client_id` -. Quota configured through the Kafka API for a prefix match on `client_id` -ifndef::env-cloud[] -. Quota configured through cluster configuration properties (`kafka_client_group_byte_rate_quota`, `kafka_client_group_fetch_byte_rate_quota`-deprecated in v24.2) for a prefix match on `client_id` -endif::[] -. Default quota configured through the Kafka API on `client_id` -ifndef::env-cloud[] -. Default quota configured through cluster configuration properties (`target_quota_byte_rate`, `target_fetch_quota_byte_rate`, `kafka_admin_topic_api_rate`-deprecated in v24.2) on `client_id` +[,bash] +---- +rpk cluster quotas describe --name user=alice +---- -Redpanda recommends <> over from cluster configuration-managed quotas to Kafka-compatible quotas. You can re-create the configuration-based quotas with `rpk`, and then remove the cluster configurations. 
-endif::[] +Expected output: + +[,bash,role=no-copy] +---- +user=alice + producer_byte_rate=2000000 +---- + +==== Default quota for all users + +To set a fallback quota for any user without a more specific quota: + +[,bash] +---- +rpk cluster quotas alter --add consumer_byte_rate=5000000 --default user +---- + +This applies a 5 MB/s fetch quota to all authenticated users who don't have a more specific quota configured. + +=== Remove a user quota + +To remove a quota for a specific user: + +[,bash] +---- +rpk cluster quotas alter --delete consumer_byte_rate --name user=alice +---- + +To remove all quotas for a user: + +[,bash] +---- +rpk cluster quotas delete --name user=alice +---- + +=== Set client ID-based quotas -=== Individual client throughput limit +Client ID-based quotas apply to all users using a specific client ID. These quotas do not require authentication. Because the client ID is self-declared, client ID-based quotas are not suitable for guaranteeing isolation between tenants. + +For multi-tenant environments, Redpanda recommends user-based quotas for per-tenant isolation. + +==== Individual client ID throughput limit ifdef::env-cloud[] NOTE: The following sections show how to manage throughput with `rpk`. You can also manage throughput with the link:https://docs.redpanda.com/api/doc/cloud-dataplane/operation/operation-quotaservice_listquotas[Redpanda Cloud Data Plane API]. @@ -531,7 +592,7 @@ client-id=consumer-1 ---- -To set a throughput quota for a single client, use the xref:reference:rpk/rpk-cluster/rpk-cluster-quotas-alter.adoc[`rpk cluster quotas alter`] command. +To set a throughput quota for a single client, use the xref:reference:rpk/rpk-cluster/rpk-cluster-quotas-alter.adoc[`rpk cluster quotas alter`] command. 
[,bash] ---- @@ -544,7 +605,7 @@ ENTITY STATUS client-id=consumer-1 OK ---- -=== Group of clients throughput limit +==== Group of clients throughput limit Alternatively, you can view or configure throughput quotas for a group of clients based on a match on client ID prefix. The following example sets the `consumer_byte_rate` quota to client IDs prefixed with `consumer-`: @@ -553,12 +614,11 @@ Alternatively, you can view or configure throughput quotas for a group of client rpk cluster quotas alter --add consumer_byte_rate=200000 --name client-id-prefix=consumer- ---- -NOTE: A client group specified with `client-id-prefix` is not the equivalent of a Kafka consumer group. It is used only to match requests based on the `client_id` prefix. The `client_id` field is typically a configurable property when you create a client with Kafka libraries. - +NOTE: A `client-id-prefix` quota group is not related to Kafka consumer groups. The client ID is an application-defined identifier sent with every request. Client libraries typically default to their own name (such as `kgo`, `rdkafka`, `sarama`, or `perf-producer-client`), but applications can set it using the https://kafka.apache.org/documentation/#consumerconfigs_client.id[`client.id`^] configuration property. This makes prefix-based quotas useful for grouping related applications (for example, `inventory-service-` to match `inventory-service-1`, `inventory-service-2`, etc.). -=== Default client throughput limit +==== Default client throughput limit -You can apply default throughput limits to clients. Redpanda applies the default limits if no quotas are configured for a specific `client_id` or prefix. +You can apply default throughput limits to clients. Redpanda applies the default limits if no quotas are configured for a specific client ID or prefix. 
To specify a produce quota of 1 GB/s through the Kafka API (applies across all produce requests to a single broker), run: @@ -567,94 +627,70 @@ To specify a produce quota of 1 GB/s through the Kafka API (applies across all p rpk cluster quotas alter --default client-id --add producer_byte_rate=1000000000 ---- -=== Bulk manage client throughput limits +=== Set combined user and client quotas -To more easily manage multiple quotas, you can use the `cluster quotas describe` and xref:reference:rpk/rpk-cluster/rpk-cluster-quotas-import.adoc[`cluster quotas import`] commands to do a bulk export and update. +You can set quotas for specific (user, client ID) combinations for fine-grained control. -For example, to export all client quotas in JSON format: +==== User with specific client + +To limit a specific user when using a specific client: [,bash] ---- -rpk cluster quotas describe --format json +rpk cluster quotas alter --add consumer_byte_rate=1000000 --name user=alice --name client-id=consumer-1 ---- -`rpk cluster quotas import` accepts the output string from `rpk cluster quotas describe --format `: +User `alice` using `client-id=consumer-1` is limited to a 1 MB/s fetch rate. The same user with a different client ID would use a different quota (or fall back to less specific matches). + +To view combined quotas: [,bash] ---- -rpk cluster quotas import --from '{"quotas":[{"entity":[{"name":"foo","type":"client-id"}],"values":[{"key":"consumer_byte_rate","values":"12123123"}]},{"entity":[{"name":"foo-","type":"client-id-prefix"}],"values":[{"key":"producer_byte_rate","values":"12123123"},{"key":"consumer_byte_rate","values":"4444444"}]}]}' +rpk cluster quotas describe --name user=alice --name client-id=consumer-1 ---- -You can also save the JSON or YAML output to a file and pass the file path in the `--from` flag. 
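+The `*_byte_rate` values in these commands are plain integers in bytes per second, using decimal units (so 1 MB/s is 1000000, and the 1 GB/s default above is 1000000000). A small helper like the following (illustrative; `mb_to_bps` is not part of `rpk`) can reduce conversion mistakes when preparing quota values:

```shell
# Convert a decimal megabytes-per-second value into the bytes-per-second
# integer expected by quota keys such as producer_byte_rate.
mb_to_bps() {
  echo $(( $1 * 1000000 ))
}

mb_to_bps 1     # prints 1000000
mb_to_bps 1000  # prints 1000000000 (1 GB/s)
```

+For example, `rpk cluster quotas alter --default client-id --add producer_byte_rate=$(mb_to_bps 1000)` is equivalent to the 1 GB/s default shown above.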
+==== User with client prefix -[[migrate]] -=== Migrate cluster configuration quotas to Kafka API-based quotas +To set a shared quota for a user across multiple clients matching a prefix: -. Use xref:reference:rpk/rpk-cluster/rpk-cluster-config-get.adoc[`rpk cluster config get`] to view current client quotas managed with cluster configuration. The following example shows how to retrieve the `kafka_client_group_byte_rate_quota` for two groups of producers: -+ [,bash] ---- -rpk cluster config get kafka_client_group_byte_rate_quota - ----- -+ -[,bash,role=no-copy] ----- -"kafka_client_group_byte_rate_quota": [ - { - "group_name": "group_1", - "clients_prefix": "producer_group_alone_producer", - "quota": 10240 - }, - { "group_name": "group_2", - "clients_prefix": "producer_group_multiple", - "quota": 20480 - } -] +rpk cluster quotas alter --add producer_byte_rate=3000000 --name user=bob --name client-id-prefix=app- ---- -ifndef::env-cloud[] -. Each client quota cluster property (xref:upgrade:deprecated/index.adoc[deprecated in v24.2]) corresponds to a quota type in Kafka. Check the corresponding `rpk` arguments to use when setting the new quota values: -+ -|=== -| Cluster configuration property | `rpk cluster quotas` arguments -| `target_quota_byte_rate` -| `--default client-id --add producer_byte_rate=` +All clients used by user `bob` with a client ID starting with `app-` share a combined 3 MB/s produce quota. 
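+To remove this shared quota later, delete the key against the same pair of entity names, using the same `alter --delete` form shown earlier for single-user quotas:

```shell
# Remove the shared produce quota for user bob on client IDs starting with app-.
rpk cluster quotas alter --delete producer_byte_rate --name user=bob --name client-id-prefix=app-
```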
-| `target_fetch_quota_byte_rate` -| `--default client-id --add consumer_byte_rate=` +==== Default user with specific client -| `kafka_admin_topic_api_rate` -| `--default client-id --add controller_mutation_rate=` +To set a quota for a specific client across all users: -| `kafka_client_group_byte_rate_quota` -| `--name client-id-prefix= --add producer_byte_rate=` +[,bash] +---- +rpk cluster quotas alter --add producer_byte_rate=500000 --default user --name client-id=payment-processor +---- -| `kafka_client_group_fetch_byte_rate_quota` -| `--name client-id-prefix= --add consumer_byte_rate=` +Any user using `client-id=payment-processor` is limited to a 500 KB/s produce rate, unless they have a more specific quota configured. -|=== -+ -The client throughput quotas set through the Kafka API apply per broker, so you must convert the cluster configuration values that were applied on a per-shard (logical CPU core) basis. For example, if you set `target_fetch_quota_byte_rate` to 100 MBps/shard, and you run Redpanda on 16-core brokers, you can set the new consumer_byte_rate quota to 100 * 16 = 1600 MBps. -endif::[] +=== Bulk manage client throughput limits + +To more easily manage multiple quotas, you can use the `cluster quotas describe` and xref:reference:rpk/rpk-cluster/rpk-cluster-quotas-import.adoc[`cluster quotas import`] commands to do a bulk export and update. + +For example, to export all client quotas in JSON format: -. Use `rpk cluster quotas alter` to set the corresponding client throughput quotas based on the Kafka API: -+ [,bash] ---- -rpk cluster quotas alter --name client-id-prefix=producer_group_alone_producer --add producer_byte_rate= -rpk cluster quotas alter --name client-id-prefix=producer_group_multiple --add producer_byte_rate= +rpk cluster quotas describe --format json ---- -+ -Replace the placeholder values with the new quota values, accounting for the conversion to per-broker limits. For example, 10240 * broker core count = new quota. -. 
Use xref:reference:rpk/rpk-cluster/rpk-cluster-config-set.adoc[`rpk cluster config set`] to remove the configuration-based quotas: -+ +`rpk cluster quotas import` accepts the output string from `rpk cluster quotas describe --format `: + [,bash] ---- -rpk cluster config set kafka_client_group_byte_rate_quota= +rpk cluster quotas import --from '{"quotas":[{"entity":[{"name":"analytics-consumer","type":"client-id"}],"values":[{"key":"consumer_byte_rate","values":"10000000"}]},{"entity":[{"name":"analytics-","type":"client-id-prefix"}],"values":[{"key":"producer_byte_rate","values":"10000000"},{"key":"consumer_byte_rate","values":"5000000"}]}]}' ---- +You can also save the JSON or YAML output to a file and pass the file path in the `--from` flag. + === View throughput limits in {ui} You can also use {ui} to view enforced limits. In the side menu, go to **Quotas**. @@ -674,6 +710,11 @@ ifndef::env-cloud[] ** `/metrics` - xref:reference:internal-metrics-reference.adoc#vectorized_kafka_quotas_client_quota_throttle_time[`vectorized_kafka_quotas_client_quota_throttle_time`] endif::[] +To identify which clients are actively connected and generating traffic, see <>. + +Quota metrics use the `redpanda_quota_rule` label to identify which quota was applied to a request. The label distinguishes between different entity types (user, client, or combinations). See the label values in xref:reference:public-metrics-reference.adoc#redpanda_kafka_quotas_client_quota_throughput[`redpanda_kafka_quotas_client_quota_throughput`]. 
+ +ifndef::env-cloud[] The `kafka_quotas` logger provides details at the trace level on client quota throttling: [,bash] @@ -684,9 +725,12 @@ TRACE 2024-06-14 15:37:44,835 [shard 2:main] kafka_quotas - quota_manager.cc:36 TRACE 2024-06-14 15:37:59,195 [shard 2:main] kafka_quotas - quota_manager.cc:361 - request: ctx:{quota_type: produce_quota, client_id: {rpk}}, key:k_client_id{rpk}, value:{limit: {1111}, rule: kafka_client_default}, bytes: 1316, delay:184518451ns, capped_delay:184518451ns TRACE 2024-06-14 15:37:59,195 [shard 2:main] kafka_quotas - connection_context.cc:605 - [127.0.0.1:58636] throttle request:{snc:0, client:184}, enforce:{snc:-14359, client:-14359}, key:0, request_size:1316 ---- +endif::[] == See also -- xref:manage:cluster-maintenance/configure-client-connections.adoc[Configure Client Connections] +- xref:manage:cluster-maintenance/about-throughput-quotas.adoc[] +- xref:manage:cluster-maintenance/configure-client-connections.adoc[] +- xref:{authentication-doc}[] // end::single-source[]