From c700f069fabf57d8c486506944674cdae754b5e9 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 18:47:00 +0000
Subject: [PATCH 01/11] docs: add API key rate limits guide for Kubernetes
 installation

Add documentation for the per-API-key rate limiting feature in the Runtime API.
This feature was implemented in OpenHands/runtime-api PR #457 (APP-1117).

The guide covers:
- How rate limiting works (per-key, fixed window strategy)
- Configuring rate limits when creating or updating API keys
- API reference for CreateApiKeyRequest, UpdateApiKeyRequest, ApiKeyResponse
- Recommended configurations for different use cases
- Monitoring and troubleshooting rate limit issues

Co-authored-by: openhands <openhands@all-hands.dev>
---
 docs.json                              |   3 +-
 enterprise/k8s-install/index.mdx       |  11 +-
 enterprise/k8s-install/rate-limits.mdx | 217 +++++++++++++++++++++++++
 3 files changed, 227 insertions(+), 4 deletions(-)
 create mode 100644 enterprise/k8s-install/rate-limits.mdx
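
The guide added below describes a per-key fixed window strategy in which the counter resets at the start of each minute and a key without `max_requests_per_minute` is not limited. The following is a minimal sketch of that idea, not the Runtime API's actual implementation. The class name `FixedWindowLimiter`, its `allow` method, and the injectable `clock` are hypothetical, and the sketch assumes windows are aligned to whole minutes of wall-clock time.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowLimiter:
    """Hypothetical per-key fixed-window counter (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key name -> (window index, requests counted in that window)
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str, max_requests_per_minute: Optional[int]) -> bool:
        """Return True if the request fits in the current window."""
        # A limit of None mirrors the documented behaviour: no rate limiting.
        if max_requests_per_minute is None:
            return True

        # Fixed window: every timestamp in the same minute shares one index.
        window = int(self._clock() // 60)
        seen_window, count = self._counters.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new minute started, so the counter resets

        if count >= max_requests_per_minute:
            return False  # the caller would answer with HTTP 429

        self._counters[key] = (window, count + 1)
        return True
```

One consequence of a fixed window is that a client can send a full quota at the end of one minute and another full quota at the start of the next, so short bursts of up to twice the configured value are possible.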
diff --git a/docs.json b/docs.json
index d4261a9b6..bf9d4119c 100644
--- a/docs.json
+++ b/docs.json
@@ -412,7 +412,8 @@
             "group": "K8s Install",
             "pages": [
               "enterprise/k8s-install/index",
-              "enterprise/k8s-install/resource-limits"
+              "enterprise/k8s-install/resource-limits",
+              "enterprise/k8s-install/rate-limits"
             ]
           },
           {
diff --git a/enterprise/k8s-install/index.mdx b/enterprise/k8s-install/index.mdx
index db70d66a8..cf4208ef0 100644
--- a/enterprise/k8s-install/index.mdx
+++ b/enterprise/k8s-install/index.mdx
@@ -50,9 +50,14 @@ OpenHands Enterprise consists of several components deployed as Kubernetes workl
 
 ## Guides
 
-<Card title="Resource Limits" icon="gauge-high" href="/enterprise/k8s-install/resource-limits">
-  Configure memory, CPU, and storage for optimal performance.
-</Card>
+<CardGroup cols={2}>
+  <Card title="Resource Limits" icon="gauge-high" href="/enterprise/k8s-install/resource-limits">
+    Configure memory, CPU, and storage for optimal performance.
+  </Card>
+  <Card title="API Key Rate Limits" icon="gauge" href="/enterprise/k8s-install/rate-limits">
+    Configure per-API-key rate limits for the Runtime API.
+  </Card>
+</CardGroup>
 
 ## Request Access
 
diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
new file mode 100644
index 000000000..26f99ca0c
--- /dev/null
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -0,0 +1,217 @@
+---
+title: API Key Rate Limits
+description: Configure per-API-key rate limits for the Runtime API
+icon: gauge
+---
+
+This guide explains how to configure rate limits for API keys in OpenHands Enterprise.
+Rate limiting helps prevent abuse and ensures fair resource allocation across users
+and automated systems.
+
+## Overview
+
+The Runtime API supports **per-API-key rate limiting**, allowing you to set different
+rate limits for different API keys based on their use case. This is useful for:
+
+- Protecting against runaway automation scripts
+- Allocating higher limits to production integrations
+- Setting conservative limits for evaluation/test keys
+- Preventing a single user from overwhelming the system
+
+## How Rate Limiting Works
+
+Rate limiting is applied at the authentication layer. When a request comes in:
+
+1. The API key is validated
+2. If the key has `max_requests_per_minute` configured, that limit is enforced
+3. If no limit is configured, no rate limiting is applied
+4. Requests exceeding the limit receive HTTP 429 (Too Many Requests)
+
+<Info>
+  Rate limits are tracked per API key using a fixed window strategy. The counter
+  resets at the start of each minute.
+</Info>
+
+## Configuring Rate Limits
+
+### When Creating an API Key
+
+You can set a rate limit when creating a new API key via the management API:
+
+```bash
+curl -X POST https://your-runtime-api/api/keys \
+  -H "Content-Type: application/json" \
+  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
+  -d '{
+    "name": "my-production-key",
+    "max_runtimes": 10,
+    "max_requests_per_minute": 500,
+    "initial_credits": 100.0,
+    "key_type": "evaluation"
+  }'
+```
+
+### Updating an Existing API Key
+
+To add or modify the rate limit on an existing key:
+
+```bash
+curl -X PUT https://your-runtime-api/api/keys/{key_id} \
+  -H "Content-Type: application/json" \
+  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
+  -d '{
+    "max_requests_per_minute": 1000
+  }'
+```
+
+### Removing a Rate Limit
+
+To remove a rate limit from an API key (allowing unlimited requests):
+
+```bash
+curl -X PUT https://your-runtime-api/api/keys/{key_id} \
+  -H "Content-Type: application/json" \
+  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
+  -d '{
+    "max_requests_per_minute": null
+  }'
+```
+
+## API Reference
+
+### CreateApiKeyRequest
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `name` | string | (required) | Unique identifier for the API key |
+| `max_runtimes` | integer | null | Maximum concurrent runtimes allowed |
+| `max_requests_per_minute` | integer | null | Rate limit (requests per minute) |
+| `initial_credits` | float | 0.0 | Starting credit balance |
+| `key_type` | string | "evaluation" | Key type: "evaluation" or "production" |
+| `is_test_key` | boolean | false | Mark as a test key |
+
+### UpdateApiKeyRequest
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `name` | string | Update the key name |
+| `max_runtimes` | integer | Update max concurrent runtimes |
+| `max_requests_per_minute` | integer | Update rate limit |
+| `remaining_credits` | float | Update credit balance |
+
+### ApiKeyResponse
+
+The API returns the full key configuration including the rate limit:
+
+```json
+{
+  "id": 1,
+  "name": "my-production-key",
+  "key_value": "sk-...",
+  "max_runtimes": 10,
+  "max_requests_per_minute": 500,
+  "remaining_credits": 100.0,
+  "key_type": "evaluation",
+  "is_test_key": false
+}
+```
+
+## Rate Limit Responses
+
+When a rate limit is exceeded, the API returns:
+
+```
+HTTP/1.1 429 Too Many Requests
+Content-Type: application/json
+
+{
+  "detail": "Rate limit exceeded: 500/minute"
+}
+```
+
+<Warning>
+  Clients should implement exponential backoff when receiving 429 responses.
+  Continued requests during rate limiting may result in extended blocking.
+</Warning>
+
+## Recommended Configurations
+
+### Evaluation/Testing Keys
+
+For keys used in development, testing, or evaluation:
+
+```json
+{
+  "name": "eval-key",
+  "max_requests_per_minute": 100,
+  "key_type": "evaluation",
+  "is_test_key": true
+}
+```
+
+### Production Integration Keys
+
+For automated CI/CD pipelines or production applications:
+
+```json
+{
+  "name": "ci-cd-production",
+  "max_requests_per_minute": 500,
+  "key_type": "production"
+}
+```
+
+### High-Volume Automation Keys
+
+For trusted systems requiring higher throughput:
+
+```json
+{
+  "name": "batch-processor",
+  "max_requests_per_minute": 1000,
+  "max_runtimes": 20
+}
+```
+
+## Monitoring Rate Limits
+
+The Runtime API logs rate limit events. Monitor your logs for entries like:
+
+```
+Rate limit exceeded for my-production-key at /start
+```
+
+For production deployments, we recommend:
+
+- Setting up alerts for frequent 429 responses
+- Monitoring request patterns per API key
+- Reviewing rate limit configurations periodically
+
+## Troubleshooting
+
+### "Rate limit exceeded" errors for legitimate traffic
+
+If legitimate requests are being rate limited:
+
+1. **Check current configuration**: Retrieve the API key to see current limits
+2. **Analyze request patterns**: Review logs to understand request frequency
+3. **Adjust limits**: Increase `max_requests_per_minute` as needed
+
+### Requests not being rate limited
+
+If rate limiting doesn't appear to be working:
+
+1. **Verify the key has a limit set**: Check that `max_requests_per_minute` is not null
+2. **Check the correct key is being used**: Verify the `X-API-Key` header matches
+3. **Review runtime-api logs**: Look for rate limiting log entries
+
+## Related Configuration
+
+<CardGroup cols={2}>
+  <Card title="Resource Limits" icon="gauge-high" href="/enterprise/k8s-install/resource-limits">
+    Configure memory, CPU, and storage limits for sandboxes.
+  </Card>
+  <Card title="K8s Install Overview" icon="dharmachakra" href="/enterprise/k8s-install/index">
+    Return to the Kubernetes installation overview.
+  </Card>
+</CardGroup>

From 27a7f7e13251c4841d13137e921d91d34f58eed5 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 19:11:24 +0000
Subject: [PATCH 02/11] docs: clarify that no rate limiting is applied when
 max_requests_per_minute is not set

Add warning to make it clear that API keys without max_requests_per_minute
configured will have no rate limiting applied.

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index 26f99ca0c..80a2061fe 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -24,8 +24,13 @@ Rate limiting is applied at the authentication layer. When a request comes in:
 
 1. The API key is validated
 2. If the key has `max_requests_per_minute` configured, that limit is enforced
-3. If no limit is configured, no rate limiting is applied
-4. Requests exceeding the limit receive HTTP 429 (Too Many Requests)
+3. Requests exceeding the limit receive HTTP 429 (Too Many Requests)
+
+<Warning>
+  If `max_requests_per_minute` is not set on an API key, **no rate limiting is applied**
+  to requests using that key. Ensure all production API keys have appropriate rate limits
+  configured to prevent abuse.
+</Warning>
 
 <Info>
   Rate limits are tracked per API key using a fixed window strategy. The counter

From 8de149ffcd207d125c2d15c7a7409db96aedab46 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 19:20:28 +0000
Subject: [PATCH 03/11] docs: remove recommended configurations section

Remove examples that might imply specific default values.

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 39 --------------------------
 1 file changed, 39 deletions(-)

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index 80a2061fe..21c77ebcb 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -139,45 +139,6 @@ Content-Type: application/json
   Continued requests during rate limiting may result in extended blocking.
 </Warning>
 
-## Recommended Configurations
-
-### Evaluation/Testing Keys
-
-For keys used in development, testing, or evaluation:
-
-```json
-{
-  "name": "eval-key",
-  "max_requests_per_minute": 100,
-  "key_type": "evaluation",
-  "is_test_key": true
-}
-```
-
-### Production Integration Keys
-
-For automated CI/CD pipelines or production applications:
-
-```json
-{
-  "name": "ci-cd-production",
-  "max_requests_per_minute": 500,
-  "key_type": "production"
-}
-```
-
-### High-Volume Automation Keys
-
-For trusted systems requiring higher throughput:
-
-```json
-{
-  "name": "batch-processor",
-  "max_requests_per_minute": 1000,
-  "max_runtimes": 20
-}
-```
-
 ## Monitoring Rate Limits
 
 The Runtime API logs rate limit events. Monitor your logs for entries like:

From d14ed41e42ed3d55e71b9104a023e654589e97d6 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 19:22:15 +0000
Subject: [PATCH 04/11] docs: replace troubleshooting with focused logging
 section

Remove generic troubleshooting advice and add specific details about
the log message format and log level (WARNING) for rate limit events.

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 34 +++++++-------------------
 1 file changed, 9 insertions(+), 25 deletions(-)
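
Because this patch pins down the log message format, rate limit events can be counted from captured logs. The sketch below is one hedged way to do that. The function name `count_rate_limit_events` is hypothetical, and the pattern assumes that the endpoint path starts with `/` and that any timestamp or level prefix on the line can be ignored.

```python
import re
from collections import Counter
from typing import Iterable

# Matches "Rate limit exceeded for {api_key_name} at {endpoint_path}".
# Assumption: endpoint paths begin with "/" and contain no whitespace.
RATE_LIMIT_RE = re.compile(
    r"Rate limit exceeded for (?P<key>.+) at (?P<path>/\S*)"
)


def count_rate_limit_events(lines: Iterable[str]) -> Counter:
    """Count rate limit warnings per (key name, endpoint path) pair."""
    counts: Counter = Counter()
    for line in lines:
        match = RATE_LIMIT_RE.search(line)
        if match:
            counts[(match.group("key"), match.group("path"))] += 1
    return counts
```

The lines could come from any log source, for example standard input piped from a log viewer. Only lines containing the warning text contribute to the counts.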

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index 21c77ebcb..d5331ee9c 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -139,37 +139,21 @@ Content-Type: application/json
   Continued requests during rate limiting may result in extended blocking.
 </Warning>
 
-## Monitoring Rate Limits
+## Logging
 
-The Runtime API logs rate limit events. Monitor your logs for entries like:
+When a rate limit is exceeded, the Runtime API logs a warning message:
 
 ```
-Rate limit exceeded for my-production-key at /start
+Rate limit exceeded for {api_key_name} at {endpoint_path}
 ```
 
-For production deployments, we recommend:
-
-- Setting up alerts for frequent 429 responses
-- Monitoring request patterns per API key
-- Reviewing rate limit configurations periodically
-
-## Troubleshooting
-
-### "Rate limit exceeded" errors for legitimate traffic
-
-If legitimate requests are being rate limited:
-
-1. **Check current configuration**: Retrieve the API key to see current limits
-2. **Analyze request patterns**: Review logs to understand request frequency
-3. **Adjust limits**: Increase `max_requests_per_minute` as needed
-
-### Requests not being rate limited
-
-If rate limiting doesn't appear to be working:
+For example:
+```
+Rate limit exceeded for my-production-key at /start
+```
 
-1. **Verify the key has a limit set**: Check that `max_requests_per_minute` is not null
-2. **Check the correct key is being used**: Verify the `X-API-Key` header matches
-3. **Review runtime-api logs**: Look for rate limiting log entries
+This message is logged at the `WARNING` level. To capture these events, ensure your
+logging configuration includes `WARNING` level or higher for the Runtime API.
 
 ## Related Configuration
 

From f5dc430325e4b8234394de8b56ddffdcb9adf11e Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 19:36:45 +0000
Subject: [PATCH 05/11] fix: address review comments for API key rate limits
 documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Addressed all critical accuracy issues identified in PR review:

- Fixed API endpoints: /api/keys → /api/admin/api-keys
- Fixed authentication mechanism: documented JWT challenge-response
  flow instead of incorrect X-Admin-Key header
- Fixed key_type enum values: documented 'evaluation' and 'openhands_app'
  instead of invalid 'production' value
- Fixed id field type: UUID string instead of integer
- Fixed API key prefix: 'ah-' instead of 'sk-'
- Fixed rate limit error message format: '500 per 1 minute' instead of
  '500/minute' to match limits library output

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 80 ++++++++++++++++++++++----
 1 file changed, 70 insertions(+), 10 deletions(-)

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index d5331ee9c..6236fa547 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -37,6 +37,66 @@ Rate limiting is applied at the authentication layer. When a request comes in:
   resets at the start of each minute.
 </Info>
 
+## Admin Authentication
+
+The Runtime API uses a challenge-response authentication protocol with PBKDF2 hashing
+for admin operations. Before making admin API calls, you must obtain a JWT token.
+
+### Step 1: Get a Challenge
+
+Request a challenge from the server:
+
+```bash
+curl -X GET https://your-runtime-api/api/admin/challenge
+```
+
+Response:
+```json
+{
+  "challenge": "550e8400-e29b-41d4-a716-446655440000",
+  "salt": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4",
+  "iterations": 10000
+}
+```
+
+### Step 2: Compute the Hash and Login
+
+Compute a PBKDF2-SHA256 hash of your admin password using the provided salt and challenge,
+then submit it to obtain a JWT token:
+
+```bash
+# The hash is computed as: PBKDF2-SHA256(password, salt + challenge, iterations, 32 bytes)
+# This example assumes you have computed the hash value
+
+curl -X POST https://your-runtime-api/api/admin/login \
+  -H "Content-Type: application/json" \
+  -d '{
+    "challenge": "550e8400-e29b-41d4-a716-446655440000",
+    "hash": "your_computed_pbkdf2_hash_hex"
+  }'
+```
+
+Response:
+```json
+{
+  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
+}
+```
+
+### Step 3: Use the Token
+
+Include the JWT token in subsequent API requests:
+
+```bash
+curl -X GET https://your-runtime-api/api/admin/api-keys \
+  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
+```
+
+<Info>
+  JWT tokens expire after 24 hours. If you receive a 401 error, obtain a new token
+  by repeating the challenge-response flow.
+</Info>
+
 ## Configuring Rate Limits
 
 ### When Creating an API Key
@@ -44,9 +104,9 @@ Rate limiting is applied at the authentication layer. When a request comes in:
 You can set a rate limit when creating a new API key via the management API:
 
 ```bash
-curl -X POST https://your-runtime-api/api/keys \
+curl -X POST https://your-runtime-api/api/admin/api-keys \
   -H "Content-Type: application/json" \
-  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
+  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
   -d '{
     "name": "my-production-key",
     "max_runtimes": 10,
@@ -61,9 +121,9 @@ curl -X POST https://your-runtime-api/api/keys \
 To add or modify the rate limit on an existing key:
 
 ```bash
-curl -X PUT https://your-runtime-api/api/keys/{key_id} \
+curl -X PUT https://your-runtime-api/api/admin/api-keys/{key_id} \
   -H "Content-Type: application/json" \
-  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
+  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
   -d '{
     "max_requests_per_minute": 1000
   }'
@@ -74,9 +134,9 @@ curl -X PUT https://your-runtime-api/api/keys/{key_id} \
 To remove a rate limit from an API key (allowing unlimited requests):
 
 ```bash
-curl -X PUT https://your-runtime-api/api/keys/{key_id} \
+curl -X PUT https://your-runtime-api/api/admin/api-keys/{key_id} \
   -H "Content-Type: application/json" \
-  -H "X-Admin-Key: YOUR_ADMIN_KEY" \
+  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
   -d '{
     "max_requests_per_minute": null
   }'
@@ -92,7 +152,7 @@ curl -X PUT https://your-runtime-api/api/keys/{key_id} \
 | `max_runtimes` | integer | null | Maximum concurrent runtimes allowed |
 | `max_requests_per_minute` | integer | null | Rate limit (requests per minute) |
 | `initial_credits` | float | 0.0 | Starting credit balance |
-| `key_type` | string | "evaluation" | Key type: "evaluation" or "production" |
+| `key_type` | string | "evaluation" | Key type: `"evaluation"` or `"openhands_app"` |
 | `is_test_key` | boolean | false | Mark as a test key |
 
 ### UpdateApiKeyRequest
@@ -110,9 +170,9 @@ The API returns the full key configuration including the rate limit:
 
 ```json
 {
-  "id": 1,
+  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
   "name": "my-production-key",
-  "key_value": "sk-...",
+  "key_value": "ah-12345678-1234-1234-1234-123456789abc",
   "max_runtimes": 10,
   "max_requests_per_minute": 500,
   "remaining_credits": 100.0,
@@ -130,7 +190,7 @@ HTTP/1.1 429 Too Many Requests
 Content-Type: application/json
 
 {
-  "detail": "Rate limit exceeded: 500/minute"
+  "detail": "Rate limit exceeded: 500 per 1 minute"
 }
 ```
 

From 71dd32e54d5300e4b824ce5c2f23119a4f67b0a4 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 20:06:04 +0000
Subject: [PATCH 06/11] docs: rewrite rate limits guide for admin audience
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Completely rewrote the documentation to be practical for cluster administrators:

- Added clear explanation of the internal API key architecture (OpenHands
  Server → Runtime API, not user-facing)
- Clarified that this is separate from user API keys (sk-oh-*)
- Added step-by-step kubectl commands to retrieve the admin password
- Provided complete bash script for the PBKDF2 authentication flow
- Showed how to find the existing 'default' key ID
- Added guidance on choosing appropriate rate limit values
- Included troubleshooting section with kubectl log commands
- Removed API reference tables (not needed for this single-key workflow)

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 289 +++++++++++++------------
 1 file changed, 148 insertions(+), 141 deletions(-)
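
The shell script added in this patch hardcodes 10000 PBKDF2 iterations and interpolates the admin password into Python source text. The challenge response documented in the previous patch also carries an `iterations` field, so a variant that reads it from the response may be more robust. The sketch below assumes the same derivation as the script (PBKDF2-SHA256 over the password, with `salt + challenge` encoded as UTF-8 and a 32-byte output). The function names `compute_login_hash` and `build_login_body` are hypothetical.

```python
import hashlib
import json


def compute_login_hash(password: str, salt: str, challenge: str, iterations: int) -> str:
    """Hex-encoded PBKDF2-SHA256 of the password, salted with salt + challenge."""
    combined_salt = (salt + challenge).encode("utf-8")
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), combined_salt, iterations, dklen=32
    )
    return derived.hex()


def build_login_body(challenge_response: dict, password: str) -> str:
    """Build the JSON body for the login request from a parsed challenge response."""
    digest = compute_login_hash(
        password,
        challenge_response["salt"],
        challenge_response["challenge"],
        # Taken from the server's response rather than hardcoded.
        int(challenge_response["iterations"]),
    )
    return json.dumps({"challenge": challenge_response["challenge"], "hash": digest})
```

Passing the password as a function argument, for example read from an environment variable, avoids breaking the script when the password contains quote characters.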

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index 6236fa547..5f3e293f4 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -4,216 +4,223 @@ description: Configure per-API-key rate limits for the Runtime API
 icon: gauge
 ---
 
-This guide explains how to configure rate limits for API keys in OpenHands Enterprise.
-Rate limiting helps prevent abuse and ensures fair resource allocation across users
-and automated systems.
+This guide explains how to configure rate limits for the internal API key that
+connects the OpenHands server to the Runtime API. This is an **administrator task**
+typically performed after initial deployment if you need to enforce request limits.
 
-## Overview
+## Background
 
-The Runtime API supports **per-API-key rate limiting**, allowing you to set different
-rate limits for different API keys based on their use case. This is useful for:
+OpenHands Enterprise uses an internal API key to authenticate requests between two
+backend services:
 
-- Protecting against runaway automation scripts
-- Allocating higher limits to production integrations
-- Setting conservative limits for evaluation/test keys
-- Preventing a single user from overwhelming the system
+- **OpenHands Server** — the main application that users interact with
+- **Runtime API** — the service that manages sandbox containers
 
-## How Rate Limiting Works
-
-Rate limiting is applied at the authentication layer. When a request comes in:
-
-1. The API key is validated
-2. If the key has `max_requests_per_minute` configured, that limit is enforced
-3. Requests exceeding the limit receive HTTP 429 (Too Many Requests)
+```
+Users → OpenHands Server → (internal API key) → Runtime API → Sandboxes
+```
 
-<Warning>
-  If `max_requests_per_minute` is not set on an API key, **no rate limiting is applied**
-  to requests using that key. Ensure all production API keys have appropriate rate limits
-  configured to prevent abuse.
-</Warning>
+During installation, you created two Kubernetes secrets that hold the same key value:
+- `sandbox-api-key` — used by the OpenHands Server
+- `default-api-key` — used by the Runtime API
 
 <Info>
-  Rate limits are tracked per API key using a fixed window strategy. The counter
-  resets at the start of each minute.
+  This internal API key is **not** the same as user API keys (which start with `sk-oh-`).
+  Users never see or interact with this internal key.
 </Info>
 
-## Admin Authentication
-
-The Runtime API uses a challenge-response authentication protocol with PBKDF2 hashing
-for admin operations. Before making admin API calls, you must obtain a JWT token.
+## Default Behavior
 
-### Step 1: Get a Challenge
+By default, the internal API key has **no rate limit**. This means the OpenHands Server
+can make unlimited requests to the Runtime API.
 
-Request a challenge from the server:
+You may want to add a rate limit if:
+- You're experiencing resource contention in the Runtime API
+- You want to prevent runaway automation from overwhelming the system
+- You need to enforce fair usage across multiple OpenHands Server instances
 
-```bash
-curl -X GET https://your-runtime-api/api/admin/challenge
-```
+## How Rate Limiting Works
 
-Response:
-```json
-{
-  "challenge": "550e8400-e29b-41d4-a716-446655440000",
-  "salt": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4",
-  "iterations": 10000
-}
-```
+When configured, rate limiting is enforced per API key using a **fixed window** strategy:
 
-### Step 2: Compute the Hash and Login
+1. Each API key can have a `max_requests_per_minute` value
+2. Requests are counted within each 60-second window
+3. Requests exceeding the limit receive HTTP 429 (Too Many Requests)
 
-Compute a PBKDF2-SHA256 hash of your admin password using the provided salt and challenge,
-then submit it to obtain a JWT token:
+If `max_requests_per_minute` is not set (the default), no rate limiting is applied.
 
-```bash
-# The hash is computed as: PBKDF2-SHA256(password, salt + challenge, iterations, 32 bytes)
-# This example assumes you have computed the hash value
+## Configuring a Rate Limit
 
-curl -X POST https://your-runtime-api/api/admin/login \
-  -H "Content-Type: application/json" \
-  -d '{
-    "challenge": "550e8400-e29b-41d4-a716-446655440000",
-    "hash": "your_computed_pbkdf2_hash_hex"
-  }'
-```
+To add a rate limit to your existing deployment, you'll need to:
 
-Response:
-```json
-{
-  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
-}
-```
+1. Get the admin password from your Kubernetes secrets
+2. Authenticate to the Runtime API admin interface
+3. Find your API key's ID
+4. Update the key with a rate limit
 
-### Step 3: Use the Token
+### Prerequisites
 
-Include the JWT token in subsequent API requests:
+You'll need:
+- `kubectl` access to your OpenHands namespace
+- The Runtime API hostname (e.g., `runtimes.openhands.example.com`)
+- `curl` and `python3` installed locally
 
-```bash
-curl -X GET https://your-runtime-api/api/admin/api-keys \
-  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
-```
+### Step 1: Get the Admin Password
 
-<Info>
-  JWT tokens expire after 24 hours. If you receive a 401 error, obtain a new token
-  by repeating the challenge-response flow.
-</Info>
+The admin password is stored in a Kubernetes secret created during installation:
 
-## Configuring Rate Limits
+```bash
+# Get the admin password
+ADMIN_PASSWORD=$(kubectl get secret admin-password -n openhands \
+  -o jsonpath='{.data.admin-password}' | base64 -d)
 
-### When Creating an API Key
+# Verify you got it (should print a 32-character string)
+echo $ADMIN_PASSWORD
+```
 
-You can set a rate limit when creating a new API key via the management API:
+### Step 2: Set Your Runtime API URL
 
 ```bash
-curl -X POST https://your-runtime-api/api/admin/api-keys \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
-  -d '{
-    "name": "my-production-key",
-    "max_runtimes": 10,
-    "max_requests_per_minute": 500,
-    "initial_credits": 100.0,
-    "key_type": "evaluation"
-  }'
+# Replace with your actual Runtime API hostname
+RUNTIME_API_URL="https://runtimes.openhands.example.com"
 ```
 
-### Updating an Existing API Key
+### Step 3: Authenticate and Get a JWT Token
 
-To add or modify the rate limit on an existing key:
+The Runtime API uses a challenge-response authentication flow. This script handles
+the PBKDF2 hash computation and login:
 
 ```bash
-curl -X PUT https://your-runtime-api/api/admin/api-keys/{key_id} \
+# Get a challenge from the server
+CHALLENGE_RESPONSE=$(curl -s "$RUNTIME_API_URL/api/admin/challenge")
+CHALLENGE=$(echo "$CHALLENGE_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['challenge'])")
+SALT=$(echo "$CHALLENGE_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['salt'])")
+
+# Compute the PBKDF2 hash
+HASH=$(python3 -c "
+import hashlib, binascii
+password = '$ADMIN_PASSWORD'
+salt = '$SALT'
+challenge = '$CHALLENGE'
+combined_salt = (salt + challenge).encode('utf-8')
+dk = hashlib.pbkdf2_hmac('sha256', password.encode(), combined_salt, 10000, dklen=32)
+print(binascii.hexlify(dk).decode())
+")
+
+# Login and get the JWT token
+TOKEN=$(curl -s -X POST "$RUNTIME_API_URL/api/admin/login" \
   -H "Content-Type: application/json" \
-  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
-  -d '{
-    "max_requests_per_minute": 1000
-  }'
+  -d "{\"challenge\": \"$CHALLENGE\", \"hash\": \"$HASH\"}" \
+  | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])")
+
+# Verify you got a token (should print a long JWT string)
+echo $TOKEN
 ```
 
-### Removing a Rate Limit
+### Step 4: Find Your API Key ID
 
-To remove a rate limit from an API key (allowing unlimited requests):
+List all API keys to find the ID of your "default" key:
 
 ```bash
-curl -X PUT https://your-runtime-api/api/admin/api-keys/{key_id} \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
-  -d '{
-    "max_requests_per_minute": null
-  }'
+curl -s "$RUNTIME_API_URL/api/admin/api-keys" \
+  -H "Authorization: Bearer $TOKEN" | python3 -m json.tool
 ```
 
-## API Reference
+You should see output like:
+
+```json
+[
+  {
+    "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
+    "name": "default",
+    "key_value": "your-secret-key-value",
+    "max_runtimes": null,
+    "max_requests_per_minute": null,
+    "remaining_credits": null
+  }
+]
+```
 
-### CreateApiKeyRequest
+Copy the `id` value for the "default" key.
 
-| Field | Type | Default | Description |
-|-------|------|---------|-------------|
-| `name` | string | (required) | Unique identifier for the API key |
-| `max_runtimes` | integer | null | Maximum concurrent runtimes allowed |
-| `max_requests_per_minute` | integer | null | Rate limit (requests per minute) |
-| `initial_credits` | float | 0.0 | Starting credit balance |
-| `key_type` | string | "evaluation" | Key type: `"evaluation"` or `"openhands_app"` |
-| `is_test_key` | boolean | false | Mark as a test key |
+### Step 5: Update the Rate Limit
 
-### UpdateApiKeyRequest
+Set a rate limit on the key (replace `YOUR_KEY_ID` with the actual ID):
 
-| Field | Type | Description |
-|-------|------|-------------|
-| `name` | string | Update the key name |
-| `max_runtimes` | integer | Update max concurrent runtimes |
-| `max_requests_per_minute` | integer | Update rate limit |
-| `remaining_credits` | float | Update credit balance |
+```bash
+KEY_ID="a1b2c3d4-e5f6-7890-abcd-ef1234567890"  # Replace with your key ID
 
-### ApiKeyResponse
+curl -X PUT "$RUNTIME_API_URL/api/admin/api-keys/$KEY_ID" \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $TOKEN" \
+  -d '{"max_requests_per_minute": 500}'
+```
 
-The API returns the full key configuration including the rate limit:
+A successful response returns the updated key configuration:
 
 ```json
 {
   "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
-  "name": "my-production-key",
-  "key_value": "ah-12345678-1234-1234-1234-123456789abc",
-  "max_runtimes": 10,
+  "name": "default",
+  "key_value": "your-secret-key-value",
+  "max_runtimes": null,
   "max_requests_per_minute": 500,
-  "remaining_credits": 100.0,
-  "key_type": "evaluation",
-  "is_test_key": false
+  "remaining_credits": null
 }
 ```
 
-## Rate Limit Responses
+### Removing a Rate Limit
 
-When a rate limit is exceeded, the API returns:
+To remove the rate limit and allow unlimited requests:
 
+```bash
+curl -X PUT "$RUNTIME_API_URL/api/admin/api-keys/$KEY_ID" \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $TOKEN" \
+  -d '{"max_requests_per_minute": null}'
 ```
-HTTP/1.1 429 Too Many Requests
-Content-Type: application/json
 
-{
-  "detail": "Rate limit exceeded: 500 per 1 minute"
-}
-```
+## Choosing a Rate Limit Value
+
+The appropriate rate limit depends on your usage patterns:
+
+| Scenario | Suggested Limit |
+|----------|-----------------|
+| Small team (< 10 concurrent users) | 200-300 req/min |
+| Medium deployment (10-50 users) | 500-1000 req/min |
+| Large deployment or heavy automation | 1000+ req/min |
 
 <Warning>
-  Clients should implement exponential backoff when receiving 429 responses.
-  Continued requests during rate limiting may result in extended blocking.
+  Setting the limit too low can cause sandbox operations to fail with 429 errors.
+  Monitor your Runtime API logs after making changes.
 </Warning>
 
-## Logging
+## Troubleshooting
 
-When a rate limit is exceeded, the Runtime API logs a warning message:
+### Checking Current Rate Limit Status
 
+View the Runtime API logs to see rate limit events:
+
+```bash
+kubectl logs -l app.kubernetes.io/name=runtime-api -n openhands --tail=100 | grep -i "rate limit"
 ```
-Rate limit exceeded for {api_key_name} at {endpoint_path}
-```
 
-For example:
+When a rate limit is exceeded, you'll see messages like:
+
 ```
-Rate limit exceeded for my-production-key at /start
+Rate limit exceeded for default at /start
 ```
 
-This message is logged at the `WARNING` level. To capture these events, ensure your
-logging configuration includes `WARNING` level or higher for the Runtime API.
+### Common Issues
+
+**429 errors after setting a limit**: Your limit may be too low. Check the logs to see
+how many requests are being made, then adjust the limit accordingly.
+
+**Authentication failures**: JWT tokens expire after 24 hours. If you get 401 errors,
+repeat the authentication steps to get a new token.
+
+**"Admin functionality is disabled" error**: The `ADMIN_PASSWORD` environment variable
+may not be set in the Runtime API deployment. Check the deployment configuration.
 
 ## Related Configuration
 

From 939b8ecf062d21557e7aeccb3546b624c4d053dc Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 20:27:19 +0000
Subject: [PATCH 07/11] docs: add troubleshooting for rate limits after upgrade

Added comprehensive troubleshooting section explaining:

- Rate limiting history: hardcoded 100 req/min in older versions vs
  configurable (no limit by default) in newer versions
- Chart version to image tag mapping table (0.2.8 = sha-1a920e8, etc.)
- Step-by-step diagnostic commands to check chart version, running image,
  and error message format
- How to identify old vs new rate limiting by error message format
- Upgrade instructions to get the new configurable rate limiting

This helps administrators who upgraded but are still seeing 429 errors
understand that they may be running an older image with hardcoded limits.

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 71 ++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index 5f3e293f4..fc027f48f 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -211,6 +211,77 @@ When a rate limit is exceeded, you'll see messages like:
 Rate limit exceeded for default at /start
 ```
 
+### Still Seeing Rate Limits After Upgrading?
+
+If you upgraded your deployment but are still experiencing 429 errors, the most likely
+cause is that you're running an older version of the Runtime API that has **hardcoded
+rate limits**.
+
+#### Background: Rate Limiting History
+
+Prior to Helm chart version **0.2.6**, the Runtime API had a hardcoded limit of
+**100 requests per minute** on all endpoints. This was not configurable: every
+deployment was subject to this limit regardless of settings.
+
+Starting with chart version **0.2.6** (image `sha-7857be8`), rate limiting was changed to:
+- **No rate limit by default** — the internal API key is created without a limit
+- **Configurable per-key** — administrators can optionally set limits via the admin API
+
+| Chart Version | Image Tag | Rate Limiting Behavior |
+|---------------|-----------|------------------------|
+| 0.2.8 (latest) | `sha-1a920e8` | No limit by default, configurable |
+| 0.2.6 - 0.2.7 | `sha-7857be8` | No limit by default, configurable |
+| 0.2.1 - 0.2.5 | `sha-20ec8b3` | **Hardcoded 100 req/min** |
+| Earlier | Various | **Hardcoded 100 req/min** |
+
+#### Step 1: Check Your Chart Version
+
+```bash
+helm list -n openhands | grep runtime-api
+```
+
+If you're on a version older than 0.2.6, you need to upgrade to remove the hardcoded limits.
+
+#### Step 2: Check the Running Image
+
+Verify what image is actually running in your cluster:
+
+```bash
+kubectl get deployment -n openhands -l app.kubernetes.io/name=runtime-api \
+  -o jsonpath='{.items[*].spec.template.spec.containers[*].image}'
+```
+
+You should see `ghcr.io/openhands/runtime-api:sha-1a920e8` (or `sha-7857be8` or newer).
+
+If you see an older image tag (like `sha-20ec8b3` or earlier), you're running the old
+code with hardcoded limits.
+
+#### Step 3: Check the Error Message Format
+
+The error message format tells you which version of rate limiting is active:
+
+- **Old (hardcoded)**: `Rate limit exceeded` (generic message from the slowapi library)
+- **New (configurable)**: `Rate limit exceeded: 500 per 1 minute` (includes the specific limit)
+
+If you see the old format, the new code isn't running yet.
+
+#### Step 4: Upgrade the Chart
+
+To get configurable rate limiting, upgrade to chart version 0.2.8 or later:
+
+```bash
+helm repo update
+helm upgrade runtime-api -n openhands \
+  oci://ghcr.io/all-hands-ai/helm-charts/runtime-api \
+  -f your-values.yaml
+```
+
+After upgrading, verify the new pods are running:
+
+```bash
+kubectl rollout status deployment -n openhands -l app.kubernetes.io/name=runtime-api
+```
+
 ### Common Issues
 
 **429 errors after setting a limit**: Your limit may be too low. Check the logs to see

From b77a488dc00fa98ccb6095032fccf6794bd41217 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 20:30:43 +0000
Subject: [PATCH 08/11] docs: replace step-by-step instructions with complete
 bash script

Replaced the manual step-by-step instructions with a single, well-commented
bash script (set-rate-limit.sh) that administrators can copy and run.

The script:
- Takes runtime API URL and rate limit as arguments
- Retrieves admin password from Kubernetes secret automatically
- Handles the PBKDF2 challenge-response authentication
- Finds the 'default' API key and shows its current rate limit
- Updates the rate limit and confirms the change
- Includes clear error messages at each step
- Uses extensive comments to explain what each section does

This makes it much easier for administrators to configure rate limits
without having to understand and execute each step manually.

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 312 +++++++++++++++++--------
 1 file changed, 221 insertions(+), 91 deletions(-)

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index fc027f48f..eb19c83a4 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -51,133 +51,263 @@ If `max_requests_per_minute` is not set (the default), no rate limiting is appli
 
 ## Configuring a Rate Limit
 
-To add a rate limit to your existing deployment, you'll need to:
-
-1. Get the admin password from your Kubernetes secrets
-2. Authenticate to the Runtime API admin interface
-3. Find your API key's ID
-4. Update the key with a rate limit
+We provide a script that handles all the steps: retrieving credentials from Kubernetes,
+authenticating to the Runtime API, and updating the rate limit.
 
 ### Prerequisites
 
-You'll need:
-- `kubectl` access to your OpenHands namespace
-- The Runtime API hostname (e.g., `runtimes.openhands.example.com`)
-- `curl` and `python3` installed locally
+Before running the script, ensure you have:
+
+- **kubectl** configured with access to your OpenHands namespace
+- **curl** installed
+- **python3** installed (used for JSON parsing and PBKDF2 hash computation)
+- The **Runtime API hostname** for your deployment (e.g., `runtimes.openhands.example.com`)
 
-### Step 1: Get the Admin Password
+### The Script
 
-The admin password is stored in a Kubernetes secret created during installation:
+Save this script as `set-rate-limit.sh` and make it executable with `chmod +x set-rate-limit.sh`:
 
 ```bash
-# Get the admin password
+#!/bin/bash
+#
+# set-rate-limit.sh
+#
+# Configure the rate limit for the internal API key used between
+# the OpenHands Server and the Runtime API.
+#
+# Usage:
+#   ./set-rate-limit.sh <runtime-api-url> <rate-limit>
+#
+# Examples:
+#   ./set-rate-limit.sh https://runtimes.example.com 500
+#   ./set-rate-limit.sh https://runtimes.example.com null    # Remove limit
+#
+# Prerequisites:
+#   - kubectl configured with access to the openhands namespace
+#   - curl and python3 installed
+#
+
+set -e
+
+# ==============================================================================
+# Parse command line arguments
+# ==============================================================================
+
+if [ $# -lt 2 ]; then
+    echo "Usage: $0 <runtime-api-url> <rate-limit>"
+    echo ""
+    echo "Arguments:"
+    echo "  runtime-api-url  The URL of your Runtime API (e.g., https://runtimes.example.com)"
+    echo "  rate-limit       Requests per minute (integer), or 'null' to remove the limit"
+    echo ""
+    echo "Examples:"
+    echo "  $0 https://runtimes.example.com 500"
+    echo "  $0 https://runtimes.example.com null"
+    exit 1
+fi
+
+RUNTIME_API_URL="$1"
+RATE_LIMIT="$2"
+
+# Validate rate limit is either a number or "null"
+if [ "$RATE_LIMIT" != "null" ] && ! [[ "$RATE_LIMIT" =~ ^[0-9]+$ ]]; then
+    echo "Error: rate-limit must be a non-negative integer or 'null'"
+    exit 1
+fi
+
+echo "Runtime API URL: $RUNTIME_API_URL"
+echo "Rate limit to set: $RATE_LIMIT"
+echo ""
+
+# ==============================================================================
+# Step 1: Retrieve the admin password from Kubernetes secrets
+# ==============================================================================
+
+echo "Step 1: Retrieving admin password from Kubernetes secret..."
+
+# The admin password was created during installation and stored in the
+# 'admin-password' secret in the openhands namespace
 ADMIN_PASSWORD=$(kubectl get secret admin-password -n openhands \
-  -o jsonpath='{.data.admin-password}' | base64 -d)
+    -o jsonpath='{.data.admin-password}' | base64 -d)
 
-# Verify you got it (should print a 32-character string)
-echo $ADMIN_PASSWORD
-```
+if [ -z "$ADMIN_PASSWORD" ]; then
+    echo "Error: Could not retrieve admin password from Kubernetes secret."
+    echo "Make sure the 'admin-password' secret exists in the 'openhands' namespace."
+    exit 1
+fi
 
-### Step 2: Set Your Runtime API URL
+echo "  ✓ Admin password retrieved"
 
-```bash
-# Replace with your actual Runtime API hostname
-RUNTIME_API_URL="https://runtimes.openhands.example.com"
-```
-
-### Step 3: Authenticate and Get a JWT Token
+# ==============================================================================
+# Step 2: Get a challenge from the Runtime API
+# ==============================================================================
 
-The Runtime API uses a challenge-response authentication flow. This script handles
-the PBKDF2 hash computation and login:
+echo "Step 2: Requesting authentication challenge..."
 
-```bash
-# Get a challenge from the server
+# The Runtime API uses a challenge-response protocol for admin authentication.
+# First, we request a challenge which includes a random UUID and a salt for
+# PBKDF2 hashing.
 CHALLENGE_RESPONSE=$(curl -s "$RUNTIME_API_URL/api/admin/challenge")
-CHALLENGE=$(echo $CHALLENGE_RESPONSE | python3 -c "import sys,json; print(json.load(sys.stdin)['challenge'])")
-SALT=$(echo $CHALLENGE_RESPONSE | python3 -c "import sys,json; print(json.load(sys.stdin)['salt'])")
 
-# Compute the PBKDF2 hash
+if [ -z "$CHALLENGE_RESPONSE" ] || echo "$CHALLENGE_RESPONSE" | grep -q "error"; then
+    echo "Error: Failed to get challenge from Runtime API"
+    echo "Response: $CHALLENGE_RESPONSE"
+    exit 1
+fi
+
+# Parse the challenge and salt from the JSON response
+CHALLENGE=$(echo "$CHALLENGE_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['challenge'])")
+SALT=$(echo "$CHALLENGE_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['salt'])")
+
+echo "  ✓ Challenge received"
+
+# ==============================================================================
+# Step 3: Compute the PBKDF2 hash and authenticate
+# ==============================================================================
+
+echo "Step 3: Authenticating with Runtime API..."
+
+# Compute the PBKDF2-SHA256 hash of the admin password.
+# The salt is: salt + challenge (concatenated as strings, then UTF-8 encoded)
+# Parameters: 10000 iterations, 32-byte output
 HASH=$(python3 -c "
 import hashlib, binascii
-password = '$ADMIN_PASSWORD'
-salt = '$SALT'
-challenge = '$CHALLENGE'
+password = '''$ADMIN_PASSWORD'''
+salt = '''$SALT'''
+challenge = '''$CHALLENGE'''
 combined_salt = (salt + challenge).encode('utf-8')
 dk = hashlib.pbkdf2_hmac('sha256', password.encode(), combined_salt, 10000, dklen=32)
 print(binascii.hexlify(dk).decode())
 ")
 
-# Login and get the JWT token
-TOKEN=$(curl -s -X POST "$RUNTIME_API_URL/api/admin/login" \
-  -H "Content-Type: application/json" \
-  -d "{\"challenge\": \"$CHALLENGE\", \"hash\": \"$HASH\"}" \
-  | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])")
-
-# Verify you got a token (should print a long JWT string)
-echo $TOKEN
-```
-
-### Step 4: Find Your API Key ID
+# Submit the challenge and hash to get a JWT token
+LOGIN_RESPONSE=$(curl -s -X POST "$RUNTIME_API_URL/api/admin/login" \
+    -H "Content-Type: application/json" \
+    -d "{\"challenge\": \"$CHALLENGE\", \"hash\": \"$HASH\"}")
+
+if echo "$LOGIN_RESPONSE" | grep -q "error\|detail"; then
+    echo "Error: Authentication failed"
+    echo "Response: $LOGIN_RESPONSE"
+    exit 1
+fi
+
+# Extract the JWT token from the response
+TOKEN=$(echo "$LOGIN_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])")
+
+echo "  ✓ Authentication successful"
+
+# ==============================================================================
+# Step 4: Find the default API key
+# ==============================================================================
+
+echo "Step 4: Finding the default API key..."
+
+# List all API keys and find the one named "default"
+# This is the key used for communication between OpenHands Server and Runtime API
+KEYS_RESPONSE=$(curl -s "$RUNTIME_API_URL/api/admin/api-keys" \
+    -H "Authorization: Bearer $TOKEN")
+
+# Find the ID of the key named "default"
+KEY_ID=$(echo "$KEYS_RESPONSE" | python3 -c "
+import sys, json
+keys = json.load(sys.stdin)
+for key in keys:
+    if key.get('name') == 'default':
+        print(key['id'])
+        break
+")
 
-List all API keys to find the ID of your "default" key:
+if [ -z "$KEY_ID" ]; then
+    echo "Error: Could not find API key named 'default'"
+    echo "Available keys:"
+    echo "$KEYS_RESPONSE" | python3 -m json.tool
+    exit 1
+fi
+
+# Show current rate limit
+CURRENT_LIMIT=$(echo "$KEYS_RESPONSE" | python3 -c "
+import sys, json
+keys = json.load(sys.stdin)
+for key in keys:
+    if key.get('name') == 'default':
+        limit = key.get('max_requests_per_minute')
+        print('unlimited' if limit is None else limit)
+        break
+")
 
-```bash
-curl -s "$RUNTIME_API_URL/api/admin/api-keys" \
-  -H "Authorization: Bearer $TOKEN" | python3 -m json.tool
-```
+echo "  ✓ Found default key (ID: $KEY_ID)"
+echo "  Current rate limit: $CURRENT_LIMIT"
+
+# ==============================================================================
+# Step 5: Update the rate limit
+# ==============================================================================
+
+echo "Step 5: Updating rate limit to $RATE_LIMIT..."
+
+# Update the API key with the new rate limit
+UPDATE_RESPONSE=$(curl -s -X PUT "$RUNTIME_API_URL/api/admin/api-keys/$KEY_ID" \
+    -H "Content-Type: application/json" \
+    -H "Authorization: Bearer $TOKEN" \
+    -d "{\"max_requests_per_minute\": $RATE_LIMIT}")
+
+if echo "$UPDATE_RESPONSE" | grep -q "error\|detail"; then
+    echo "Error: Failed to update rate limit"
+    echo "Response: $UPDATE_RESPONSE"
+    exit 1
+fi
+
+# Show the updated configuration
+NEW_LIMIT=$(echo "$UPDATE_RESPONSE" | python3 -c "
+import sys, json
+key = json.load(sys.stdin)
+limit = key.get('max_requests_per_minute')
+print('unlimited' if limit is None else limit)
+")
 
-You should see output like:
-
-```json
-[
-  {
-    "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
-    "name": "default",
-    "key_value": "your-secret-key-value",
-    "max_runtimes": null,
-    "max_requests_per_minute": null,
-    "remaining_credits": null
-  }
-]
+echo "  ✓ Rate limit updated successfully"
+echo ""
+echo "================================================"
+echo "Done! The default API key rate limit is now: $NEW_LIMIT"
+echo "================================================"
 ```
 
-Copy the `id` value for the "default" key.
+### Usage Examples
 
-### Step 5: Update the Rate Limit
-
-Set a rate limit on the key (replace `YOUR_KEY_ID` with the actual ID):
+**Set a rate limit of 500 requests per minute:**
 
 ```bash
-KEY_ID="a1b2c3d4-e5f6-7890-abcd-ef1234567890"  # Replace with your key ID
-
-curl -X PUT "$RUNTIME_API_URL/api/admin/api-keys/$KEY_ID" \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer $TOKEN" \
-  -d '{"max_requests_per_minute": 500}'
+./set-rate-limit.sh https://runtimes.openhands.example.com 500
 ```
 
-A successful response returns the updated key configuration:
-
-```json
-{
-  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
-  "name": "default",
-  "key_value": "your-secret-key-value",
-  "max_runtimes": null,
-  "max_requests_per_minute": 500,
-  "remaining_credits": null
-}
+**Remove the rate limit (allow unlimited requests):**
+
+```bash
+./set-rate-limit.sh https://runtimes.openhands.example.com null
 ```
 
-### Removing a Rate Limit
+### Expected Output
 
-To remove the rate limit and allow unlimited requests:
+When successful, you'll see output like:
 
-```bash
-curl -X PUT "$RUNTIME_API_URL/api/admin/api-keys/$KEY_ID" \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer $TOKEN" \
-  -d '{"max_requests_per_minute": null}'
+```
+Runtime API URL: https://runtimes.openhands.example.com
+Rate limit to set: 500
+
+Step 1: Retrieving admin password from Kubernetes secret...
+  ✓ Admin password retrieved
+Step 2: Requesting authentication challenge...
+  ✓ Challenge received
+Step 3: Authenticating with Runtime API...
+  ✓ Authentication successful
+Step 4: Finding the default API key...
+  ✓ Found default key (ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890)
+  Current rate limit: unlimited
+Step 5: Updating rate limit to 500...
+  ✓ Rate limit updated successfully
+
+================================================
+Done! The default API key rate limit is now: 500
+================================================
 ```
 
 ## Choosing a Rate Limit Value

From 87bc52cf819c03f1f39f4c101c42a8e27a5556e6 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 20:35:51 +0000
Subject: [PATCH 09/11] docs: add port-forward option for accessing Runtime API

Added 'Accessing the Runtime API' section explaining two options:

Option A: External URL - for deployments with runtime-api ingress enabled
Option B: Port-forward - for deployments without external ingress, using
  kubectl port-forward to svc/oh-main-runtime-api

This ensures the script works regardless of whether the Runtime API is
exposed externally or only accessible within the cluster.

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 32 +++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index eb19c83a4..a1491ce03 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -61,7 +61,37 @@ Before running the script, ensure you have:
 - **kubectl** configured with access to your OpenHands namespace
 - **curl** installed
 - **python3** installed (used for JSON parsing and PBKDF2 hash computation)
-- The **Runtime API hostname** for your deployment (e.g., `runtimes.openhands.example.com`)
+- The **Runtime API URL** — either:
+  - Your external Runtime API hostname (e.g., `https://runtimes.openhands.example.com`), or
+  - A port-forward to the internal service (see below)
+
+### Accessing the Runtime API
+
+**Option A: External URL (if ingress is enabled)**
+
+If you configured `runtime-api.ingress.enabled: true` in your Helm values, use your
+external hostname:
+
+```bash
+RUNTIME_API_URL="https://runtimes.openhands.example.com"
+```
+
+**Option B: Port-forward (if ingress is not enabled or for internal access)**
+
+If the Runtime API is not exposed externally, use kubectl port-forward:
+
+```bash
+# In a separate terminal, start the port-forward
+kubectl port-forward -n openhands svc/oh-main-runtime-api 5000:5000
+
+# Then use localhost as the URL
+RUNTIME_API_URL="http://localhost:5000"
+```
+
+<Info>
+  The port-forward must remain running while you execute the script. You can run it
+  in the background with `kubectl port-forward ... &` if preferred.
+</Info>
 
 ### The Script
 

From e59813acfbd75703ded449a3dcd6bf1f842d2560 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 20:39:09 +0000
Subject: [PATCH 10/11] docs: simplify script to use kubectl exec inside the
 pod

Rewrote the script to run commands inside the runtime-api pod via kubectl exec:

- No longer requires external Runtime API URL
- No need for curl or python3 installed locally
- Only prerequisite is kubectl access to the cluster
- Script finds the runtime-api pod automatically
- Runs Python inside the pod (which already has Python installed)
- Uses localhost:5000 to connect to the API from within the pod

This is much simpler for administrators since it works regardless of
whether the Runtime API has external ingress configured.

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 319 +++++++++++--------------
 1 file changed, 142 insertions(+), 177 deletions(-)

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index a1491ce03..b1825a1e7 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -59,39 +59,9 @@ authenticating to the Runtime API, and updating the rate limit.
 Before running the script, ensure you have:
 
 - **kubectl** configured with access to your OpenHands namespace
-- **curl** installed
-- **python3** installed (used for JSON parsing and PBKDF2 hash computation)
-- The **Runtime API URL** — either:
-  - Your external Runtime API hostname (e.g., `https://runtimes.openhands.example.com`), or
-  - A port-forward to the internal service (see below)
 
-### Accessing the Runtime API
-
-**Option A: External URL (if ingress is enabled)**
-
-If you configured `runtime-api.ingress.enabled: true` in your Helm values, use your
-external hostname:
-
-```bash
-RUNTIME_API_URL="https://runtimes.openhands.example.com"
-```
-
-**Option B: Port-forward (if ingress is not enabled or for internal access)**
-
-If the Runtime API is not exposed externally, use kubectl port-forward:
-
-```bash
-# In a separate terminal, start the port-forward
-kubectl port-forward -n openhands svc/oh-main-runtime-api 5000:5000
-
-# Then use localhost as the URL
-RUNTIME_API_URL="http://localhost:5000"
-```
-
-<Info>
-  The port-forward must remain running while you execute the script. You can run it
-  in the background with `kubectl port-forward ... &` if preferred.
-</Info>
+That's it. The script makes every Runtime API call via `kubectl exec` inside the
+runtime-api pod, so you don't need curl or python3 installed locally.
 
 ### The Script
 
@@ -105,39 +75,46 @@ Save this script as `set-rate-limit.sh` and make it executable with `chmod +x se
 # Configure the rate limit for the internal API key used between
 # the OpenHands Server and the Runtime API.
 #
+# This script runs commands inside the runtime-api pod using kubectl exec,
+# so it works regardless of whether the Runtime API is exposed externally.
+#
 # Usage:
-#   ./set-rate-limit.sh <runtime-api-url> <rate-limit>
+#   ./set-rate-limit.sh <rate-limit>
 #
 # Examples:
-#   ./set-rate-limit.sh https://runtimes.example.com 500
-#   ./set-rate-limit.sh https://runtimes.example.com null    # Remove limit
+#   ./set-rate-limit.sh 500       # Set limit to 500 requests per minute
+#   ./set-rate-limit.sh null      # Remove limit (allow unlimited)
 #
 # Prerequisites:
 #   - kubectl configured with access to the openhands namespace
-#   - curl and python3 installed
 #
 
 set -e
 
+# ==============================================================================
+# Configuration
+# ==============================================================================
+
+NAMESPACE="openhands"
+RUNTIME_API_URL="http://localhost:5000"  # Internal URL within the pod
+
 # ==============================================================================
 # Parse command line arguments
 # ==============================================================================
 
-if [ $# -lt 2 ]; then
-    echo "Usage: $0 <runtime-api-url> <rate-limit>"
+if [ $# -lt 1 ]; then
+    echo "Usage: $0 <rate-limit>"
     echo ""
     echo "Arguments:"
-    echo "  runtime-api-url  The URL of your Runtime API (e.g., https://runtimes.example.com)"
-    echo "  rate-limit       Requests per minute (integer), or 'null' to remove the limit"
+    echo "  rate-limit  Requests per minute (integer), or 'null' to remove the limit"
     echo ""
     echo "Examples:"
-    echo "  $0 https://runtimes.example.com 500"
-    echo "  $0 https://runtimes.example.com null"
+    echo "  $0 500      # Set limit to 500 requests per minute"
+    echo "  $0 null     # Remove limit (allow unlimited requests)"
     exit 1
 fi
 
-RUNTIME_API_URL="$1"
-RATE_LIMIT="$2"
+RATE_LIMIT="$1"
 
 # Validate rate limit is either a number or "null"
 if [ "$RATE_LIMIT" != "null" ] && ! [[ "$RATE_LIMIT" =~ ^[0-9]+$ ]]; then
@@ -145,160 +122,147 @@ if [ "$RATE_LIMIT" != "null" ] && ! [[ "$RATE_LIMIT" =~ ^[0-9]+$ ]]; then
     exit 1
 fi
 
-echo "Runtime API URL: $RUNTIME_API_URL"
 echo "Rate limit to set: $RATE_LIMIT"
 echo ""
 
 # ==============================================================================
-# Step 1: Retrieve the admin password from Kubernetes secrets
+# Step 1: Find the runtime-api pod
 # ==============================================================================
 
-echo "Step 1: Retrieving admin password from Kubernetes secret..."
+echo "Step 1: Finding runtime-api pod..."
 
-# The admin password was created during installation and stored in the
-# 'admin-password' secret in the openhands namespace
-ADMIN_PASSWORD=$(kubectl get secret admin-password -n openhands \
-    -o jsonpath='{.data.admin-password}' | base64 -d)
+# Get the name of a running runtime-api pod
+POD=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/name=runtime-api \
+    -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
 
-if [ -z "$ADMIN_PASSWORD" ]; then
-    echo "Error: Could not retrieve admin password from Kubernetes secret."
-    echo "Make sure the 'admin-password' secret exists in the 'openhands' namespace."
+if [ -z "$POD" ]; then
+    echo "Error: Could not find a runtime-api pod in namespace '$NAMESPACE'"
+    echo "Make sure the runtime-api deployment is running."
     exit 1
 fi
 
-echo "  ✓ Admin password retrieved"
+echo "  ✓ Found pod: $POD"
 
 # ==============================================================================
-# Step 2: Get a challenge from the Runtime API
+# Step 2: Retrieve the admin password from Kubernetes secrets
 # ==============================================================================
 
-echo "Step 2: Requesting authentication challenge..."
+echo "Step 2: Retrieving admin password from Kubernetes secret..."
 
-# The Runtime API uses a challenge-response protocol for admin authentication.
-# First, we request a challenge which includes a random UUID and a salt for
-# PBKDF2 hashing.
-CHALLENGE_RESPONSE=$(curl -s "$RUNTIME_API_URL/api/admin/challenge")
+# The admin password was created during installation and stored in the
+# 'admin-password' secret in the openhands namespace
+ADMIN_PASSWORD=$(kubectl get secret admin-password -n "$NAMESPACE" \
+    -o jsonpath='{.data.admin-password}' | base64 -d)
 
-if [ -z "$CHALLENGE_RESPONSE" ] || echo "$CHALLENGE_RESPONSE" | grep -q "error"; then
-    echo "Error: Failed to get challenge from Runtime API"
-    echo "Response: $CHALLENGE_RESPONSE"
+if [ -z "$ADMIN_PASSWORD" ]; then
+    echo "Error: Could not retrieve admin password from Kubernetes secret."
+    echo "Make sure the 'admin-password' secret exists in the '$NAMESPACE' namespace."
     exit 1
 fi
 
-# Parse the challenge and salt from the JSON response
-CHALLENGE=$(echo "$CHALLENGE_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['challenge'])")
-SALT=$(echo "$CHALLENGE_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['salt'])")
-
-echo "  ✓ Challenge received"
+echo "  ✓ Admin password retrieved"
 
 # ==============================================================================
-# Step 3: Compute the PBKDF2 hash and authenticate
+# Step 3: Run the rate limit update inside the pod
 # ==============================================================================
 
-echo "Step 3: Authenticating with Runtime API..."
-
-# Compute the PBKDF2-SHA256 hash of the admin password.
+echo "Step 3: Connecting to runtime-api pod and updating rate limit..."
+
+# We'll execute a Python script inside the pod that:
+# 1. Gets a challenge from the local API
+# 2. Computes the PBKDF2 hash
+# 3. Authenticates and gets a JWT token
+# 4. Finds the default API key
+# 5. Updates its rate limit
+
+kubectl exec -n "$NAMESPACE" "$POD" -- python3 -c "
+import json
+import hashlib
+import binascii
+import urllib.request
+import urllib.error
+
+RUNTIME_API_URL = '$RUNTIME_API_URL'
+ADMIN_PASSWORD = '''$ADMIN_PASSWORD'''
+RATE_LIMIT = json.loads('$RATE_LIMIT')  # int, or None when 'null' is passed
+
+def api_request(path, method='GET', data=None, token=None):
+    \"\"\"Make an HTTP request to the Runtime API.\"\"\"
+    url = f'{RUNTIME_API_URL}{path}'
+    headers = {'Content-Type': 'application/json'}
+    if token:
+        headers['Authorization'] = f'Bearer {token}'
+    
+    req = urllib.request.Request(url, method=method, headers=headers)
+    if data:
+        req.data = json.dumps(data).encode('utf-8')
+    
+    try:
+        with urllib.request.urlopen(req) as response:
+            return json.loads(response.read().decode('utf-8'))
+    except urllib.error.HTTPError as e:
+        error_body = e.read().decode('utf-8')
+        raise Exception(f'HTTP {e.code}: {error_body}')
+
+# Step 3a: Get authentication challenge
+print('  Getting authentication challenge...')
+challenge_resp = api_request('/api/admin/challenge')
+challenge = challenge_resp['challenge']
+salt = challenge_resp['salt']
+
+# Step 3b: Compute PBKDF2 hash
 # The salt is: salt + challenge (concatenated as strings, then UTF-8 encoded)
 # Parameters: 10000 iterations, 32-byte output
-HASH=$(python3 -c "
-import hashlib, binascii
-password = '''$ADMIN_PASSWORD'''
-salt = '''$SALT'''
-challenge = '''$CHALLENGE'''
 combined_salt = (salt + challenge).encode('utf-8')
-dk = hashlib.pbkdf2_hmac('sha256', password.encode(), combined_salt, 10000, dklen=32)
-print(binascii.hexlify(dk).decode())
-")
-
-# Submit the challenge and hash to get a JWT token
-LOGIN_RESPONSE=$(curl -s -X POST "$RUNTIME_API_URL/api/admin/login" \
-    -H "Content-Type: application/json" \
-    -d "{\"challenge\": \"$CHALLENGE\", \"hash\": \"$HASH\"}")
-
-if echo "$LOGIN_RESPONSE" | grep -q "error\|detail"; then
-    echo "Error: Authentication failed"
-    echo "Response: $LOGIN_RESPONSE"
-    exit 1
-fi
-
-# Extract the JWT token from the response
-TOKEN=$(echo "$LOGIN_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['token'])")
-
-echo "  ✓ Authentication successful"
-
-# ==============================================================================
-# Step 4: Find the default API key
-# ==============================================================================
-
-echo "Step 4: Finding the default API key..."
-
-# List all API keys and find the one named "default"
-# This is the key used for communication between OpenHands Server and Runtime API
-KEYS_RESPONSE=$(curl -s "$RUNTIME_API_URL/api/admin/api-keys" \
-    -H "Authorization: Bearer $TOKEN")
-
-# Find the ID of the key named "default"
-KEY_ID=$(echo "$KEYS_RESPONSE" | python3 -c "
-import sys, json
-keys = json.load(sys.stdin)
-for key in keys:
-    if key.get('name') == 'default':
-        print(key['id'])
-        break
-")
-
-if [ -z "$KEY_ID" ]; then
-    echo "Error: Could not find API key named 'default'"
-    echo "Available keys:"
-    echo "$KEYS_RESPONSE" | python3 -m json.tool
-    exit 1
-fi
-
-# Show current rate limit
-CURRENT_LIMIT=$(echo "$KEYS_RESPONSE" | python3 -c "
-import sys, json
-keys = json.load(sys.stdin)
+dk = hashlib.pbkdf2_hmac('sha256', ADMIN_PASSWORD.encode(), combined_salt, 10000, dklen=32)
+hash_hex = binascii.hexlify(dk).decode()
+
+# Step 3c: Authenticate and get JWT token
+print('  Authenticating...')
+login_resp = api_request('/api/admin/login', method='POST', data={
+    'challenge': challenge,
+    'hash': hash_hex
+})
+token = login_resp['token']
+print('  ✓ Authentication successful')
+
+# Step 3d: Get all API keys and find the 'default' key
+print('  Finding default API key...')
+keys = api_request('/api/admin/api-keys', token=token)
+
+default_key = None
 for key in keys:
     if key.get('name') == 'default':
-        limit = key.get('max_requests_per_minute')
-        print('unlimited' if limit is None else limit)
+        default_key = key
         break
-")
-
-echo "  ✓ Found default key (ID: $KEY_ID)"
-echo "  Current rate limit: $CURRENT_LIMIT"
-
-# ==============================================================================
-# Step 5: Update the rate limit
-# ==============================================================================
 
-echo "Step 5: Updating rate limit to $RATE_LIMIT..."
-
-# Update the API key with the new rate limit
-UPDATE_RESPONSE=$(curl -s -X PUT "$RUNTIME_API_URL/api/admin/api-keys/$KEY_ID" \
-    -H "Content-Type: application/json" \
-    -H "Authorization: Bearer $TOKEN" \
-    -d "{\"max_requests_per_minute\": $RATE_LIMIT}")
-
-if echo "$UPDATE_RESPONSE" | grep -q "error\|detail"; then
-    echo "Error: Failed to update rate limit"
-    echo "Response: $UPDATE_RESPONSE"
-    exit 1
-fi
-
-# Show the updated configuration
-NEW_LIMIT=$(echo "$UPDATE_RESPONSE" | python3 -c "
-import sys, json
-key = json.load(sys.stdin)
-limit = key.get('max_requests_per_minute')
-print('unlimited' if limit is None else limit)
-")
-
-echo "  ✓ Rate limit updated successfully"
-echo ""
-echo "================================================"
-echo "Done! The default API key rate limit is now: $NEW_LIMIT"
-echo "================================================"
+if not default_key:
+    print('  Error: Could not find API key named \"default\"')
+    print(f'  Available keys: {[k.get(\"name\") for k in keys]}')
+    exit(1)
+
+key_id = default_key['id']
+current_limit = default_key.get('max_requests_per_minute')
+current_display = 'unlimited' if current_limit is None else current_limit
+print(f'  ✓ Found default key (ID: {key_id})')
+print(f'  Current rate limit: {current_display}')
+
+# Step 3e: Update the rate limit
+new_display = 'unlimited' if RATE_LIMIT is None else RATE_LIMIT
+print(f'  Updating rate limit to {new_display}...')
+
+updated_key = api_request(f'/api/admin/api-keys/{key_id}', method='PUT', token=token, data={
+    'max_requests_per_minute': RATE_LIMIT
+})
+
+final_limit = updated_key.get('max_requests_per_minute')
+final_display = 'unlimited' if final_limit is None else final_limit
+print(f'  ✓ Rate limit updated successfully')
+print()
+print('================================================')
+print(f'Done! The default API key rate limit is now: {final_display}')
+print('================================================')
+"
 ```
 
 ### Usage Examples
@@ -306,13 +270,13 @@ echo "================================================"
 **Set a rate limit of 500 requests per minute:**
 
 ```bash
-./set-rate-limit.sh https://runtimes.openhands.example.com 500
+./set-rate-limit.sh 500
 ```
 
 **Remove the rate limit (allow unlimited requests):**
 
 ```bash
-./set-rate-limit.sh https://runtimes.openhands.example.com null
+./set-rate-limit.sh null
 ```
 
 ### Expected Output
@@ -320,19 +284,20 @@ echo "================================================"
 When successful, you'll see output like:
 
 ```
-Runtime API URL: https://runtimes.openhands.example.com
 Rate limit to set: 500
 
-Step 1: Retrieving admin password from Kubernetes secret...
+Step 1: Finding runtime-api pod...
+  ✓ Found pod: openhands-runtime-api-5d4f6b7c8d-x2k9m
+Step 2: Retrieving admin password from Kubernetes secret...
   ✓ Admin password retrieved
-Step 2: Requesting authentication challenge...
-  ✓ Challenge received
-Step 3: Authenticating with Runtime API...
+Step 3: Connecting to runtime-api pod and updating rate limit...
+  Getting authentication challenge...
+  Authenticating...
   ✓ Authentication successful
-Step 4: Finding the default API key...
-  ✓ Found default key (ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890)
+  Finding default API key...
+  ✓ Found default key (ID: 1)
   Current rate limit: unlimited
-Step 5: Updating rate limit to 500...
+  Updating rate limit to 500...
   ✓ Rate limit updated successfully
 
 ================================================

From 5aec09342912cc3e92b6f66c99783176dc6db4d9 Mon Sep 17 00:00:00 2001
From: openhands <openhands@all-hands.dev>
Date: Fri, 10 Apr 2026 20:43:53 +0000
Subject: [PATCH 11/11] docs: add --check option to view current rate limit
 without changing it

Add a --check flag that lets administrators inspect the current rate
limit configuration without making any changes. This is useful for:

- Verifying the current state before making changes
- Troubleshooting rate limit issues
- Confirming changes after an update

Usage:
  ./set-rate-limit.sh --check    # View current limit
  ./set-rate-limit.sh 500        # Set limit to 500
  ./set-rate-limit.sh null       # Remove limit

Co-authored-by: openhands <openhands@all-hands.dev>
---
 enterprise/k8s-install/rate-limits.mdx | 134 ++++++++++++++++++-------
 1 file changed, 99 insertions(+), 35 deletions(-)
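
Note on argument handling: the script accepts three kinds of argument
(`--check`, an integer, or `null`) and reduces them to two values inside the
embedded Python: a check-only flag and the rate limit itself. The sketch below
illustrates that mapping under stated assumptions. The helper name
`interpret_rate_limit_arg` is hypothetical and does not exist in the script,
which does the same job with a shell `if` and a `'CHECK'` sentinel.

```python
from typing import Optional, Tuple


def interpret_rate_limit_arg(arg: str) -> Tuple[bool, Optional[int]]:
    """Return (check_only, rate_limit) for one command-line argument.

    Hypothetical helper for illustration only; it mirrors the shell
    validation (digits or the literal 'null') plus the --check flag.
    """
    if arg == "--check":
        # Check-only mode: no update is sent, so the limit value is unused.
        return True, None
    if arg == "null":
        # 'null' removes the limit; Python represents that as None.
        return False, None
    if arg.isascii() and arg.isdigit():
        # Same acceptance rule as the shell regex ^[0-9]+$.
        return False, int(arg)
    raise ValueError("rate-limit must be an integer or 'null'")
```

In this sketch `None` plays two roles, as it does in the script: in check-only
mode it is a placeholder that is never sent, and in set mode it serializes to
JSON `null`, which removes the limit.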

diff --git a/enterprise/k8s-install/rate-limits.mdx b/enterprise/k8s-install/rate-limits.mdx
index b1825a1e7..8a2c86a03 100644
--- a/enterprise/k8s-install/rate-limits.mdx
+++ b/enterprise/k8s-install/rate-limits.mdx
@@ -72,18 +72,20 @@ Save this script as `set-rate-limit.sh` and make it executable with `chmod +x se
 #
 # set-rate-limit.sh
 #
-# Configure the rate limit for the internal API key used between
+# Configure or check the rate limit for the internal API key used between
 # the OpenHands Server and the Runtime API.
 #
 # This script runs commands inside the runtime-api pod using kubectl exec,
 # so it works regardless of whether the Runtime API is exposed externally.
 #
 # Usage:
-#   ./set-rate-limit.sh <rate-limit>
+#   ./set-rate-limit.sh --check         # Check current rate limit
+#   ./set-rate-limit.sh <rate-limit>    # Set rate limit
 #
 # Examples:
-#   ./set-rate-limit.sh 500       # Set limit to 500 requests per minute
-#   ./set-rate-limit.sh null      # Remove limit (allow unlimited)
+#   ./set-rate-limit.sh --check    # Display current rate limit
+#   ./set-rate-limit.sh 500        # Set limit to 500 requests per minute
+#   ./set-rate-limit.sh null       # Remove limit (allow unlimited)
 #
 # Prerequisites:
 #   - kubectl configured with access to the openhands namespace
@@ -103,26 +105,36 @@ RUNTIME_API_URL="http://localhost:5000"  # Internal URL within the pod
 # ==============================================================================
 
 if [ $# -lt 1 ]; then
-    echo "Usage: $0 <rate-limit>"
+    echo "Usage: $0 [--check | <rate-limit>]"
+    echo ""
+    echo "Options:"
+    echo "  --check     Display the current rate limit without changing it"
     echo ""
     echo "Arguments:"
     echo "  rate-limit  Requests per minute (integer), or 'null' to remove the limit"
     echo ""
     echo "Examples:"
-    echo "  $0 500      # Set limit to 500 requests per minute"
-    echo "  $0 null     # Remove limit (allow unlimited requests)"
+    echo "  $0 --check     # Check current rate limit"
+    echo "  $0 500         # Set limit to 500 requests per minute"
+    echo "  $0 null        # Remove limit (allow unlimited requests)"
     exit 1
 fi
 
-RATE_LIMIT="$1"
-
-# Validate rate limit is either a number or "null"
-if [ "$RATE_LIMIT" != "null" ] && ! [[ "$RATE_LIMIT" =~ ^[0-9]+$ ]]; then
-    echo "Error: rate-limit must be a positive integer or 'null'"
-    exit 1
+CHECK_ONLY=false
+RATE_LIMIT=""
+
+if [ "$1" == "--check" ]; then
+    CHECK_ONLY=true
+    echo "Checking current rate limit..."
+else
+    RATE_LIMIT="$1"
+    # Validate rate limit is either a number or "null"
+    if [ "$RATE_LIMIT" != "null" ] && ! [[ "$RATE_LIMIT" =~ ^[0-9]+$ ]]; then
+        echo "Error: rate-limit must be a non-negative integer or 'null'"
+        exit 1
+    fi
+    echo "Rate limit to set: $RATE_LIMIT"
 fi
-
-echo "Rate limit to set: $RATE_LIMIT"
 echo ""
 
 # ==============================================================================
@@ -166,14 +178,27 @@ echo "  ✓ Admin password retrieved"
 # Step 3: Run the rate limit update inside the pod
 # ==============================================================================
 
-echo "Step 3: Connecting to runtime-api pod and updating rate limit..."
+# Determine the action description for output
+if [ "$CHECK_ONLY" = true ]; then
+    echo "Step 3: Connecting to runtime-api pod and checking rate limit..."
+else
+    echo "Step 3: Connecting to runtime-api pod and updating rate limit..."
+fi
 
 # We'll execute a Python script inside the pod that:
 # 1. Gets a challenge from the local API
 # 2. Computes the PBKDF2 hash
 # 3. Authenticates and gets a JWT token
 # 4. Finds the default API key
-# 5. Updates its rate limit
+# 5. Optionally updates its rate limit (if not --check mode)
+
+# Pass CHECK_ONLY and RATE_LIMIT to the Python script
+# For check-only mode, we pass "CHECK" as the rate limit
+if [ "$CHECK_ONLY" = true ]; then
+    RATE_LIMIT_ARG="'CHECK'"
+else
+    RATE_LIMIT_ARG="${RATE_LIMIT/#null/None}"  # Python has no 'null' literal; pass None instead
+fi
 
 kubectl exec -n "$NAMESPACE" "$POD" -- python3 -c "
 import json
@@ -184,7 +209,10 @@ import urllib.error
 
 RUNTIME_API_URL = '$RUNTIME_API_URL'
 ADMIN_PASSWORD = '''$ADMIN_PASSWORD'''
-RATE_LIMIT = $RATE_LIMIT  # This will be an int or None (from 'null')
+RATE_LIMIT_ARG = $RATE_LIMIT_ARG  # This will be an int, None (from 'null'), or 'CHECK'
+
+CHECK_ONLY = RATE_LIMIT_ARG == 'CHECK'
+RATE_LIMIT = None if CHECK_ONLY else RATE_LIMIT_ARG
 
 def api_request(path, method='GET', data=None, token=None):
     \"\"\"Make an HTTP request to the Runtime API.\"\"\"
@@ -245,28 +273,39 @@ key_id = default_key['id']
 current_limit = default_key.get('max_requests_per_minute')
 current_display = 'unlimited' if current_limit is None else current_limit
 print(f'  ✓ Found default key (ID: {key_id})')
-print(f'  Current rate limit: {current_display}')
-
-# Step 3e: Update the rate limit
-new_display = 'unlimited' if RATE_LIMIT is None else RATE_LIMIT
-print(f'  Updating rate limit to {new_display}...')
-
-updated_key = api_request(f'/api/admin/api-keys/{key_id}', method='PUT', token=token, data={
-    'max_requests_per_minute': RATE_LIMIT
-})
-
-final_limit = updated_key.get('max_requests_per_minute')
-final_display = 'unlimited' if final_limit is None else final_limit
-print(f'  ✓ Rate limit updated successfully')
 print()
 print('================================================')
-print(f'Done! The default API key rate limit is now: {final_display}')
+print(f'Current rate limit: {current_display}')
 print('================================================')
+
+# Step 3e: Update the rate limit (only if not in check-only mode)
+if not CHECK_ONLY:
+    new_display = 'unlimited' if RATE_LIMIT is None else RATE_LIMIT
+    print()
+    print(f'  Updating rate limit to {new_display}...')
+    
+    updated_key = api_request(f'/api/admin/api-keys/{key_id}', method='PUT', token=token, data={
+        'max_requests_per_minute': RATE_LIMIT
+    })
+    
+    final_limit = updated_key.get('max_requests_per_minute')
+    final_display = 'unlimited' if final_limit is None else final_limit
+    print('  ✓ Rate limit updated successfully')
+    print()
+    print('================================================')
+    print(f'New rate limit: {final_display}')
+    print('================================================')
 "
 ```
 
 ### Usage Examples
 
+**Check the current rate limit:**
+
+```bash
+./set-rate-limit.sh --check
+```
+
 **Set a rate limit of 500 requests per minute:**
 
 ```bash
@@ -281,7 +320,28 @@ print('================================================')
 
 ### Expected Output
 
-When successful, you'll see output like:
+**Checking the current rate limit:**
+
+```
+Checking current rate limit...
+
+Step 1: Finding runtime-api pod...
+  ✓ Found pod: openhands-runtime-api-5d4f6b7c8d-x2k9m
+Step 2: Retrieving admin password from Kubernetes secret...
+  ✓ Admin password retrieved
+Step 3: Connecting to runtime-api pod and checking rate limit...
+  Getting authentication challenge...
+  Authenticating...
+  ✓ Authentication successful
+  Finding default API key...
+  ✓ Found default key (ID: 1)
+
+================================================
+Current rate limit: unlimited
+================================================
+```
+
+**Setting a rate limit:**
 
 ```
 Rate limit to set: 500
@@ -296,12 +356,16 @@ Step 3: Connecting to runtime-api pod and updating rate limit...
   ✓ Authentication successful
   Finding default API key...
   ✓ Found default key (ID: 1)
-  Current rate limit: unlimited
+
+================================================
+Current rate limit: unlimited
+================================================
+
   Updating rate limit to 500...
   ✓ Rate limit updated successfully
 
 ================================================
-Done! The default API key rate limit is now: 500
+New rate limit: 500
 ================================================
 ```