From 6522c8e75a5bf1b5ab76cf5e4946406179e32216 Mon Sep 17 00:00:00 2001
From: "promptless[bot]"
Date: Thu, 23 Apr 2026 22:29:56 +0000
Subject: [PATCH 1/2] Remove active worker discount mentions from serverless docs

Updated documentation to remove all references to pricing differentials between active workers and flex workers, per product team request.
---
 release-notes.mdx                                | 2 +-
 serverless/development/optimization.mdx          | 2 +-
 serverless/endpoints/endpoint-configurations.mdx | 2 +-
 serverless/pricing.mdx                           | 1 -
 serverless/workers/overview.mdx                  | 2 +-
 5 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/release-notes.mdx b/release-notes.mdx
index d30d4296..c9a20757 100644
--- a/release-notes.mdx
+++ b/release-notes.mdx
@@ -261,7 +261,7 @@ Flash now supports deploying endpoints to [multiple datacenters](/flash/configur
 - **Self-service worker upgrade**: Rebuild and roll workers from the dashboard without support tickets.
 - **Edit template from endpoint page**: Inline edit and redeploy the underlying template directly from the endpoint view.
 - **Improved Serverless metrics page**: Refinements to charts and filters for quicker root-cause analysis.
-- [Flex and active workers](/serverless/pricing): Discounted always-on "active" capacity for baseline load with on-demand "flex" workers for bursts.
+- [Flex and active workers](/serverless/pricing): Always-on "active" workers for baseline load with on-demand "flex" workers for bursts.
 - **Billing explorer**: Inspect costs by resource, region, and time to identify optimization opportunities.
diff --git a/serverless/development/optimization.mdx b/serverless/development/optimization.mdx
index 2997eee9..5b9b39fe 100644
--- a/serverless/development/optimization.mdx
+++ b/serverless/development/optimization.mdx
@@ -52,7 +52,7 @@ For private models, [embed them in your Docker image](/serverless/workers/create
 ### Maintain active workers
-Set [active workers](/serverless/endpoints/endpoint-configurations#active-workers) > 0 to eliminate cold starts entirely. Active workers cost up to 30% less than flex workers.
+Set [active workers](/serverless/endpoints/endpoint-configurations#active-workers) > 0 to eliminate cold starts entirely.
 **Formula**: `Active workers = (Requests/min × Request duration in seconds) / 60`
diff --git a/serverless/endpoints/endpoint-configurations.mdx b/serverless/endpoints/endpoint-configurations.mdx
index 6489caa5..e99692dc 100644
--- a/serverless/endpoints/endpoint-configurations.mdx
+++ b/serverless/endpoints/endpoint-configurations.mdx
@@ -55,7 +55,7 @@ For endpoints with fewer than five workers, all workers use the highest-priority
 ### Active workers
-Minimum number of workers that remain warm and ready at all times. Setting this to 1+ eliminates cold starts. Active workers incur charges when idle but receive a 20-30% discount.
+Minimum number of workers that remain warm and ready at all times. Setting this to 1+ eliminates cold starts. Active workers incur charges continuously, including when idle.
 ### Max workers
diff --git a/serverless/pricing.mdx b/serverless/pricing.mdx
index ae1ac737..dec52301 100644
--- a/serverless/pricing.mdx
+++ b/serverless/pricing.mdx
@@ -20,7 +20,6 @@ Serverless offers pay-per-second pricing with no upfront costs. You're billed fr
 | | Flex workers | Active workers |
 |---|--------------|----------------|
 | **Behavior** | Scale to zero when idle | Always running (24/7) |
-| **Pricing** | Standard per-second rate | 20–30% discount |
 | **Best for** | Variable workloads, cost optimization | Consistent traffic, low-latency requirements |
 ## GPU pricing
diff --git a/serverless/workers/overview.mdx b/serverless/workers/overview.mdx
index 173994c1..a2d22639 100644
--- a/serverless/workers/overview.mdx
+++ b/serverless/workers/overview.mdx
@@ -39,7 +39,7 @@ To deploy workers with AI/ML models, follow this order of preference:
 Workers can run in two modes depending on your latency and cost requirements:
-- **Active workers** run continuously (24/7) and are always ready to process requests instantly. They eliminate cold starts entirely and receive a discounted rate, making them ideal for latency-sensitive or high-traffic applications.
+- **Active workers** run continuously (24/7) and are always ready to process requests instantly. They eliminate cold starts entirely, making them ideal for latency-sensitive or high-traffic applications.
 - **Flex workers** scale dynamically based on demand, spinning down to zero when idle. They incur cold starts when scaling up but cost nothing when not in use, making them ideal for variable or sporadic workloads.

From c5846eac17657cf04ad9fc90383d96b295bb4210 Mon Sep 17 00:00:00 2001
From: "promptless[bot]" <179508745+promptless[bot]@users.noreply.github.com>
Date: Sat, 2 May 2026 00:09:44 +0000
Subject: [PATCH 2/2] Update from greg.wester@runpod.io

---
 serverless/pricing.mdx | 1 +
 1 file changed, 1 insertion(+)

diff --git a/serverless/pricing.mdx b/serverless/pricing.mdx
index dec52301..b5ff0105 100644
--- a/serverless/pricing.mdx
+++ b/serverless/pricing.mdx
@@ -20,6 +20,7 @@ Serverless offers pay-per-second pricing with no upfront costs. You're billed fr
 | | Flex workers | Active workers |
 |---|--------------|----------------|
 | **Behavior** | Scale to zero when idle | Always running (24/7) |
+| **Pricing** | Standard per-second rate | Discounts available through sales inquiry |
 | **Best for** | Variable workloads, cost optimization | Consistent traffic, low-latency requirements |
 ## GPU pricing
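
Note on the sizing formula that appears as context in the `serverless/development/optimization.mdx` hunk (`Active workers = (Requests/min × Request duration in seconds) / 60`): the sketch below works it through in Python. The function name `active_workers_needed` is hypothetical, and rounding up to a whole worker is an assumption added here; the docs give only the raw formula.

```python
import math


def active_workers_needed(requests_per_minute: float,
                          request_duration_seconds: float) -> int:
    """Estimate active workers from the formula quoted in optimization.mdx.

    Active workers = (Requests/min x Request duration in seconds) / 60

    Rounding up to a whole worker is an assumption of this sketch,
    not something the docs state.
    """
    if requests_per_minute < 0 or request_duration_seconds < 0:
        raise ValueError("inputs must be non-negative")
    # Busy-seconds of work arriving per minute, divided by the
    # 60 seconds a single worker can serve in that minute.
    raw = (requests_per_minute * request_duration_seconds) / 60
    return math.ceil(raw)


# 30 requests/min at 4 s each is 120 busy-seconds per minute.
print(active_workers_needed(30, 4))
```

The result is a baseline for steady traffic only; bursts above it would still fall to flex workers, as the overview text in this patch describes.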