From d1accc4b93d95357e08b9034e2e502ef163786be Mon Sep 17 00:00:00 2001
From: "promptless[bot]"
Date: Wed, 13 May 2026 18:13:36 +0000
Subject: [PATCH 1/2] Simplify serverless GPU pricing table to single price column

RunPod has unified pricing for flex and active workers. Removes the separate
"Flex cost per second" and "Active cost per second" columns, replacing them
with a single "Cost per second" column.
---
 serverless/pricing.mdx                    |  3 ++-
 snippets/serverless-gpu-pricing-table.mdx | 22 +++++++++++-----------
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/serverless/pricing.mdx b/serverless/pricing.mdx
index b5ff0105..2e55e117 100644
--- a/serverless/pricing.mdx
+++ b/serverless/pricing.mdx
@@ -20,9 +20,10 @@ Serverless offers pay-per-second pricing with no upfront costs. You're billed fr
 | | Flex workers | Active workers |
 |---|--------------|----------------|
 | **Behavior** | Scale to zero when idle | Always running (24/7) |
-| **Pricing** | Standard per-second rate | Discounts available through sales inquiry |
 | **Best for** | Variable workloads, cost optimization | Consistent traffic, low-latency requirements |

+Both worker types are billed at the same per-second rate.
+
 ## GPU pricing

diff --git a/snippets/serverless-gpu-pricing-table.mdx b/snippets/serverless-gpu-pricing-table.mdx
index 61e20cbb..3c2bef9b 100644
--- a/snippets/serverless-gpu-pricing-table.mdx
+++ b/snippets/serverless-gpu-pricing-table.mdx
@@ -1,11 +1,11 @@
-| **GPU type(s)** | **Memory** | **Flex cost per second** | **Active cost per second** | **Description** |
-| --- | --- | --- | --- | --- |
-| A4000, A4500, RTX 4000 | 16 GB | $0.00016 | $0.00011 | The most cost-effective for small models. |
-| 4090 PRO | 24 GB | $0.00031 | $0.00021 | Extreme throughput for small-to-medium models. |
-| L4, A5000, 3090 | 24 GB | $0.00019 | $0.00013 | Great for small-to-medium sized inference workloads. |
-| L40, L40S, 6000 Ada PRO | 48 GB | $0.00053 | $0.00037 | Extreme inference throughput on LLMs like Llama 3 7B. |
-| A6000, A40 | 48 GB | $0.00034 | $0.00024 | A cost-effective option for running big models. |
-| H100 PRO | 80 GB | $0.00116 | $0.00093 | Extreme throughput for big models. |
-| A100 | 80 GB | $0.00076 | $0.00060 | High throughput GPU, yet still very cost-effective. |
-| H200 PRO | 141 GB | $0.00155 | $0.00124 | Extreme throughput for huge models. |
-| B200 | 180 GB | $0.00240 | $0.00190 | Maximum throughput for huge models. |
\ No newline at end of file
+| **GPU type(s)** | **Memory** | **Cost per second** | **Description** |
+| --- | --- | --- | --- |
+| A4000, A4500, RTX 4000 | 16 GB | $0.00016 | The most cost-effective for small models. |
+| 4090 PRO | 24 GB | $0.00031 | Extreme throughput for small-to-medium models. |
+| L4, A5000, 3090 | 24 GB | $0.00019 | Great for small-to-medium sized inference workloads. |
+| L40, L40S, 6000 Ada PRO | 48 GB | $0.00053 | Extreme inference throughput on LLMs like Llama 3 7B. |
+| A6000, A40 | 48 GB | $0.00034 | A cost-effective option for running big models. |
+| H100 PRO | 80 GB | $0.00116 | Extreme throughput for big models. |
+| A100 | 80 GB | $0.00076 | High throughput GPU, yet still very cost-effective. |
+| H200 PRO | 141 GB | $0.00155 | Extreme throughput for huge models. |
+| B200 | 180 GB | $0.00240 | Maximum throughput for huge models. |
\ No newline at end of file

From 8f533e5ef06baafc5a3aed9a909ba9cafb7bfee9 Mon Sep 17 00:00:00 2001
From: "promptless[bot]" <179508745+promptless[bot]@users.noreply.github.com>
Date: Wed, 13 May 2026 20:52:29 +0000
Subject: [PATCH 2/2] Update from greg.wester@runpod.io

---
 serverless/pricing.mdx | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/serverless/pricing.mdx b/serverless/pricing.mdx
index 2e55e117..b5ff0105 100644
--- a/serverless/pricing.mdx
+++ b/serverless/pricing.mdx
@@ -20,10 +20,9 @@ Serverless offers pay-per-second pricing with no upfront costs. You're billed fr
 | | Flex workers | Active workers |
 |---|--------------|----------------|
 | **Behavior** | Scale to zero when idle | Always running (24/7) |
+| **Pricing** | Standard per-second rate | Discounts available through sales inquiry |
 | **Best for** | Variable workloads, cost optimization | Consistent traffic, low-latency requirements |

-Both worker types are billed at the same per-second rate.
-
 ## GPU pricing