diff --git a/snippets/serverless-gpu-pricing-table.mdx b/snippets/serverless-gpu-pricing-table.mdx
index 61e20cbb..3c2bef9b 100644
--- a/snippets/serverless-gpu-pricing-table.mdx
+++ b/snippets/serverless-gpu-pricing-table.mdx
@@ -1,11 +1,11 @@
-| **GPU type(s)** | **Memory** | **Flex cost per second** | **Active cost per second** | **Description** |
-| --- | --- | --- | --- | --- |
-| A4000, A4500, RTX 4000 | 16 GB | $0.00016 | $0.00011 | The most cost-effective for small models. |
-| 4090 PRO | 24 GB | $0.00031 | $0.00021 | Extreme throughput for small-to-medium models. |
-| L4, A5000, 3090 | 24 GB | $0.00019 | $0.00013 | Great for small-to-medium sized inference workloads. |
-| L40, L40S, 6000 Ada PRO | 48 GB | $0.00053 | $0.00037 | Extreme inference throughput on LLMs like Llama 3 7B. |
-| A6000, A40 | 48 GB | $0.00034 | $0.00024 | A cost-effective option for running big models. |
-| H100 PRO | 80 GB | $0.00116 | $0.00093 | Extreme throughput for big models. |
-| A100 | 80 GB | $0.00076 | $0.00060 | High throughput GPU, yet still very cost-effective. |
-| H200 PRO | 141 GB | $0.00155 | $0.00124 | Extreme throughput for huge models. |
-| B200 | 180 GB | $0.00240 | $0.00190 | Maximum throughput for huge models. |
\ No newline at end of file
+| **GPU type(s)** | **Memory** | **Cost per second** | **Description** |
+| --- | --- | --- | --- |
+| A4000, A4500, RTX 4000 | 16 GB | $0.00016 | The most cost-effective for small models. |
+| 4090 PRO | 24 GB | $0.00031 | Extreme throughput for small-to-medium models. |
+| L4, A5000, 3090 | 24 GB | $0.00019 | Great for small-to-medium sized inference workloads. |
+| L40, L40S, 6000 Ada PRO | 48 GB | $0.00053 | Extreme inference throughput on LLMs like Llama 3 7B. |
+| A6000, A40 | 48 GB | $0.00034 | A cost-effective option for running big models. |
+| H100 PRO | 80 GB | $0.00116 | Extreme throughput for big models. |
+| A100 | 80 GB | $0.00076 | High throughput GPU, yet still very cost-effective. |
+| H200 PRO | 141 GB | $0.00155 | Extreme throughput for huge models. |
+| B200 | 180 GB | $0.00240 | Maximum throughput for huge models. |
\ No newline at end of file