22 changes: 11 additions & 11 deletions snippets/serverless-gpu-pricing-table.mdx
@@ -1,11 +1,11 @@
-| **GPU type(s)** | **Memory** | **Flex cost per second** | **Active cost per second** | **Description** |
-| --- | --- | --- | --- | --- |
-| A4000, A4500, RTX 4000 | 16 GB | $0.00016 | $0.00011 | The most cost-effective for small models. |
-| 4090 PRO | 24 GB | $0.00031 | $0.00021 | Extreme throughput for small-to-medium models. |
-| L4, A5000, 3090 | 24 GB | $0.00019 | $0.00013 | Great for small-to-medium sized inference workloads. |
-| L40, L40S, 6000 Ada PRO | 48 GB | $0.00053 | $0.00037 | Extreme inference throughput on LLMs like Llama 3 7B. |
-| A6000, A40 | 48 GB | $0.00034 | $0.00024 | A cost-effective option for running big models. |
-| H100 PRO | 80 GB | $0.00116 | $0.00093 | Extreme throughput for big models. |
-| A100 | 80 GB | $0.00076 | $0.00060 | High throughput GPU, yet still very cost-effective. |
-| H200 PRO | 141 GB | $0.00155 | $0.00124 | Extreme throughput for huge models. |
-| B200 | 180 GB | $0.00240 | $0.00190 | Maximum throughput for huge models. |
+| **GPU type(s)** | **Memory** | **Cost per second** | **Description** |
+| --- | --- | --- | --- |
+| A4000, A4500, RTX 4000 | 16 GB | $0.00016 | The most cost-effective for small models. |
+| 4090 PRO | 24 GB | $0.00031 | Extreme throughput for small-to-medium models. |
+| L4, A5000, 3090 | 24 GB | $0.00019 | Great for small-to-medium sized inference workloads. |
+| L40, L40S, 6000 Ada PRO | 48 GB | $0.00053 | Extreme inference throughput on LLMs like Llama 3 7B. |
+| A6000, A40 | 48 GB | $0.00034 | A cost-effective option for running big models. |
+| H100 PRO | 80 GB | $0.00116 | Extreme throughput for big models. |
+| A100 | 80 GB | $0.00076 | High throughput GPU, yet still very cost-effective. |
+| H200 PRO | 141 GB | $0.00155 | Extreme throughput for huge models. |
+| B200 | 180 GB | $0.00240 | Maximum throughput for huge models. |
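
For reviewers sanity-checking the single "Cost per second" column, the per-second rates can be turned into a runtime estimate with simple multiplication. The sketch below is illustrative only: `COST_PER_SECOND`, `estimate_cost`, and the `workers` parameter are hypothetical names that do not appear in the snippet, and it assumes cost scales linearly with seconds and worker count, with no rounding rules or other charges applied.

```python
from decimal import Decimal

# Per-second rates (USD) copied from the updated table in this diff.
# Decimal avoids binary floating-point drift on very small amounts.
COST_PER_SECOND = {
    "A4000, A4500, RTX 4000": Decimal("0.00016"),
    "4090 PRO": Decimal("0.00031"),
    "L4, A5000, 3090": Decimal("0.00019"),
    "L40, L40S, 6000 Ada PRO": Decimal("0.00053"),
    "A6000, A40": Decimal("0.00034"),
    "H100 PRO": Decimal("0.00116"),
    "A100": Decimal("0.00076"),
    "H200 PRO": Decimal("0.00155"),
    "B200": Decimal("0.00240"),
}


def estimate_cost(gpu: str, seconds: int, workers: int = 1) -> Decimal:
    """Estimate cost as rate * seconds * workers (assumed linear billing)."""
    if seconds < 0 or workers < 1:
        raise ValueError("seconds must be >= 0 and workers must be >= 1")
    return COST_PER_SECOND[gpu] * seconds * workers


# One hour on a single A100 worker, under the assumptions above.
a100_hour = estimate_cost("A100", 3600)
```

Keying the lookup by the exact "GPU type(s)" cell text keeps the sketch aligned with the table rows; a real integration would more likely key on whatever identifier the platform's API exposes.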