chore(pricing): Update vertex-ai pricing #550

Open
siddharthsambharia-portkey wants to merge 13 commits into main from pricing-update/vertex-ai

siddharthsambharia-portkey (Collaborator) commented Mar 17, 2026

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 0 |
| 🔄 Models updated (merged) | 19 |

🔄 Updated Models

  • gemini-2.5-pro
  • gemini-2.5-computer-use-preview-10-2025
  • gemini-2.0-flash-001
  • gemini-2.0-flash-lite-001
  • gemini-3-pro-preview
  • gemini-3-pro-image-preview
  • gemini-3-flash-preview
  • gemini-3.1-pro-preview
  • gemini-3.1-flash-image-preview
  • gemini-3.1-flash-lite-preview
  • veo-3.0-generate-001
  • veo-3.0-fast-generate-001
  • veo-3.0-generate-preview
  • veo-3.0-fast-generate-preview
  • veo-3.1-generate-001
  • veo-3.1-generate-preview
  • veo-3.1-fast-generate-preview
  • gemini-embedding-001
  • gemini-embedding-2-preview

Model-to-Pricing-Page Mapping

Google – Gemini (text/multimodal)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.5-pro | Google – Gemini 2.5 Pro | API | Tiered pricing: ≤200K $1.25/$10, >200K $2.50/$15; using standard tier |
| gemini-2.5-flash | Google – Gemini 2.5 Flash | API | |
| gemini-2.5-flash-lite | Google – Gemini 2.5 Flash Lite | API | |
| gemini-2.5-flash-image | Google – Gemini 2.5 Flash Image | API | image_token $30/1M |
| gemini-2.5-flash-preview-09-2025 | Google – Gemini 2.5 Flash | API | Preview alias → same pricing as gemini-2.5-flash |
| gemini-2.5-flash-lite-preview-09-2025 | Google – Gemini 2.5 Flash Lite | API | Preview alias → same pricing as gemini-2.5-flash-lite |
| gemini-2.5-flash-image-preview | Google – Gemini 2.5 Flash Image | API | Preview alias → same pricing as gemini-2.5-flash-image |
| gemini-2.5-computer-use-preview-10-2025 | Google – Gemini 2.5 Pro Computer Use Preview | API | Matched to Gemini 2.5 Pro Computer Use Preview section |
| gemini-2.0-flash-001 | Google – Gemini 2.0 Flash | API | |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 Flash Lite | API | |
| gemini-3-pro-preview | Google – Gemini 3 Pro Preview | API | image_token $120/1M; search $14/1000 |
| gemini-3-pro-image-preview | Google – Gemini 3 Pro Image | API | image_token $120/1M; search $14/1000 |
| gemini-3-flash-preview | Google – Gemini 3 Flash Preview | API | image_token $30/1M; search $14/1000 |
| gemini-3.1-pro-preview | Google – Gemini 3.1 Pro | API | image_token $120/1M; search $14/1000 |
| gemini-3.1-flash-image-preview | Google – Gemini 3.1 Flash Image Preview | API | image_token $60/1M; search $14/1000 |
| gemini-3.1-flash-lite-preview | Google – Gemini 3.1 Flash Lite Preview | API | search $14/1000 |
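
The gemini-2.5-pro note says the stored price uses the standard tier only. As a rough illustration of what that choice means for long prompts, here is a minimal sketch of tiered cost calculation. It assumes the quoted rates are USD per 1M tokens and that the 200K boundary applies to prompt tokens; neither is stated in the table, and the function name is hypothetical.

```python
from decimal import Decimal

# Rates copied from the gemini-2.5-pro row (input/output).
# ASSUMPTION: units are USD per 1M tokens.
TIERS = {
    "standard": {"input": Decimal("1.25"), "output": Decimal("10")},
    "long_context": {"input": Decimal("2.50"), "output": Decimal("15")},
}
# ASSUMPTION: the 200K boundary is measured on prompt tokens.
THRESHOLD = 200_000


def gemini_25_pro_cost(prompt_tokens: int, output_tokens: int) -> Decimal:
    """Return the request cost in USD under the tier selected by prompt size."""
    tier = "standard" if prompt_tokens <= THRESHOLD else "long_context"
    rates = TIERS[tier]
    total = prompt_tokens * rates["input"] + output_tokens * rates["output"]
    return total / Decimal(1_000_000)
```

Under these assumptions, `gemini_25_pro_cost(300_000, 1_000)` is priced at the higher tier, so a config that stores only the standard tier would under-report such requests.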

Excluded Google Gemini: gemini-live-2.5-flash-native-audio (live/streaming — excluded by policy)

Google – Imagen

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| imagen-3.0-generate-002 | Google – Imagen 3 | API | $0.04/image; matched via lookup_variant imagen-3.0-generate |
| imagen-3.0-capability-001 | Google – Imagen 3 (capability) | API | $0.04/image; shares pricing with imagen-3.0-generate per reference |
| imagen-3.0-capability-002 | Google – Imagen 3 (capability) | API | $0.04/image; shares pricing with imagen-3.0-generate per reference |
| imagen-4.0-generate-001 | Google – Imagen 4 | API | $0.04/image; matched via lookup_variant imagen-4.0-generate |
| imagen-4.0-fast-generate-001 | Google – Imagen 4 Fast | API | $0.02/image; matched via lookup_variant imagen-4.0-fast-generate |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4 Ultra | API | $0.06/image; matched via lookup_variant imagen-4.0-ultra-generate |
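
The lookup_variant values in the notes look like the model ID with its trailing version number removed. The sketch below is a guess at that matching step, not the agent's actual code: `to_lookup_variant`, `price_per_image`, and both dictionaries are hypothetical names, and the prices are copied from the rows above.

```python
import re
from decimal import Decimal

# Per-image prices in USD, keyed by lookup variant (from the table above).
PRICE_PER_IMAGE = {
    "imagen-3.0-generate": Decimal("0.04"),
    "imagen-4.0-generate": Decimal("0.04"),
    "imagen-4.0-fast-generate": Decimal("0.02"),
    "imagen-4.0-ultra-generate": Decimal("0.06"),
}
# Variants that reuse another variant's price, per the table notes.
SHARED_PRICING = {"imagen-3.0-capability": "imagen-3.0-generate"}


def to_lookup_variant(model_id: str) -> str:
    """Strip a trailing three-digit version suffix such as -001 or -002."""
    return re.sub(r"-\d{3}$", "", model_id)


def price_per_image(model_id: str) -> Decimal:
    """Resolve a versioned model ID to its per-image price."""
    variant = to_lookup_variant(model_id)
    variant = SHARED_PRICING.get(variant, variant)
    return PRICE_PER_IMAGE[variant]
```

For example, `to_lookup_variant("imagen-3.0-generate-002")` yields `imagen-3.0-generate`, and the two capability models resolve to the same price through `SHARED_PRICING`.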

Excluded Google Imagen: imagegeneration (legacy, excluded by google.md policy), virtual-try-on-001 (retail model, excluded)

Google – Veo

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| veo-2.0-generate-001 | Google – Veo 2 | API | $0.50/sec; 8s default duration |
| veo-3.0-generate-001 | Google – Veo 3 | API | $0.40/sec; 8s default duration |
| veo-3.0-fast-generate-001 | Google – Veo 3 Fast | API | $0.15/sec; 8s default duration |
| veo-3.0-generate-preview | Google – Veo 3 | API | Preview alias → same pricing as veo-3.0-generate |
| veo-3.0-fast-generate-preview | Google – Veo 3 Fast | API | Preview alias → same pricing as veo-3.0-fast-generate |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | $0.40/sec; 8s default duration |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | $0.15/sec; 8s default duration |
| veo-3.1-generate-preview | Google – Veo 3.1 | API | Preview alias → same pricing as veo-3.1-generate |
| veo-3.1-fast-generate-preview | Google – Veo 3.1 Fast | API | Preview alias → same pricing as veo-3.1-fast-generate |
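
For reference, the per-second rates and the 8s default translate into a per-clip cost as sketched below. This is only an illustration: `base_name` and `clip_cost` are hypothetical helpers, and the alias handling assumes a preview ID differs from its base only by the `-preview` suffix.

```python
from decimal import Decimal

# USD per second of generated video, keyed by base model name (from the table).
PER_SECOND = {
    "veo-2.0-generate": Decimal("0.50"),
    "veo-3.0-generate": Decimal("0.40"),
    "veo-3.0-fast-generate": Decimal("0.15"),
    "veo-3.1-generate": Decimal("0.40"),
    "veo-3.1-fast-generate": Decimal("0.15"),
}
DEFAULT_SECONDS = 8  # the "8s default duration" from the notes


def base_name(model_id: str) -> str:
    """Map a versioned or preview ID to the base name used for pricing."""
    for suffix in ("-preview", "-001"):
        if model_id.endswith(suffix):
            return model_id[: -len(suffix)]
    return model_id


def clip_cost(model_id: str, seconds: int = DEFAULT_SECONDS) -> Decimal:
    """Cost in USD of one clip of the given length."""
    return PER_SECOND[base_name(model_id)] * seconds
```

With the default duration, a veo-3.0-generate-preview clip comes to $3.20 and a veo-3.1-fast-generate-001 clip to $1.20.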

Google – Embedding

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| text-embedding-005 | Google – Text Embedding 005 | API | $0.000025/1K tokens |
| text-multilingual-embedding-002 | Google – Text Multilingual Embedding 002 | API | $0.000025/1K tokens |
| gemini-embedding-001 | Google – Gemini Embedding 001 | API | $0.000015/1K tokens |
| gemini-embedding-2-preview | Google – Gemini Embedding 2 | API | $0.20/1K tokens |
| text-embedding-large-exp-03-07 | Google – Text Embedding Large (experimental) | API | $0.000025/1K tokens; shares pricing with text-embedding-005 family |
| textembedding-gecko | Google – Textembedding Gecko | API | $0.000025/1K tokens; legacy model |
| multimodalembedding | Google – Multimodal Embedding | API | per-image $0.0001, per-video $0.001 |
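
The embedding rows are quoted per 1K tokens while the Gemini image_token rates are quoted per 1M, so converting to a common unit makes the rows easier to compare. A small sketch with hypothetical helper names, using three of the rates exactly as listed above:

```python
from decimal import Decimal

# USD per 1K tokens, copied verbatim from the table above.
PER_1K = {
    "text-embedding-005": Decimal("0.000025"),
    "gemini-embedding-001": Decimal("0.000015"),
    "gemini-embedding-2-preview": Decimal("0.20"),
}


def per_million(rate_per_1k: Decimal) -> Decimal:
    """Convert a per-1K-token rate to a per-1M-token rate."""
    return rate_per_1k * 1000


def embedding_cost(model_id: str, tokens: int) -> Decimal:
    """Cost in USD of embedding the given number of tokens."""
    return PER_1K[model_id] * tokens / 1000
```

As listed, text-embedding-005 converts to $0.025 per 1M tokens and gemini-embedding-2-preview to $200 per 1M tokens; a gap that large may be worth re-checking against the pricing page before merge.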

Anthropic – Claude

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| claude-opus-4-6 | Anthropic – Claude Opus 4.6 | API | @default stripped; $5/$25 in/out |
| claude-sonnet-4-6 | Anthropic – Claude Sonnet 4.6 | API | @default stripped; $3/$15 in/out |
| claude-opus-4@20250514 | Anthropic – Claude Opus 4 | API | $15/$75 in/out |
| claude-sonnet-4@20250514 | Anthropic – Claude Sonnet 4 | API | $3/$15 in/out |
| claude-opus-4-1@20250805 | Anthropic – Claude Opus 4.1 | API | $15/$75 in/out |
| claude-sonnet-4-5@20250929 | Anthropic – Claude Sonnet 4.5 | API | $3/$15 in/out |
| claude-haiku-4-5@20251001 | Anthropic – Claude Haiku 4.5 | API | $1/$5 in/out |
| claude-opus-4-5@20251101 | Anthropic – Claude Opus 4.5 | API | $5/$25 in/out |
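
The "@default stripped" note implies that only the `@default` version tag is dropped, while dated tags such as `@20250514` stay part of the model ID. A minimal sketch of that normalization, with a hypothetical function name:

```python
DEFAULT_TAG = "@default"


def normalize_claude_id(model_id: str) -> str:
    """Drop a trailing @default tag; leave dated version tags untouched."""
    if model_id.endswith(DEFAULT_TAG):
        return model_id[: -len(DEFAULT_TAG)]
    return model_id
```

So `claude-opus-4-6@default` becomes `claude-opus-4-6`, and `claude-opus-4@20250514` is returned unchanged.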

OpenAI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gpt-oss-120b-maas | OpenAI – gpt-oss-120b | API | $0.09/$0.36 in/out |

Excluded OpenAI: clip-vit-base-patch32 (non-generative embedding), openclip (non-generative), whisper-large (audio transcription), gpt-oss (self-deploy only, no -maas)

Meta – Llama

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| llama-3.3-70b-instruct-maas | Meta – Llama 3.3 70B | API | $0.72/$0.72 in/out |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama 4 Maverick | API | $0.35/$1.15 in/out |

Excluded Meta: faster-r-cnn, retinanet, mask-r-cnn, segment-anything, sam3 (non-generative CV); xlm-roberta-large, roberta-large (non-generative NLP); llama-guard, prompt-guard (guard models); codellama-7b-hf, llama2, llama-2-quantized, llama3, llama4, llama3_1, llama3-2, llama3-3, nllb, imagebind (self-deploy only)

Mistral AI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| mistral-small-2503 | Mistral – Mistral Small 3.1 (25.03) | API | $0.10/$0.30 in/out |
| mistral-medium-3 | Mistral – Mistral Medium 3 | API | $0.40/$2.00 in/out |
| codestral-2 | Mistral – Codestral 2 | API | $0.30/$0.90 in/out |

Excluded Mistral: mistral (self-deploy, publisher mistral-ai), mixtral (self-deploy), codestral-2501-self-deploy (self-deploy in name), mistral-ocr-2505 (OCR), ministral-3 (self-deploy), mistral-large-3 (self-deploy)

DeepSeek

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| deepseek-r1-0528-maas | DeepSeek – DeepSeek-R1 (0528) | API | $1.35/$5.40 in/out |
| deepseek-v3.1-maas | DeepSeek – DeepSeek-V3.1 | API | $0.60/$1.70 in/out |
| deepseek-v3.2-maas | DeepSeek – DeepSeek-V3.2 | API | $0.56/$1.68 in/out |

Excluded DeepSeek: deepseek-r1, deepseek-v3, deepseek-v3-1, deepseek-v3-2 (self-deploy); deepseek-ocr, deepseek-ocr-2 (OCR + self-deploy); deepseek-ocr-maas (OCR)

Qwen

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| qwen3-235b-a22b-instruct-2507-maas | Qwen – Qwen3-235B-A22B-Instruct-2507 | API | $0.22/$0.88 in/out |
| qwen3-coder-480b-a35b-instruct-maas | Qwen – Qwen3-Coder-480B-A35B-Instruct | API | $0.22/$1.80 in/out |
| qwen3-next-80b-a3b-instruct-maas | Qwen – Qwen3-Next-80B-Instruct | API | $0.15/$1.20 in/out |
| qwen3-next-80b-a3b-thinking-maas | Qwen – Qwen3-Next-80B-Thinking | API | $0.15/$1.20 in/out |

Excluded Qwen: qwq, qwen3, qwen3-embedding, qwen3-5, qwen2, qwen3-coder-next, qwen3-coder, qwen3-next, qwen3-vl (self-deploy); qwen-image (explicit policy exception)

Kimi / Moonshot

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| kimi-k2-thinking-maas | Kimi – Kimi-K2-Thinking | API | $0.60/$2.50 in/out |

Excluded Kimi: kimi-k2-5, kimi-k2 (self-deploy)

MiniMax

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| minimax-m2-maas | MiniMax – MiniMax-M2 | API | $0.30/$1.20 in/out |

Excluded MiniMax: minimax-m2 (self-deploy)

ZAI.org / GLM

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| glm-4.7-maas | ZAI.org – GLM-4.7 | API | $0.60/$2.20 in/out |
| glm-5-maas | ZAI.org – GLM-5 | API | $1.00/$3.20 in/out |

Excluded ZAI.org: glm-4.7, glm-5, glm-4.5 (self-deploy); glm-ocr (OCR + self-deploy); glm-image (explicit policy exception)

AI21

Excluded AI21: jamba-large-1.6 (self-deploy only, has_deploy: true, no -maas suffix — no MaaS inference endpoint on pricing page)
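
The exclusion notes across the partner sections repeat two reasons: OCR models, and models that are self-deploy only (a `has_deploy` flag with no `-maas` suffix). The sketch below is one way those two checks could be expressed. It is an assumption about the policy, not the agent's implementation; `exclusion_reasons` is a hypothetical name, the `-ocr` substring test is a heuristic, and the `has_deploy` values for models other than jamba-large-1.6 are assumed.

```python
from typing import List


def exclusion_reasons(model_id: str, has_deploy: bool) -> List[str]:
    """Return the reasons a model would be excluded; empty list means keep."""
    reasons = []
    # Heuristic: OCR models carry "-ocr" in their ID (e.g. glm-ocr).
    if "-ocr" in model_id:
        reasons.append("OCR")
    # Self-deploy only: deployable, but no managed -maas endpoint.
    if has_deploy and not model_id.endswith("-maas"):
        reasons.append("self-deploy")
    return reasons
```

Under these assumptions, `exclusion_reasons("jamba-large-1.6", True)` returns `["self-deploy"]`, while `exclusion_reasons("glm-5-maas", False)` returns an empty list. Other exclusions named above (guard models, non-generative models, explicit policy exceptions) would need their own rules.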

Summary Statistics

  • Total models submitted: 63
  • Google: 16 Gemini + 6 Imagen + 9 Veo + 7 Embedding = 38
  • Anthropic: 8 Claude
  • OpenAI: 1
  • Meta: 2
  • Mistral: 3
  • DeepSeek: 3
  • Qwen: 4
  • Kimi: 1
  • MiniMax: 1
  • ZAI.org: 2
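
The per-publisher counts can be cross-checked against the stated total with a few lines. The counts below are copied from the list above; the variable and function names are illustrative.

```python
# Per-publisher model counts, copied from the summary above.
GOOGLE_BREAKDOWN = {"Gemini": 16, "Imagen": 6, "Veo": 9, "Embedding": 7}
COUNTS = {
    "Google": sum(GOOGLE_BREAKDOWN.values()),
    "Anthropic": 8,
    "OpenAI": 1,
    "Meta": 2,
    "Mistral": 3,
    "DeepSeek": 3,
    "Qwen": 4,
    "Kimi": 1,
    "MiniMax": 1,
    "ZAI.org": 2,
}


def total_models(counts: dict) -> int:
    """Sum the per-publisher counts."""
    return sum(counts.values())
```

The Google breakdown sums to 38 and the overall total to 63, matching the figures stated in the summary.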

Generated by Pricing Agent on 2026-03-22
