From b9e9fa9e16e6a0889f6053a011c4859d82e69097 Mon Sep 17 00:00:00 2001
From: Ming Lu
Date: Thu, 14 May 2026 12:12:10 -0700
Subject: [PATCH 1/5] [AI Gateway] Add Workers AI REST API docs and migrate
 examples to new endpoint

- Add new rest-api.mdx page documenting /ai/run, /ai/v1/chat/completions, and /ai/v1/responses endpoints for calling any model via the Cloudflare API
- Deprecate /compat/chat/completions endpoint on chat-completion.mdx
- Update get-started, unified-billing, caching, authentication, request-handling, logging, custom-metadata, and workersai pages to use api.cloudflare.com examples instead of gateway.ai.cloudflare.com
- Update chat-completions-providers partial (affects all provider pages)
- Delete redundant deploy-aig-worker tutorial
- Update supported-models and worker-binding-methods references
---
 public/__redirects                                 |   1 +
 .../configuration/authentication.mdx               |  41 ++-
 .../configuration/manage-gateway.mdx               |   6 +-
 .../configuration/request-handling.mdx             |  15 +-
 .../docs/ai-gateway/features/caching.mdx           |  81 +++---
 .../ai-gateway/features/unified-billing.mdx        |  39 +--
 src/content/docs/ai-gateway/get-started.mdx        |  47 ++--
 .../integrations/worker-binding-methods.mdx        |   2 +-
 .../observability/custom-metadata.mdx              |  26 +-
 .../observability/logging/index.mdx                |  50 ++--
 .../docs/ai-gateway/supported-models.mdx           |   2 +-
 .../tutorials/create-first-aig-workers.mdx         |  16 +-
 .../tutorials/deploy-aig-worker.mdx                | 148 -----------
 .../docs/ai-gateway/usage/chat-completion.mdx      |   5 +
 .../ai-gateway/usage/providers/workersai.mdx       | 105 ++------
 .../docs/ai-gateway/usage/rest-api.mdx             | 239 ++++++++++++++++++
 .../ai-gateway/chat-completions-providers.mdx      |   4 +-
 17 files changed, 425 insertions(+), 402 deletions(-)
 delete mode 100644 src/content/docs/ai-gateway/tutorials/deploy-aig-worker.mdx
 create mode 100644 src/content/docs/ai-gateway/usage/rest-api.mdx

diff --git a/public/__redirects b/public/__redirects
index 5b271e8e19f7c51..fcad8d8e75b8bad 100644
--- a/public/__redirects
+++ b/public/__redirects
@@ -242,6 +242,7 @@
 /ai-gateway/universal/ /ai-gateway/usage/universal/ 301
 /ai-gateway/chat-completion/ /ai-gateway/usage/chat-completion/ 301
 /ai-gateway/legacy-models/ /ai-gateway/supported-models/ 301
+/ai-gateway/tutorials/deploy-aig-worker/ /ai-gateway/integrations/aig-workers-ai-binding/ 301
 
 # analytics
 /analytics/migration-guides/zone-analytics/ /analytics/graphql-api/migration-guides/zone-analytics/ 301
diff --git a/src/content/docs/ai-gateway/configuration/authentication.mdx b/src/content/docs/ai-gateway/configuration/authentication.mdx
index 51d6ad6ec73afd3..c09a871a18cd11f 100644
--- a/src/content/docs/ai-gateway/configuration/authentication.mdx
+++ b/src/content/docs/ai-gateway/configuration/authentication.mdx
@@ -9,29 +9,24 @@ products:
   - ai-gateway
 ---
 
-Using an Authenticated Gateway in AI Gateway adds security by requiring a valid authorization token for each request. This feature is especially useful when storing logs, as it prevents unauthorized access and protects against invalid requests that can inflate log storage usage and make it harder to find the data you need. With Authenticated Gateway enabled, only requests with the correct token are processed.
+AI Gateway requires a valid Cloudflare API token for each request. This prevents unauthorized access and protects against invalid requests that can inflate log storage usage.
 
-:::note
-We recommend enabling Authenticated Gateway when opting to store logs with AI Gateway.
-
-If Authenticated Gateway is enabled but a request does not include the required `cf-aig-authorization` header, the request will fail. This setting ensures that only verified requests pass through the gateway. To bypass the need for the `cf-aig-authorization` header, make sure to disable Authenticated Gateway.
-:::
+When using the [Workers AI REST API](/ai-gateway/usage/rest-api/), pass your Cloudflare API token in the standard `Authorization` header. When using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, use the `cf-aig-authorization` header instead.
 
-## Setting up Authenticated Gateway using the Dashboard
+## Setting up Authenticated Gateway using the dashboard
 
 1. Go to the Settings for the specific gateway you want to enable authentication for.
 2. Select **Create authentication token** to generate a custom token with the required `Run` permissions. Be sure to securely save this token, as it will not be displayed again.
 3. Include the `cf-aig-authorization` header with your API token in each request for this gateway.
 4. Return to the settings page and toggle on Authenticated Gateway.
 
-## Example requests with OpenAI
+## Example requests
 
 ```bash
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-  --header 'cf-aig-authorization: Bearer {CF_AIG_TOKEN}' \
-  --header 'Authorization: Bearer OPENAI_TOKEN' \
-  --header 'Content-Type: application/json' \
-  --data '{"model": "gpt-5-mini", "messages": [{"role": "user", "content": "What is Cloudflare?"}]}'
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --data '{"model": "openai/gpt-4.1-mini", "messages": [{"role": "user", "content": "What is Cloudflare?"}]}'
 ```
 
 Using the OpenAI SDK:
@@ -40,24 +35,24 @@ Using the OpenAI SDK:
 import OpenAI from "openai";
 
 const openai = new OpenAI({
-  apiKey: process.env.OPENAI_API_KEY,
-  baseURL: "https://gateway.ai.cloudflare.com/v1/account-id/gateway/openai",
-  defaultHeaders: {
-    "cf-aig-authorization": `Bearer {token}`,
-  },
+  apiKey: CLOUDFLARE_API_TOKEN,
+  baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`,
+});
+
+const response = await openai.chat.completions.create({
+  model: "openai/gpt-4.1-mini",
+  messages: [{ role: "user", content: "What is Cloudflare?" }],
 });
 ```
 
-## Example requests with the Vercel AI SDK
+Using the Vercel AI SDK:
 
 ```javascript
 import { createOpenAI } from "@ai-sdk/openai";
 
 const openai = createOpenAI({
-  baseURL: "https://gateway.ai.cloudflare.com/v1/account-id/gateway/openai",
-  headers: {
-    "cf-aig-authorization": `Bearer {token}`,
-  },
+  apiKey: CLOUDFLARE_API_TOKEN,
+  baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`,
 });
 ```
diff --git a/src/content/docs/ai-gateway/configuration/manage-gateway.mdx b/src/content/docs/ai-gateway/configuration/manage-gateway.mdx
index a60d951cde860e3..69edd04eb8e13a4 100644
--- a/src/content/docs/ai-gateway/configuration/manage-gateway.mdx
+++ b/src/content/docs/ai-gateway/configuration/manage-gateway.mdx
@@ -16,9 +16,11 @@ You have several different options for managing an AI Gateway.
 
 ### Default gateway
 
-AI Gateway can automatically create a gateway for you. When you use `default` as a gateway ID and no gateway with that ID exists in your account, AI Gateway creates it on the first authenticated request.
+AI Gateway can automatically create a gateway for you. If you omit the gateway ID from your request entirely, AI Gateway defaults to using `default` as the gateway ID. When no gateway named `default` exists in your account, AI Gateway creates it on the first authenticated request.
 
-The request that triggers auto-creation must include a valid `cf-aig-authorization` header. An unauthenticated request to a `default` gateway that does not yet exist does not create the gateway.
+This means you can start sending requests without creating a gateway first — AI Gateway handles gateway creation for you.
+
+The request that triggers auto-creation must include a valid `cf-aig-authorization` header. An unauthenticated request to a `default` gateway that does not yet exist does not create the gateway. For Workers AI bindings, the account identity from the binding is used instead of the header.
 
 The auto-created default gateway uses the following settings:
diff --git a/src/content/docs/ai-gateway/configuration/request-handling.mdx b/src/content/docs/ai-gateway/configuration/request-handling.mdx
index 41a2978a60439fb..cbf15ed3ec3d54d 100644
--- a/src/content/docs/ai-gateway/configuration/request-handling.mdx
+++ b/src/content/docs/ai-gateway/configuration/request-handling.mdx
@@ -33,12 +33,15 @@ A timeout is set in milliseconds. The timeout is based on when the first part of
 
 For a provider-specific endpoint, configure the timeout value by adding a `cf-aig-request-timeout` header.
 
-```bash title="Provider-specific endpoint example" {4}
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
-  --header 'Authorization: Bearer {cf_api_token}' \
-  --header 'Content-Type: application/json' \
-  --header 'cf-aig-request-timeout: 5000'
-  --data '{"prompt": "What is Cloudflare?"}'
+```bash title="Request with timeout" {4}
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --header "cf-aig-request-timeout: 5000" \
+  --data '{
+    "model": "openai/gpt-4.1-mini",
+    "messages": [{"role": "user", "content": "What is Cloudflare?"}]
+  }'
 ```
 
 ---
diff --git a/src/content/docs/ai-gateway/features/caching.mdx b/src/content/docs/ai-gateway/features/caching.mdx
index 4685c84384205b1..b2fd80c490f213f 100644
--- a/src/content/docs/ai-gateway/features/caching.mdx
+++ b/src/content/docs/ai-gateway/features/caching.mdx
@@ -96,20 +96,19 @@ You can use the header **cf-aig-skip-cache** to bypass the cached version of the
 
 As an example, when submitting a request to OpenAI, include the header in the following manner:
 
 ```bash title="Request skipping the cache"
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-  --header "Authorization: Bearer $TOKEN" \
-  --header 'Content-Type: application/json' \
-  --header 'cf-aig-skip-cache: true' \
-  --data ' {
-    "model": "gpt-4o-mini",
-    "messages": [
-      {
-        "role": "user",
-        "content": "how to build a wooden spoon in 3 short steps? give as short as answer as possible"
-      }
-    ]
-  }
-'
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --header "cf-aig-skip-cache: true" \
+  --data '{
+    "model": "openai/gpt-4.1-mini",
+    "messages": [
+      {
+        "role": "user",
+        "content": "how to build a wooden spoon in 3 short steps? give as short an answer as possible"
+      }
+    ]
+  }'
 ```
 
 ### Cache TTL (cf-aig-cache-ttl)
@@ -121,20 +120,19 @@ For example, if you set a TTL of one hour, it means that a request is kept in th
 
 As an example, when submitting a request to OpenAI, include the header in the following manner:
 
 ```bash title="Request to be cached for an hour"
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-  --header "Authorization: Bearer $TOKEN" \
-  --header 'Content-Type: application/json' \
-  --header 'cf-aig-cache-ttl: 3600' \
-  --data ' {
-    "model": "gpt-4o-mini",
-    "messages": [
-      {
-        "role": "user",
-        "content": "how to build a wooden spoon in 3 short steps? give as short as answer as possible"
-      }
-    ]
-  }
-'
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --header "cf-aig-cache-ttl: 3600" \
+  --data '{
+    "model": "openai/gpt-4.1-mini",
+    "messages": [
+      {
+        "role": "user",
+        "content": "how to build a wooden spoon in 3 short steps? give as short an answer as possible"
+      }
+    ]
+  }'
 ```
 
 ### Custom cache key (cf-aig-cache-key)
@@ -146,20 +144,19 @@ When you use the **cf-aig-cache-key** header for the first time, you will receiv
 
 As an example, when submitting a request to OpenAI, include the header in the following manner:
 
 ```bash title="Request with custom cache key"
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-  --header 'Authorization: Bearer {openai_token}' \
-  --header 'Content-Type: application/json' \
-  --header 'cf-aig-cache-key: responseA' \
-  --data ' {
-    "model": "gpt-4o-mini",
-    "messages": [
-      {
-        "role": "user",
-        "content": "how to build a wooden spoon in 3 short steps? give as short as answer as possible"
-      }
-    ]
-  }
-'
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --header "cf-aig-cache-key: responseA" \
+  --data '{
+    "model": "openai/gpt-4.1-mini",
+    "messages": [
+      {
+        "role": "user",
+        "content": "how to build a wooden spoon in 3 short steps? give as short an answer as possible"
+      }
+    ]
+  }'
 ```
 
 :::caution[AI Gateway caching behavior]
diff --git a/src/content/docs/ai-gateway/features/unified-billing.mdx b/src/content/docs/ai-gateway/features/unified-billing.mdx
index 1ce5e727f59285d..05189a24d212f92 100644
--- a/src/content/docs/ai-gateway/features/unified-billing.mdx
+++ b/src/content/docs/ai-gateway/features/unified-billing.mdx
@@ -77,26 +77,29 @@ Refer to the [binding reference](/ai-gateway/integrations/worker-binding-methods
 
 ### HTTP API
 
-Call a supported provider through the AI Gateway REST API without passing a provider API key. Use the `cf-aig-authorization` header to authenticate with your Cloudflare API token.
+Call a supported provider through the AI Gateway REST API without passing a provider API key.
+
+#### Workers AI REST API
+
+Use the Cloudflare API to call third-party models. Pass your Cloudflare API token in the `Authorization` header:
 
 ```bash
-curl -X POST https://gateway.ai.cloudflare.com/v1/$CLOUDFLARE_ACCOUNT_ID/default/compat/chat/completions \
-  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-  --header 'Content-Type: application/json' \
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
   --data '{
-    "model": "google-ai-studio/gemini-2.5-pro",
-    "messages": [
-      {
-        "role": "user",
-        "content": "What is Cloudflare?"
-      }
-    ]
+    "model": "openai/gpt-4.1-mini",
+    "messages": [{"role": "user", "content": "What is Cloudflare?"}]
   }'
 ```
 
-The `default` gateway is created automatically on your first request. Replace `default` with a specific gateway ID if you have already created one.
+Refer to [Workers AI REST API](/ai-gateway/usage/rest-api/) for more details on all available endpoints.
+
+#### AI Gateway provider-native endpoints
+
+You can also call providers directly through [provider-native endpoints](/ai-gateway/usage/providers/) using the `cf-aig-authorization` header to authenticate.
 
-The HTTP API supports the following providers through [provider-native endpoints](/ai-gateway/usage/providers/) and the [Unified API (chat completions)](/ai-gateway/usage/chat-completion/):
+The HTTP API supports the following providers:
 
 - [OpenAI](/ai-gateway/usage/providers/openai/)
 - [Anthropic](/ai-gateway/usage/providers/anthropic/)
@@ -150,12 +153,12 @@ To set ZDR as the default for Unified Billing using the API:
 
 Use the `cf-aig-zdr` header to override the gateway default for a single Unified Billing request. Set it to `true` to force ZDR, or `false` to disable ZDR for the request.
 
 ```bash title="Unified Billing request with ZDR"
-curl -X POST https://gateway.ai.cloudflare.com/v1/$CLOUDFLARE_ACCOUNT_ID/{gateway_id}/openai/chat/completions \
-  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-  --header 'Content-Type: application/json' \
-  --header 'cf-aig-zdr: true' \
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --header "cf-aig-zdr: true" \
   --data '{
-    "model": "gpt-4o-mini",
+    "model": "openai/gpt-4.1-mini",
     "messages": [
       {
         "role": "user",
diff --git a/src/content/docs/ai-gateway/get-started.mdx b/src/content/docs/ai-gateway/get-started.mdx
index 9c627dbd3c37ead..d69e266dfd5348b 100644
--- a/src/content/docs/ai-gateway/get-started.mdx
+++ b/src/content/docs/ai-gateway/get-started.mdx
@@ -16,9 +16,7 @@ import {
 	Render,
 	TabItem,
 	Tabs,
-	Badge,
 } from "~/components";
-import CodeSnippets from "~/components/ai-gateway/code-examples.astro";
 
 In this guide, you will learn how to set up and use your first AI Gateway.
@@ -34,11 +32,12 @@ Before making requests, you need two things:
 
 Run the following command to make your first request through AI Gateway:
 
 ```bash
-curl -X POST https://gateway.ai.cloudflare.com/v1/$CLOUDFLARE_ACCOUNT_ID/default/compat/chat/completions \
-  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
-  --header 'Content-Type: application/json' \
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
   --data '{
-    "model": "workers-ai/@cf/meta/llama-3.3-70b-instruct-fp8-fast",
+    "model": "moonshotai/kimi-k2.6",
+    "provider": "cloudflare",
     "messages": [
       {
         "role": "user",
@@ -49,7 +48,7 @@ curl -X POST https://gateway.ai.cloudflare.com/v1/$CLOUDFLARE_ACCOUNT_ID/default
 ```
 
 :::note
-AI Gateway automatically creates a gateway for you on the first request. The gateway is created with [authentication](/ai-gateway/configuration/authentication/) turned on, so the `cf-aig-authorization` header is required for all requests. For more details on how the default gateway works, refer to [Default gateway](/ai-gateway/configuration/manage-gateway/#default-gateway).
+You do not need to create a gateway before sending requests. When no gateway ID is specified, AI Gateway uses `default` as the gateway ID and automatically creates it on the first authenticated request. To use a specific gateway, add the `cf-aig-gateway-id` header to your request. For more details, refer to [Default gateway](/ai-gateway/configuration/manage-gateway/#default-gateway).
 :::
@@ -70,31 +69,21 @@ Authenticate with your upstream AI provider using one of the following options:
 
 ## Integration options
 
-### Unified API Endpoint
+### Workers AI REST API
 
-
-
-The easiest way to get started with AI Gateway is through our OpenAI-compatible `/chat/completions` endpoint. This allows you to use existing OpenAI SDKs and tools with minimal code changes while gaining access to multiple AI providers.
-
-`
-https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions
-`
-
-**Key benefits:**
-
-- Drop-in replacement for OpenAI API, works with existing OpenAI SDKs and other OpenAI compliant clients
-- Switch between providers by changing the `model` parameter
-- Dynamic Routing - Define complex routing scenarios requiring conditional logic, conduct A/B tests, set rate / budget limits, etc
-
-
-#### Example:
-
-
+Call any model — whether hosted on Cloudflare or by a third-party provider — through the same Cloudflare API. No provider SDKs or API keys needed — authentication and billing are handled through your Cloudflare account. Three endpoints are available: `/ai/run` for all modalities, `/ai/v1/chat/completions` for OpenAI SDK compatibility, and `/ai/v1/responses` for agentic workflows.
+
+```bash
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --data '{
+    "model": "openai/gpt-4.1-mini",
+    "messages": [{"role": "user", "content": "What is Cloudflare?"}]
+  }'
+```
 
-Refer to [Unified API](/ai-gateway/usage/chat-completion/) to learn more about OpenAI compatibility.
+Refer to [Workers AI REST API](/ai-gateway/usage/rest-api/) for details and examples.
 
 ### Provider-specific endpoints
diff --git a/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx b/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx
index 75f6e924bcfb291..4394eba6de8c158 100644
--- a/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx
+++ b/src/content/docs/ai-gateway/integrations/worker-binding-methods.mdx
@@ -87,7 +87,7 @@ const resp = await env.AI.run(
 Third-party models require an AI Gateway and use [Unified Billing](/ai-gateway/features/unified-billing/). Cloudflare manages the provider credentials and deducts credits from your account. You do not need to supply your own API keys.
 
 :::note
-[BYOK (Bring Your Own Keys)](/ai-gateway/configuration/bring-your-own-keys/) is not supported for third-party models called through the AI binding. To use your own provider keys, use the [AI Gateway REST API](/ai-gateway/usage/providers/) or the [chat completions endpoint](/ai-gateway/usage/chat-completion/) instead.
+[BYOK (Bring Your Own Keys)](/ai-gateway/configuration/bring-your-own-keys/) is not supported for third-party models called through the AI binding. To use your own provider keys, use the [provider-native endpoints](/ai-gateway/usage/providers/) instead.
 :::
 
 Browse available models in the [model catalog](/ai/models/).
diff --git a/src/content/docs/ai-gateway/observability/custom-metadata.mdx b/src/content/docs/ai-gateway/observability/custom-metadata.mdx
index 002377b2c2ae53a..08abfd2215e4707 100644
--- a/src/content/docs/ai-gateway/observability/custom-metadata.mdx
+++ b/src/content/docs/ai-gateway/observability/custom-metadata.mdx
@@ -44,11 +44,11 @@ Objects are not supported as metadata values.
 
 To include custom metadata in your request using cURL:
 
 ```bash
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-  --header 'Authorization: Bearer {api_token}' \
-  --header 'Content-Type: application/json' \
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
   --header 'cf-aig-metadata: {"team": "AI", "user": 12345, "test":true}' \
-  --data '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What should I eat for lunch?"}]}'
+  --data '{"model": "openai/gpt-4.1", "messages": [{"role": "user", "content": "What should I eat for lunch?"}]}'
 ```
 
 ### Using SDK
@@ -60,15 +60,15 @@ import OpenAI from "openai";
 
 export default {
 	async fetch(request, env, ctx) {
-    const openai = new OpenAI({
-      apiKey: env.OPENAI_API_KEY,
-      baseURL: "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai",
-    });
-
-    try {
-      const chatCompletion = await openai.chat.completions.create(
-        {
-          model: "gpt-4o",
+		const openai = new OpenAI({
+			apiKey: env.CLOUDFLARE_API_TOKEN,
+			baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.CLOUDFLARE_ACCOUNT_ID}/ai/v1`,
+		});
+
+		try {
+			const chatCompletion = await openai.chat.completions.create(
+				{
+					model: "openai/gpt-4.1",
 					messages: [{ role: "user", content: "What should I eat for lunch?" }],
 					max_tokens: 50,
 				},
diff --git a/src/content/docs/ai-gateway/observability/logging/index.mdx b/src/content/docs/ai-gateway/observability/logging/index.mdx
index 44efd30f69640a9..cfcd1cab3d1200c 100644
--- a/src/content/docs/ai-gateway/observability/logging/index.mdx
+++ b/src/content/docs/ai-gateway/observability/logging/index.mdx
@@ -34,20 +34,19 @@ The `cf-aig-collect-log` header allows you to bypass the default log setting for
 
 In the example below, we use `cf-aig-collect-log` to bypass the default setting to avoid saving the log.
 
 ```bash
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-  --header "Authorization: Bearer $TOKEN" \
-  --header 'Content-Type: application/json' \
-  --header 'cf-aig-collect-log: false' \
-  --data ' {
-    "model": "gpt-4o-mini",
-    "messages": [
-      {
-        "role": "user",
-        "content": "What is the email address and phone number of user123?"
-      }
-    ]
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --header "cf-aig-collect-log: false" \
+  --data '{
+    "model": "openai/gpt-4.1-mini",
+    "messages": [
+      {
+        "role": "user",
+        "content": "What is the email address and phone number of user123?"
       }
-'
+    ]
+  }'
 ```
 
 ### Collect log payload (`cf-aig-collect-log-payload`)
@@ -64,20 +63,19 @@ This is useful when you want to maintain visibility into usage metrics and reque
 
 In the example below, we use `cf-aig-collect-log-payload` to skip storing the request and response bodies while keeping the metadata log.
 
 ```bash
-curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
-  --header "Authorization: Bearer $TOKEN" \
-  --header 'Content-Type: application/json' \
-  --header 'cf-aig-collect-log-payload: false' \
-  --data ' {
-    "model": "gpt-4o-mini",
-    "messages": [
-      {
-        "role": "user",
-        "content": "What is the email address and phone number of user123?"
-      }
-    ]
+curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --header "cf-aig-collect-log-payload: false" \
+  --data '{
+    "model": "openai/gpt-4.1-mini",
+    "messages": [
+      {
+        "role": "user",
+        "content": "What is the email address and phone number of user123?"
      }
-'
+    ]
+  }'
 ```
 
 :::note
diff --git a/src/content/docs/ai-gateway/supported-models.mdx b/src/content/docs/ai-gateway/supported-models.mdx
index 691c8c3f0aa86d8..b4f955ff52d808b 100644
--- a/src/content/docs/ai-gateway/supported-models.mdx
+++ b/src/content/docs/ai-gateway/supported-models.mdx
@@ -8,7 +8,7 @@ products:
   - ai-gateway
 ---
 
-The following models are supported for [unified billing](/ai-gateway/features/unified-billing/) when using the AI Gateway REST API, including the [OpenAI-compatible endpoint](/ai-gateway/usage/chat-completion/) and [provider-native endpoints](/ai-gateway/usage/providers/).
+The following models are supported for [unified billing](/ai-gateway/features/unified-billing/) when using the [OpenAI-compatible endpoint](/ai-gateway/usage/chat-completion/) or [provider-native endpoints](/ai-gateway/usage/providers/). For models available through the AI binding (`env.AI.run()`), refer to the [model catalog](/ai/models/).
diff --git a/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx b/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx
index a20e3fa908d61ed..80be5c879625b72 100644
--- a/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx
+++ b/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx
@@ -30,17 +30,21 @@ Then, create a new AI Gateway.
 2. Select **Workers AI** as your provider to set up an endpoint specific to Workers AI. You will receive an endpoint URL for sending requests.
 
-## Configure Your Workers AI
+## Send your first request
 
 1. Go to **AI** > **Workers AI** in the Cloudflare dashboard.
 2. Select **Use REST API** and follow the steps to create and copy the API token and Account ID.
-3. **Send Requests to Workers AI**: Use the provided API endpoint. For example, you can run a model via the API using a curl command. Replace `{account_id}`, `{gateway_id}` and `{cf_api_token}` with your actual account ID and API token:
+3. Send a request using the [Workers AI REST API](/ai-gateway/usage/rest-api/). Replace `$CLOUDFLARE_ACCOUNT_ID` and `$CLOUDFLARE_API_TOKEN` with your actual account ID and API token:
 
 ```bash
-   curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
-     --header 'Authorization: Bearer {cf_api_token}' \
-     --header 'Content-Type: application/json' \
-     --data '{"prompt": "What is Cloudflare?"}'
+   curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \
+     --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+     --header "Content-Type: application/json" \
+     --data '{
+       "model": "moonshotai/kimi-k2.6",
+       "provider": "cloudflare",
+       "messages": [{"role": "user", "content": "What is Cloudflare?"}]
+     }'
 ```
 
 The expected output would be similar to :
diff --git a/src/content/docs/ai-gateway/tutorials/deploy-aig-worker.mdx b/src/content/docs/ai-gateway/tutorials/deploy-aig-worker.mdx
deleted file mode 100644
index f18c9241dceff7a..000000000000000
--- a/src/content/docs/ai-gateway/tutorials/deploy-aig-worker.mdx
+++ /dev/null
@@ -1,148 +0,0 @@
----
-reviewed: 2023-09-27
-difficulty: Beginner
-pcx_content_type: tutorial
-tags:
-  - AI
-  - JavaScript
-title: Deploy a Worker that connects to OpenAI via AI Gateway
-description: Learn how to deploy a Worker that makes calls to OpenAI through AI Gateway
-products:
-  - ai-gateway
----
-
-import { Render, PackageManagers } from "~/components";
-
-In this tutorial, you will learn how to deploy a Worker that makes calls to OpenAI through AI Gateway. AI Gateway helps you better observe and control your AI applications with more analytics, caching, rate limiting, and logging.
-
-This tutorial uses the most recent v4 OpenAI node library, an update released in August 2023.
-
-## Before you start
-
-All of the tutorials assume you have already completed the [Get started guide](/workers/get-started/guide/), which gets you set up with a Cloudflare Workers account, [C3](https://github.com/cloudflare/workers-sdk/tree/main/packages/create-cloudflare), and [Wrangler](/workers/wrangler/install-and-update/).
-
-## 1. Create an AI Gateway and OpenAI API key
-
-On the AI Gateway page in the Cloudflare dashboard, create a new AI Gateway by clicking the plus button on the top right. You should be able to name the gateway as well as the endpoint. Click on the API Endpoints button to copy the endpoint. You can choose from provider-specific endpoints such as OpenAI, HuggingFace, and Replicate.
-
-For this tutorial, we will be using the OpenAI provider-specific endpoint, so select OpenAI in the dropdown and copy the new endpoint.
-
-You will also need an OpenAI account and API key for this tutorial. If you do not have one, create a new OpenAI account and create an API key to continue with this tutorial. Make sure to store your API key somewhere safe so you can use it later.
-
-## 2. Create a new Worker
-
-Create a Worker project in the command line:
-
-
-
-
-Go to your new open Worker project:
-
-```sh title="Open your new project directory"
-cd openai-aig
-```
-
-Inside of your new openai-aig directory, find and open the `src/index.js` file. You will configure this file for most of the tutorial.
-
-Initially, your generated `index.js` file should look like this:
-
-```js
-export default {
-	async fetch(request, env, ctx) {
-		return new Response("Hello World!");
-	},
-};
-```
-
-## 3. Configure OpenAI in your Worker
-
-With your Worker project created, we can learn how to make your first request to OpenAI. You will use the OpenAI node library to interact with the OpenAI API. Install the OpenAI node library with `npm`:
-
-
-In your `src/index.js` file, add the import for `openai` above `export default`:
-
-```js
-import OpenAI from "openai";
-```
-
-Within your `fetch` function, set up the configuration and instantiate your `OpenAIApi` client with the AI Gateway endpoint you created:
-
-```js null {5-8}
-import OpenAI from "openai";
-
-export default {
-	async fetch(request, env, ctx) {
-		const openai = new OpenAI({
-			apiKey: env.OPENAI_API_KEY,
-			baseURL:
-				"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai", // paste your AI Gateway endpoint here
-		});
-	},
-};
-```
-
-To make this work, you need to use [`wrangler secret put`](/workers/wrangler/commands/general/#secret-put) to set your `OPENAI_API_KEY`. This will save the API key to your environment so your Worker can access it when deployed. This key is the API key you created earlier in the OpenAI dashboard:
-
-
-To make this work in local development, create a new file `.dev.vars` in your Worker project and add this line. Make sure to replace `OPENAI_API_KEY` with your own OpenAI API key:
-
-```txt title="Save your API key locally"
-OPENAI_API_KEY = ""
-```
-
-## 4. Make an OpenAI request
-
-Now we can make a request to the OpenAI [Chat Completions API](https://platform.openai.com/docs/guides/gpt/chat-completions-api).
-
-You can specify what model you'd like, the role and prompt, as well as the max number of tokens you want in your total request.
-
-```js null {10-22}
-import OpenAI from "openai";
-
-export default {
-	async fetch(request, env, ctx) {
-		const openai = new OpenAI({
-			apiKey: env.OPENAI_API_KEY,
-			baseURL:
-				"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai",
-		});
-
-		try {
-			const chatCompletion = await openai.chat.completions.create({
-				model: "gpt-4o-mini",
-				messages: [{ role: "user", content: "What is a neuron?" }],
-				max_tokens: 100,
-			});
-
-			const response = chatCompletion.choices[0].message;
-
-			return new Response(JSON.stringify(response));
-		} catch (e) {
-			return new Response(e);
-		}
-	},
-};
-```
-
-## 5. Deploy your Worker application
-
-To deploy your application, run the `npx wrangler deploy` command to deploy your Worker application:
-
-
-You can now preview your Worker at \.\.workers.dev.
-
-## 6. Review your AI Gateway
-
-When you go to AI Gateway in your Cloudflare dashboard, you should see your recent request being logged. You can also [tweak your settings](/ai-gateway/configuration/) to manage your logs, caching, and rate limiting settings.
diff --git a/src/content/docs/ai-gateway/usage/chat-completion.mdx b/src/content/docs/ai-gateway/usage/chat-completion.mdx
index 1b5255427726c73..b2ec822a02f0673 100644
--- a/src/content/docs/ai-gateway/usage/chat-completion.mdx
+++ b/src/content/docs/ai-gateway/usage/chat-completion.mdx
@@ -6,6 +6,7 @@ tags:
   - AI
 sidebar:
   order: 2
+  badge: Deprecated
 products:
   - ai-gateway
 ---
@@ -17,6 +18,10 @@ import {
 } from "~/components";
 import CodeSnippets from "~/components/ai-gateway/code-examples.astro";
 
+:::caution[Deprecated]
+This endpoint is deprecated. Use the [Workers AI REST API](/ai-gateway/usage/rest-api/) instead, which provides OpenAI-compatible endpoints at `api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/v1/chat/completions`. The `/compat/chat/completions` endpoint will continue to work for existing integrations.
+:::
+
 Cloudflare's AI Gateway offers an OpenAI-compatible `/chat/completions` endpoint, enabling integration with multiple AI providers using a single URL. This feature simplifies the integration process, allowing for seamless switching between different models without significant code modifications.
## Endpoint URL diff --git a/src/content/docs/ai-gateway/usage/providers/workersai.mdx b/src/content/docs/ai-gateway/usage/providers/workersai.mdx index 8853e20c56547ed..f6f19f34982afef 100644 --- a/src/content/docs/ai-gateway/usage/providers/workersai.mdx +++ b/src/content/docs/ai-gateway/usage/providers/workersai.mdx @@ -12,82 +12,31 @@ products: import { Render, TypeScriptExample } from "~/components"; -Use AI Gateway for analytics, caching, and security on requests to [Workers AI](/workers-ai/). Workers AI integrates seamlessly with AI Gateway, allowing you to execute AI inference via API requests or through an environment binding for Workers scripts. The binding simplifies the process by routing requests through your AI Gateway with minimal setup. - -:::note -You can also access third-party models through AI Gateway using the AI binding. Refer to the [binding reference](/ai-gateway/integrations/worker-binding-methods/#envairun) for details. -::: - -## Prerequisites - -When making requests to Workers AI, ensure you have the following: - -- Your AI Gateway Account ID. -- Your AI Gateway gateway name. -- An active Workers AI API token. -- The name of the Workers AI model you want to use. +Use AI Gateway for analytics, caching, and security on requests to [Workers AI](/workers-ai/). ## REST API -To interact with a REST API, update the URL used for your request: - -- **Previous**: - -```txt -https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model_id} +Use the [Workers AI REST API](/ai-gateway/usage/rest-api/) to call Workers AI models. Requests are automatically routed through your account's default AI Gateway. 
+ +```bash title="Request to Workers AI Kimi model" +curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --header "Content-Type: application/json" \ + --data '{ + "model": "moonshotai/kimi-k2.6", + "provider": "cloudflare", + "messages": [ + { + "role": "user", + "content": "What is Cloudflare?" + } + ] + }' ``` -- **New**: +To use a specific gateway, add the `cf-aig-gateway-id` header to your request. -```txt -https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/{model_id} -``` - -For these parameters: - -- `{account_id}` is your Cloudflare [account ID](/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id). -- `{gateway_id}` refers to the name of your existing [AI Gateway](/ai-gateway/get-started/). -- `{model_id}` refers to the model ID of the [Workers AI model](/workers-ai/models/). - -## Examples - -First, generate an [API token](/fundamentals/api/get-started/create-token/) with `Workers AI Read` access and use it in your request. - -```bash title="Request to Workers AI llama model" -curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \ - --header 'Authorization: Bearer {cf_api_token}' \ - --header 'Content-Type: application/json' \ - --data '{"prompt": "What is Cloudflare?"}' -``` - -```bash title="Request to Workers AI text classification model" -curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/huggingface/distilbert-sst-2-int8 \ - --header 'Authorization: Bearer {cf_api_token}' \ - --header 'Content-Type: application/json' \ - --data '{ "text": "Cloudflare docs are amazing!" }' -``` - -### OpenAI compatible endpoints - -
- -```bash title="Request to OpenAI compatible endpoint" -curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/v1/chat/completions \ - --header 'Authorization: Bearer {cf_api_token}' \ - --header 'Content-Type: application/json' \ - --data '{ - "model": "@cf/meta/llama-3.1-8b-instruct", - "messages": [ - { - "role": "user", - "content": "What is Cloudflare?" - } - ] - } -' -``` - -## Workers Binding +## Workers binding You can integrate Workers AI with AI Gateway using an environment binding. To include an AI Gateway within your Worker, add the gateway as an object in your Workers AI request. @@ -120,7 +69,7 @@ export default { -For a detailed step-by-step guide on integrating Workers AI with AI Gateway using a binding, see [Integrations in AI Gateway](/ai-gateway/integrations/aig-workers-ai-binding/). +For a detailed step-by-step guide on integrating Workers AI with AI Gateway using a binding, refer to [Integrations in AI Gateway](/ai-gateway/integrations/aig-workers-ai-binding/). Workers AI supports the following parameters for AI gateways: @@ -130,17 +79,3 @@ Workers AI supports the following parameters for AI gateways: - Controls whether the request should [skip the cache](/ai-gateway/features/caching/#skip-cache-cf-aig-skip-cache). - `cacheTtl` number - Controls the [Cache TTL](/ai-gateway/features/caching/#cache-ttl-cf-aig-cache-ttl). - - diff --git a/src/content/docs/ai-gateway/usage/rest-api.mdx b/src/content/docs/ai-gateway/usage/rest-api.mdx new file mode 100644 index 000000000000000..550a7c726b0ac44 --- /dev/null +++ b/src/content/docs/ai-gateway/usage/rest-api.mdx @@ -0,0 +1,239 @@ +--- +title: Workers AI REST API +pcx_content_type: how-to +description: Call third-party and Workers AI models through the Cloudflare API with AI Gateway features like logging, caching, and rate limiting. 
+sidebar: + order: 1 +tags: + - AI +products: + - ai-gateway +--- + +The Workers AI REST API lets you call any model — whether hosted on Cloudflare or by a third-party provider like OpenAI, Anthropic, or Google — through the same Cloudflare API, with all AI Gateway features — logging, caching, rate limiting, and more — applied automatically. + +No provider SDKs or API keys are needed. Authentication and billing are handled through your Cloudflare account. Third-party models are billed via [Unified Billing](/ai-gateway/features/unified-billing/), while Workers AI models follow [Workers AI pricing](/workers-ai/platform/pricing/). + +## Endpoints + +Three endpoints are available, each suited to different use cases: + +| Endpoint | Format | Use case | +| -------- | ------ | -------- | +| `POST /ai/run` | Envelope with `model`, `provider`, `input` | All models and modalities (LLM, image, TTS, ASR) | +| `POST /ai/v1/chat/completions` | OpenAI chat completions | LLMs — OpenAI SDK compatible | +| `POST /ai/v1/responses` | OpenAI Responses API | Agentic workflows — OpenAI SDK compatible | + +## Authentication + +Authenticate with a [Cloudflare API token](/fundamentals/api/get-started/create-token/) that has `AI Gateway` permission. Pass it in the `Authorization` header. + +:::note +Ensure your Cloudflare account has [sufficient credits loaded](/ai-gateway/features/unified-billing/#load-credits) before calling third-party models. +::: + +## Model naming + +Models use the `author/model` format: + +- `openai/gpt-4.1` — OpenAI +- `anthropic/claude-sonnet-4` — Anthropic +- `google-ai-studio/gemini-2.5-flash` — Google AI Studio +- `meta/llama-3.3-70b-instruct-fp8-fast` — Workers AI (Llama) +- `xai/grok-3` — xAI + +Browse available models in the [model catalog](/ai/models/). + +## Provider + +The optional `provider` field lets you specify which provider should serve the request. 
For third-party models in the catalog today, each model has a single provider, so you can omit this field. + +Set `provider` to `"cloudflare"` to run a model on Workers AI: + +```json +{ + "model": "moonshotai/kimi-k2.6", + "provider": "cloudflare", + "messages": [...] +} +``` + +When omitted, Cloudflare routes the request to the preferred provider for that model. + +## `/ai/run` — universal endpoint + +Accepts any model with its per-model schema. Model-specific parameters go inside `input`. + +```bash +curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --header "Content-Type: application/json" \ + --data '{ + "model": "openai/gpt-4.1", + "input": { + "messages": [ + { + "role": "user", + "content": "What is Cloudflare?" + } + ], + "max_tokens": 512 + } + }' +``` + +### Call a Workers AI model + +```bash +curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --header "Content-Type: application/json" \ + --data '{ + "model": "moonshotai/kimi-k2.6", + "provider": "cloudflare", + "input": { + "messages": [ + { + "role": "user", + "content": "What is Cloudflare?" + } + ] + } + }' +``` + +The existing Workers AI endpoint with the model ID in the URL path also continues to work: + +```bash +curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/moonshotai/kimi-k2.6" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --header "Content-Type: application/json" \ + --data '{ + "messages": [ + { + "role": "user", + "content": "What is Cloudflare?" + } + ] + }' +``` + +## `/ai/v1/chat/completions` — OpenAI compatible + +Uses the standard OpenAI chat completions format. The `model` field uses the same `author/model` naming. This endpoint is compatible with the OpenAI SDK and other OpenAI-compatible clients. 
+ +```bash +curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --header "Content-Type: application/json" \ + --data '{ + "model": "openai/gpt-4.1", + "messages": [ + { + "role": "system", + "content": "You are a helpful assistant." + }, + { + "role": "user", + "content": "What is Cloudflare?" + } + ], + "max_tokens": 512, + "temperature": 0.7, + "stream": true + }' +``` + +### OpenAI SDK + +Point the OpenAI SDK `baseURL` at the Cloudflare API: + +```javascript +import OpenAI from "openai"; + +const openai = new OpenAI({ + apiKey: CLOUDFLARE_API_TOKEN, + baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`, +}); + +const response = await openai.chat.completions.create({ + model: "openai/gpt-4.1", + messages: [{ role: "user", content: "What is Cloudflare?" }], +}); +``` + +## `/ai/v1/responses` — OpenAI Responses API + +Uses the OpenAI Responses API format for agentic workflows. Compatible with the OpenAI SDK. + +```javascript +import OpenAI from "openai"; + +const openai = new OpenAI({ + apiKey: CLOUDFLARE_API_TOKEN, + baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`, +}); + +const response = await openai.responses.create({ + model: "openai/gpt-4.1", + input: "What is Cloudflare?", +}); +``` + +## Specify a gateway + +By default, third-party model requests route through your account's default AI Gateway. 
To use a specific gateway, include the `cf-aig-gateway-id` header: + +```bash +curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \ + --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \ + --header "cf-aig-gateway-id: my-gateway" \ + --header "Content-Type: application/json" \ + --data '{ + "model": "anthropic/claude-sonnet-4", + "messages": [ + { + "role": "user", + "content": "Hello" + } + ] + }' +``` + +With the OpenAI SDK, set the header via `defaultHeaders`: + +```javascript +const openai = new OpenAI({ + apiKey: CLOUDFLARE_API_TOKEN, + baseURL: `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/v1`, + defaultHeaders: { + "cf-aig-gateway-id": "my-gateway", + }, +}); +``` + +All AI Gateway features configured on that gateway — caching, rate limiting, guardrails, and logging — apply to the request. + +## Per-request configuration + +Use `cf-aig-*` headers to control AI Gateway behavior on a per-request basis: + +| Header | Type | Description | +| ------ | ---- | ----------- | +| `cf-aig-skip-cache` | boolean | Skip the cache for this request. | +| `cf-aig-cache-ttl` | number | Cache TTL in seconds. | +| `cf-aig-cache-key` | string | Custom cache key. | +| `cf-aig-collect-log` | boolean | Turn logging on or off for this request. | +| `cf-aig-request-timeout` | number | Request timeout in milliseconds. | +| `cf-aig-max-attempts` | number | Retry attempts (max 5). | +| `cf-aig-retry-delay` | number | Retry delay in milliseconds (max 5000). | +| `cf-aig-backoff` | string | Backoff method: `constant`, `linear`, or `exponential`. | +| `cf-aig-metadata` | JSON string | Custom metadata to attach to the log entry. | + +For more details on these options, refer to [Request handling](/ai-gateway/configuration/request-handling/) and [Caching](/ai-gateway/features/caching/). + +## Related resources + +- [Unified Billing](/ai-gateway/features/unified-billing/) — load credits and manage spend limits. 
+- [Workers AI binding](/ai-gateway/integrations/worker-binding-methods/) — call models from within a Cloudflare Worker using `env.AI.run()`. +- [Model catalog](/ai/models/) — browse models supported by the Workers AI REST API. diff --git a/src/content/partials/ai-gateway/chat-completions-providers.mdx b/src/content/partials/ai-gateway/chat-completions-providers.mdx index e7da9357f4b8ab9..8078d22adbe0b8a 100644 --- a/src/content/partials/ai-gateway/chat-completions-providers.mdx +++ b/src/content/partials/ai-gateway/chat-completions-providers.mdx @@ -8,10 +8,10 @@ import { Code } from "~/components"; ## OpenAI-Compatible Endpoint -You can also use the [OpenAI-compatible endpoint](/ai-gateway/usage/chat-completion/) (`/ai-gateway/usage/chat-completion/`) to access {props.name} models using the OpenAI API schema. To do so, send your requests to: +You can also access {props.name} models using the OpenAI API schema through the [Workers AI REST API](/ai-gateway/usage/rest-api/). Send your requests to: ```txt -https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions +https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/chat/completions ``` Specify: From d7d0208c81ffdb50a4ac293ed04eb2a4dfbd92b6 Mon Sep 17 00:00:00 2001 From: Ming Lu Date: Thu, 14 May 2026 12:29:49 -0700 Subject: [PATCH 2/5] [AI Gateway] Fix auth docs conflict and wrap Worker example in TypeScriptExample --- .../configuration/manage-gateway.mdx | 2 +- .../observability/custom-metadata.mdx | 70 ++++++++++--------- 2 files changed, 39 insertions(+), 33 deletions(-) diff --git a/src/content/docs/ai-gateway/configuration/manage-gateway.mdx b/src/content/docs/ai-gateway/configuration/manage-gateway.mdx index 69edd04eb8e13a4..ad27416a07bf268 100644 --- a/src/content/docs/ai-gateway/configuration/manage-gateway.mdx +++ b/src/content/docs/ai-gateway/configuration/manage-gateway.mdx @@ -20,7 +20,7 @@ AI Gateway can automatically create a gateway for you. 
If you omit the gateway I This means you can start sending requests without creating a gateway first — AI Gateway handles gateway creation for you. -The request that triggers auto-creation must include a valid `cf-aig-authorization` header. An unauthenticated request to a `default` gateway that does not yet exist does not create the gateway. For Workers AI bindings, the account identity from the binding is used instead of the header. +The request that triggers auto-creation must be authenticated. When using the [Workers AI REST API](/ai-gateway/usage/rest-api/), the standard `Authorization` header is sufficient. When using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, include a valid `cf-aig-authorization` header. For Workers AI bindings, the account identity from the binding is used instead of a header. The auto-created default gateway uses the following settings: diff --git a/src/content/docs/ai-gateway/observability/custom-metadata.mdx b/src/content/docs/ai-gateway/observability/custom-metadata.mdx index 08abfd2215e4707..9d8708b8ac0c440 100644 --- a/src/content/docs/ai-gateway/observability/custom-metadata.mdx +++ b/src/content/docs/ai-gateway/observability/custom-metadata.mdx @@ -8,6 +8,8 @@ products: - ai-gateway --- +import { TypeScriptExample } from "~/components"; + Custom metadata in AI Gateway allows you to tag requests with user IDs or other identifiers, enabling better tracking and analysis of your requests. Metadata values can be strings, numbers, or booleans, and will appear in your logs, making it easy to search and filter through your data. 
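A minimal sketch of the value types described above. The `cf-aig-metadata` header name comes from this page; the keys (`user`, `team`, `test`) are illustrative examples, not a fixed schema:

```javascript
// Illustrative only: the metadata keys below are arbitrary examples.
// Values may be strings, numbers, or booleans, as described above.
const metadata = {
  user: "JaneDoe", // string
  team: 12345, // number
  test: true, // boolean
};

// The header carries the JSON-serialized object.
const headers = {
  "cf-aig-metadata": JSON.stringify(metadata),
};
```

Whichever client you use, the serialized object travels in the `cf-aig-metadata` request header and is attached to the log entry for that request.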
## Key Features @@ -55,44 +57,48 @@ curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ To include custom metadata in your request using the OpenAI SDK: -```javascript + + +```ts import OpenAI from "openai"; export default { - async fetch(request, env, ctx) { - const openai = new OpenAI({ - apiKey: env.CLOUDFLARE_API_TOKEN, - baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.CLOUDFLARE_ACCOUNT_ID}/ai/v1`, - }); - - try { - const chatCompletion = await openai.chat.completions.create( - { - model: "openai/gpt-4.1", - messages: [{ role: "user", content: "What should I eat for lunch?" }], - max_tokens: 50, - }, - { - headers: { - "cf-aig-metadata": JSON.stringify({ - user: "JaneDoe", - team: 12345, - test: true - }), - }, - } - ); - - const response = chatCompletion.choices[0].message; - return new Response(JSON.stringify(response)); - } catch (e) { - console.log(e); - return new Response(e); - } - }, + async fetch(request, env, ctx) { + const openai = new OpenAI({ + apiKey: env.CLOUDFLARE_API_TOKEN, + baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.CLOUDFLARE_ACCOUNT_ID}/ai/v1`, + }); + + try { + const chatCompletion = await openai.chat.completions.create( + { + model: "openai/gpt-4.1", + messages: [{ role: "user", content: "What should I eat for lunch?" 
}], + max_tokens: 50, + }, + { + headers: { + "cf-aig-metadata": JSON.stringify({ + user: "JaneDoe", + team: 12345, + test: true, + }), + }, + }, + ); + + const response = chatCompletion.choices[0].message; + return new Response(JSON.stringify(response)); + } catch (e) { + console.log(e); + return new Response(e); + } + }, }; ``` + + ### Using Binding To include custom metadata in your request using [Bindings](/workers/runtime-apis/bindings/): From 08760ccc39e591634d511bb38335fc70096655ef Mon Sep 17 00:00:00 2001 From: Ming Lu Date: Thu, 14 May 2026 12:56:40 -0700 Subject: [PATCH 3/5] [AI Gateway] Rename Workers AI REST API to REST API across all pages --- .../docs/ai-gateway/configuration/authentication.mdx | 2 +- .../docs/ai-gateway/configuration/manage-gateway.mdx | 2 +- src/content/docs/ai-gateway/features/unified-billing.mdx | 4 ++-- src/content/docs/ai-gateway/get-started.mdx | 4 ++-- .../docs/ai-gateway/tutorials/create-first-aig-workers.mdx | 2 +- src/content/docs/ai-gateway/usage/chat-completion.mdx | 2 +- src/content/docs/ai-gateway/usage/providers/workersai.mdx | 2 +- src/content/docs/ai-gateway/usage/rest-api.mdx | 6 +++--- .../partials/ai-gateway/chat-completions-providers.mdx | 2 +- 9 files changed, 13 insertions(+), 13 deletions(-) diff --git a/src/content/docs/ai-gateway/configuration/authentication.mdx b/src/content/docs/ai-gateway/configuration/authentication.mdx index c09a871a18cd11f..cf2e8bd5044c056 100644 --- a/src/content/docs/ai-gateway/configuration/authentication.mdx +++ b/src/content/docs/ai-gateway/configuration/authentication.mdx @@ -11,7 +11,7 @@ products: AI Gateway requires a valid Cloudflare API token for each request. This prevents unauthorized access and protects against invalid requests that can inflate log storage usage. -When using the [Workers AI REST API](/ai-gateway/usage/rest-api/), pass your Cloudflare API token in the standard `Authorization` header. 
When using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, use the `cf-aig-authorization` header instead. +When using the [REST API](/ai-gateway/usage/rest-api/), pass your Cloudflare API token in the standard `Authorization` header. When using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, use the `cf-aig-authorization` header instead. ## Setting up Authenticated Gateway using the dashboard diff --git a/src/content/docs/ai-gateway/configuration/manage-gateway.mdx b/src/content/docs/ai-gateway/configuration/manage-gateway.mdx index ad27416a07bf268..2f122e4e19f57ea 100644 --- a/src/content/docs/ai-gateway/configuration/manage-gateway.mdx +++ b/src/content/docs/ai-gateway/configuration/manage-gateway.mdx @@ -20,7 +20,7 @@ AI Gateway can automatically create a gateway for you. If you omit the gateway I This means you can start sending requests without creating a gateway first — AI Gateway handles gateway creation for you. -The request that triggers auto-creation must be authenticated. When using the [Workers AI REST API](/ai-gateway/usage/rest-api/), the standard `Authorization` header is sufficient. When using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, include a valid `cf-aig-authorization` header. For Workers AI bindings, the account identity from the binding is used instead of a header. +The request that triggers auto-creation must be authenticated. When using the [REST API](/ai-gateway/usage/rest-api/), the standard `Authorization` header is sufficient. When using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, include a valid `cf-aig-authorization` header. For Workers AI bindings, the account identity from the binding is used instead of a header. 
The auto-created default gateway uses the following settings: diff --git a/src/content/docs/ai-gateway/features/unified-billing.mdx b/src/content/docs/ai-gateway/features/unified-billing.mdx index 05189a24d212f92..138f9fed9973fb3 100644 --- a/src/content/docs/ai-gateway/features/unified-billing.mdx +++ b/src/content/docs/ai-gateway/features/unified-billing.mdx @@ -79,7 +79,7 @@ Refer to the [binding reference](/ai-gateway/integrations/worker-binding-methods Call a supported provider through the AI Gateway REST API without passing a provider API key. -#### Workers AI REST API +#### REST API Use the Cloudflare API to call third-party models. Pass your Cloudflare API token in the `Authorization` header: @@ -93,7 +93,7 @@ curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ }' ``` -Refer to [Workers AI REST API](/ai-gateway/usage/rest-api/) for more details on all available endpoints. +Refer to [REST API](/ai-gateway/usage/rest-api/) for more details on all available endpoints. #### AI Gateway provider-native endpoints diff --git a/src/content/docs/ai-gateway/get-started.mdx b/src/content/docs/ai-gateway/get-started.mdx index d69e266dfd5348b..5fabb25be70a1da 100644 --- a/src/content/docs/ai-gateway/get-started.mdx +++ b/src/content/docs/ai-gateway/get-started.mdx @@ -69,7 +69,7 @@ Authenticate with your upstream AI provider using one of the following options: ## Integration options -### Workers AI REST API +### REST API Call any model — whether hosted on Cloudflare or by a third-party provider — through the same Cloudflare API. No provider SDKs or API keys needed — authentication and billing are handled through your Cloudflare account. Three endpoints are available: `/ai/run` for all modalities, `/ai/v1/chat/completions` for OpenAI SDK compatibility, and `/ai/v1/responses` for agentic workflows. 
@@ -83,7 +83,7 @@ curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ }' ``` -Refer to [Workers AI REST API](/ai-gateway/usage/rest-api/) for details and examples. +Refer to [REST API](/ai-gateway/usage/rest-api/) for details and examples. ### Provider-specific endpoints diff --git a/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx b/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx index 80be5c879625b72..c39a16b6ae1183b 100644 --- a/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx +++ b/src/content/docs/ai-gateway/tutorials/create-first-aig-workers.mdx @@ -34,7 +34,7 @@ Then, create a new AI Gateway. 1. Go to **AI** > **Workers AI** in the Cloudflare dashboard. 2. Select **Use REST API** and follow the steps to create and copy the API token and Account ID. -3. Send a request using the [Workers AI REST API](/ai-gateway/usage/rest-api/). Replace `$CLOUDFLARE_ACCOUNT_ID` and `$CLOUDFLARE_API_TOKEN` with your actual account ID and API token: +3. Send a request using the [REST API](/ai-gateway/usage/rest-api/). Replace `$CLOUDFLARE_ACCOUNT_ID` and `$CLOUDFLARE_API_TOKEN` with your actual account ID and API token: ```bash curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \ diff --git a/src/content/docs/ai-gateway/usage/chat-completion.mdx b/src/content/docs/ai-gateway/usage/chat-completion.mdx index b2ec822a02f0673..1280237dc0187bb 100644 --- a/src/content/docs/ai-gateway/usage/chat-completion.mdx +++ b/src/content/docs/ai-gateway/usage/chat-completion.mdx @@ -19,7 +19,7 @@ import { import CodeSnippets from "~/components/ai-gateway/code-examples.astro"; :::caution[Deprecated] -This endpoint is deprecated. Use the [Workers AI REST API](/ai-gateway/usage/rest-api/) instead, which provides OpenAI-compatible endpoints at `api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/v1/chat/completions`. 
The `/compat/chat/completions` endpoint will continue to work for existing integrations. +This endpoint is deprecated. Use the [REST API](/ai-gateway/usage/rest-api/) instead, which provides OpenAI-compatible endpoints at `api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/v1/chat/completions`. The `/compat/chat/completions` endpoint will continue to work for existing integrations. ::: Cloudflare's AI Gateway offers an OpenAI-compatible `/chat/completions` endpoint, enabling integration with multiple AI providers using a single URL. This feature simplifies the integration process, allowing for seamless switching between different models without significant code modifications. diff --git a/src/content/docs/ai-gateway/usage/providers/workersai.mdx b/src/content/docs/ai-gateway/usage/providers/workersai.mdx index f6f19f34982afef..7c012d45c7f0e0d 100644 --- a/src/content/docs/ai-gateway/usage/providers/workersai.mdx +++ b/src/content/docs/ai-gateway/usage/providers/workersai.mdx @@ -16,7 +16,7 @@ Use AI Gateway for analytics, caching, and security on requests to [Workers AI]( ## REST API -Use the [Workers AI REST API](/ai-gateway/usage/rest-api/) to call Workers AI models. Requests are automatically routed through your account's default AI Gateway. +Use the [REST API](/ai-gateway/usage/rest-api/) to call Workers AI models. Requests are automatically routed through your account's default AI Gateway. 
```bash title="Request to Workers AI Kimi model" curl -X POST "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/v1/chat/completions" \ diff --git a/src/content/docs/ai-gateway/usage/rest-api.mdx b/src/content/docs/ai-gateway/usage/rest-api.mdx index 550a7c726b0ac44..9449b1c295ce496 100644 --- a/src/content/docs/ai-gateway/usage/rest-api.mdx +++ b/src/content/docs/ai-gateway/usage/rest-api.mdx @@ -1,5 +1,5 @@ --- -title: Workers AI REST API +title: REST API pcx_content_type: how-to description: Call third-party and Workers AI models through the Cloudflare API with AI Gateway features like logging, caching, and rate limiting. sidebar: @@ -10,7 +10,7 @@ products: - ai-gateway --- -The Workers AI REST API lets you call any model — whether hosted on Cloudflare or by a third-party provider like OpenAI, Anthropic, or Google — through the same Cloudflare API, with all AI Gateway features — logging, caching, rate limiting, and more — applied automatically. +The REST API lets you call any model — whether hosted on Cloudflare or by a third-party provider like OpenAI, Anthropic, or Google — through the same Cloudflare API, with all AI Gateway features — logging, caching, rate limiting, and more — applied automatically. No provider SDKs or API keys are needed. Authentication and billing are handled through your Cloudflare account. Third-party models are billed via [Unified Billing](/ai-gateway/features/unified-billing/), while Workers AI models follow [Workers AI pricing](/workers-ai/platform/pricing/). @@ -236,4 +236,4 @@ For more details on these options, refer to [Request handling](/ai-gateway/confi - [Unified Billing](/ai-gateway/features/unified-billing/) — load credits and manage spend limits. - [Workers AI binding](/ai-gateway/integrations/worker-binding-methods/) — call models from within a Cloudflare Worker using `env.AI.run()`. -- [Model catalog](/ai/models/) — browse models supported by the Workers AI REST API. 
+- [Model catalog](/ai/models/) — browse models supported by the REST API. diff --git a/src/content/partials/ai-gateway/chat-completions-providers.mdx b/src/content/partials/ai-gateway/chat-completions-providers.mdx index 8078d22adbe0b8a..b3caee0e706cbf9 100644 --- a/src/content/partials/ai-gateway/chat-completions-providers.mdx +++ b/src/content/partials/ai-gateway/chat-completions-providers.mdx @@ -8,7 +8,7 @@ import { Code } from "~/components"; ## OpenAI-Compatible Endpoint -You can also access {props.name} models using the OpenAI API schema through the [Workers AI REST API](/ai-gateway/usage/rest-api/). Send your requests to: +You can also access {props.name} models using the OpenAI API schema through the [REST API](/ai-gateway/usage/rest-api/). Send your requests to: ```txt https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/chat/completions From e2a3e1b34aaf2a2a52aab76f4649b3611691eea7 Mon Sep 17 00:00:00 2001 From: Ming Lu Date: Thu, 14 May 2026 16:54:33 -0700 Subject: [PATCH 4/5] [AI Gateway] Fix PR review feedback: auth steps, model name, provider ordering --- .../docs/ai-gateway/configuration/authentication.mdx | 4 +++- src/content/docs/ai-gateway/usage/rest-api.mdx | 6 ++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/src/content/docs/ai-gateway/configuration/authentication.mdx b/src/content/docs/ai-gateway/configuration/authentication.mdx index cf2e8bd5044c056..95d50b2ed4dd679 100644 --- a/src/content/docs/ai-gateway/configuration/authentication.mdx +++ b/src/content/docs/ai-gateway/configuration/authentication.mdx @@ -17,7 +17,9 @@ When using the [REST API](/ai-gateway/usage/rest-api/), pass your Cloudflare API 1. Go to the Settings for the specific gateway you want to enable authentication for. 2. Select **Create authentication token** to generate a custom token with the required `Run` permissions. Be sure to securely save this token, as it will not be displayed again. -3. 
Include the `cf-aig-authorization` header with your API token in each request for this gateway. +3. Include the API token in each request: + - If using the REST API (`/ai/run`), include your Cloudflare API token in the standard `Authorization` header. + - If using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, use the `cf-aig-authorization` header. 4. Return to the settings page and toggle on Authenticated Gateway. ## Example requests diff --git a/src/content/docs/ai-gateway/usage/rest-api.mdx b/src/content/docs/ai-gateway/usage/rest-api.mdx index 9449b1c295ce496..a55c6f4cb4ea7d4 100644 --- a/src/content/docs/ai-gateway/usage/rest-api.mdx +++ b/src/content/docs/ai-gateway/usage/rest-api.mdx @@ -39,7 +39,7 @@ Models use the `author/model` format: - `openai/gpt-4.1` — OpenAI - `anthropic/claude-sonnet-4` — Anthropic - `google-ai-studio/gemini-2.5-flash` — Google AI Studio -- `meta/llama-3.3-70b-instruct-fp8-fast` — Workers AI (Llama) +- `moonshotai/kimi-k2.6` — Workers AI (Kimi) - `xai/grok-3` — xAI Browse available models in the [model catalog](/ai/models/). @@ -48,7 +48,7 @@ Browse available models in the [model catalog](/ai/models/). The optional `provider` field lets you specify which provider should serve the request. For third-party models in the catalog today, each model has a single provider, so you can omit this field. -Set `provider` to `"cloudflare"` to run a model on Workers AI: +When omitted, Cloudflare routes the request to the preferred provider for that model. Set `provider` to `"cloudflare"` to run a model on Workers AI: ```json { @@ -58,8 +58,6 @@ Set `provider` to `"cloudflare"` to run a model on Workers AI: } ``` -When omitted, Cloudflare routes the request to the preferred provider for that model. - ## `/ai/run` — universal endpoint Accepts any model with its per-model schema. Model-specific parameters go inside `input`. 
From 104dc3bfa9da4813c60171eb9cd6950f33a265d2 Mon Sep 17 00:00:00 2001 From: Ming Lu Date: Fri, 15 May 2026 10:02:52 -0700 Subject: [PATCH 5/5] [AI Gateway] Fix review feedback: auth migration note, Google model naming --- src/content/docs/ai-gateway/configuration/authentication.mdx | 4 ++++ src/content/docs/ai-gateway/usage/rest-api.mdx | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/src/content/docs/ai-gateway/configuration/authentication.mdx b/src/content/docs/ai-gateway/configuration/authentication.mdx index 95d50b2ed4dd679..780ade1a3e222cc 100644 --- a/src/content/docs/ai-gateway/configuration/authentication.mdx +++ b/src/content/docs/ai-gateway/configuration/authentication.mdx @@ -13,6 +13,10 @@ AI Gateway requires a valid Cloudflare API token for each request. This prevents When using the [REST API](/ai-gateway/usage/rest-api/), pass your Cloudflare API token in the standard `Authorization` header. When using [provider-native endpoints](/ai-gateway/usage/providers/) at `gateway.ai.cloudflare.com`, use the `cf-aig-authorization` header instead. +:::note +The `cf-aig-authorization` header is used with the `gateway.ai.cloudflare.com` endpoints, which continue to work. For new integrations, we recommend using the [REST API](/ai-gateway/usage/rest-api/) at `api.cloudflare.com`, which uses the standard `Authorization` header. +::: + ## Setting up Authenticated Gateway using the dashboard 1. Go to the Settings for the specific gateway you want to enable authentication for. 
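The header rule stated in the note above (standard `Authorization` for `api.cloudflare.com`, `cf-aig-authorization` for `gateway.ai.cloudflare.com`) can be sketched as a tiny dispatch helper. The function name is illustrative and not part of any SDK:

```python
from urllib.parse import urlparse

def auth_header_for(url: str, api_token: str) -> dict:
    """Pick the auth header AI Gateway expects for a given endpoint host."""
    host = urlparse(url).hostname or ""
    if host == "api.cloudflare.com":
        # REST API: standard Authorization bearer token.
        return {"Authorization": f"Bearer {api_token}"}
    if host == "gateway.ai.cloudflare.com":
        # Provider-native endpoints: gateway-specific header.
        return {"cf-aig-authorization": f"Bearer {api_token}"}
    raise ValueError(f"Unrecognized AI Gateway host: {host!r}")
```

Existing `gateway.ai.cloudflare.com` integrations keep working with the second branch; new integrations would only ever hit the first.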
diff --git a/src/content/docs/ai-gateway/usage/rest-api.mdx b/src/content/docs/ai-gateway/usage/rest-api.mdx index a55c6f4cb4ea7d4..c948966cfa84708 100644 --- a/src/content/docs/ai-gateway/usage/rest-api.mdx +++ b/src/content/docs/ai-gateway/usage/rest-api.mdx @@ -38,7 +38,7 @@ Models use the `author/model` format: - `openai/gpt-4.1` — OpenAI - `anthropic/claude-sonnet-4` — Anthropic -- `google-ai-studio/gemini-2.5-flash` — Google AI Studio +- `google/gemini-3-flash` — Google - `moonshotai/kimi-k2.6` — Workers AI (Kimi) - `xai/grok-3` — xAI
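The `author/model` identifier format shown in the list above can be sketched as a small parser, useful for validating model ids before sending a request. This is an assumption-level sketch (the helper is not part of the API):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split an `author/model` identifier into its two parts."""
    author, sep, name = model_id.partition("/")
    if not sep or not author or not name:
        raise ValueError(f"Expected 'author/model', got: {model_id!r}")
    return author, name

# Identifiers taken from the list in the hunk above.
assert split_model_id("google/gemini-3-flash") == ("google", "gemini-3-flash")
assert split_model_id("moonshotai/kimi-k2.6") == ("moonshotai", "kimi-k2.6")
```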