24 changes: 11 additions & 13 deletions features/agentic-compression.mdx
@@ -91,8 +91,8 @@ Enable compression for specific requests using the SDK or headers:
"messages": [
{"role": "user", "content": "Your prompt here"}
],
- "enable_compression": true,
- "compression_rate": 0.8 // Target ratio: compressed = 80% of original (optional)
+ "compression_model": "agentic",
+ "compression_configuration": { "rate": 0.8 } // Target ratio: compressed = 80% of original (optional)
}
});
```
@@ -106,8 +106,8 @@ Enable compression for specific requests using the SDK or headers:
"messages": [
{"role": "user", "content": "Your prompt here"}
],
- "enable_compression": True,
- "compression_rate": 0.8 # Target ratio: compressed = 80% of original (optional)
+ "compression_model": "agentic",
+ "compression_configuration": {"rate": 0.8} # Target ratio: compressed = 80% of original (optional)
}
)
```
@@ -119,17 +119,17 @@ Enable compression for specific requests using the SDK or headers:
Messages: []edgee.Message{
{Role: "user", Content: "Your prompt here"},
},
- EnableCompression: true,
- CompressionRate: 0.8, // Target ratio: compressed = 80% of original (optional)
+ CompressionModel: edgee.String("agentic"),
+ CompressionConfiguration: &edgee.CompressionConfiguration{Rate: edgee.Float64(0.8)}, // optional
})
```
</Tab>

<Tab title="Rust">
```rust
let input = InputObject::new(vec![Message::user("Your prompt here")])
- .with_compression(true)
- .with_compression_rate(0.8); // Target ratio: compressed = 80% of original (optional)
+ .with_compression_model("agentic")
+ .with_compression_configuration(CompressionConfiguration { rate: Some(0.8), semantic_preservation_threshold: None }); // optional

let response = client.send("gpt-5.2", input).await?;
```
@@ -140,9 +140,7 @@ Enable compression for specific requests using the SDK or headers:
curl -X POST "https://api.edgee.ai/v1/chat/completions" \
-H "Authorization: Bearer $EDGEE_API_KEY" \
-H "Content-Type: application/json" \
- -H "x-edgee-enable-compression: true" \
- -H "x-edgee-compression-rate: 0.8" \
- -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "Your prompt here"}]}'
+ -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "Your prompt here"}], "compression_model": "agentic", "compression_configuration": {"rate": 0.8}}'
```
</Tab>
</Tabs>
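The curl change above moves the compression settings out of the `x-edgee-*` headers and into the JSON body. A rough Python equivalent, using only the standard library, might look like the sketch below. The function name `build_compression_request` and the fallback key are hypothetical, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example in this diff.
EDGEE_URL = "https://api.edgee.ai/v1/chat/completions"


def build_compression_request(prompt, api_key, rate=0.8):
    """Build (but do not send) a chat completion request with compression fields in the body.

    Hypothetical helper for illustration; it is not part of any Edgee SDK.
    """
    body = {
        "model": "gpt-5.2",
        "messages": [{"role": "user", "content": prompt}],
        # Compression settings now travel in the body, not in x-edgee-* headers.
        "compression_model": "agentic",
        "compression_configuration": {"rate": rate},
    }
    return urllib.request.Request(
        EDGEE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "test-key" is a placeholder fallback so the sketch runs without credentials.
request = build_compression_request(
    "Your prompt here", os.environ.get("EDGEE_API_KEY", "test-key")
)
payload = json.loads(request.data)
print(payload["compression_model"], payload["compression_configuration"])
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out so the sketch has no network dependency.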
@@ -212,8 +210,8 @@ const documents = [
const response = await edgee.send({
model: 'gpt-5.2',
input: `Answer the question based on these documents:\n\n${documents.join('\n\n')}\n\nQuestion: What is the main topic?`,
- enable_compression: true, // Enable compression for this request
- compression_rate: 0.8, // Target ratio (0-1): 0.8 = compressed is 80% of original
+ compression_model: "agentic",
+ compression_configuration: { rate: 0.8 }, // Target ratio (0-1): 0.8 = compressed is 80% of original
});

console.log(response.text);
1 change: 1 addition & 0 deletions features/claude-compression.mdx
@@ -45,6 +45,7 @@ In the **Edge Models** section of your console, find the **Claude Token Compress
There are no compression ratio or BERT threshold settings. Claude Token Compression is fully lossless by design. Just enable it and go.
</Note>


## Integration with Claude Code


2 changes: 1 addition & 1 deletion features/observability.mdx
@@ -213,7 +213,7 @@ See exactly how much token compression is saving you on every request:
const response = await edgee.send({
model: 'gpt-5.2',
input: 'Long prompt with lots of context...',
- enable_compression: true,
+ compression_model: "agentic",
});

// Compression details
10 changes: 6 additions & 4 deletions sdk/go/send.mdx
@@ -43,8 +43,8 @@ When `input` is an `InputObject`, you have full control over the conversation:
| `Tools` | `[]Tool` | Array of function tools available to the model |
| `ToolChoice` | `any` | Controls which tool (if any) the model should call. Can be `string` (`"auto"`, `"none"`) or `map[string]interface{}`. See [Tools documentation](/sdk/go/tools) for details |
| `Tags` | `[]string` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `EnableCompression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. |
- | `CompressionRate` | `float64` | The compression rate to use for this request. If `EnableCompression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. |
+ | `CompressionModel` | `*string` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `CompressionConfiguration` | `*CompressionConfiguration` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `Rate` (0.0-1.0, default 0.8) and `SemanticPreservationThreshold` (0-100). |

**Example with InputObject:**

@@ -235,8 +235,10 @@ response, err := client.Send("gpt-5.2", edgee.InputObject{
Messages: []edgee.Message{
{Role: "user", Content: "Analyze this long document with lots of context..."},
},
- EnableCompression: true,
- CompressionRate: 0.8,
+ CompressionModel: edgee.String("agentic"),
+ CompressionConfiguration: &edgee.CompressionConfiguration{
+ Rate: edgee.Float64(0.8),
+ },
})
if err != nil {
log.Fatal(err)
4 changes: 2 additions & 2 deletions sdk/go/stream.mdx
@@ -57,8 +57,8 @@ When `input` is an `InputObject` or `map[string]interface{}`, you have full cont
| `Tools` | `[]Tool` | Array of function tools available to the model |
| `ToolChoice` | `any` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/go/tools) for details |
| `Tags` | `[]string` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `EnableCompression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. |
- | `CompressionRate` | `float64` | The compression rate to use for this request. If `EnableCompression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. |
+ | `CompressionModel` | `*string` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `CompressionConfiguration` | `*CompressionConfiguration` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `Rate` (0.0-1.0, default 0.8) and `SemanticPreservationThreshold` (0-100). |

For details about `Message` type, see the [Send Method documentation](/sdk/go/send#message-object).
For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/go/tools).
8 changes: 4 additions & 4 deletions sdk/python/send.mdx
@@ -42,8 +42,8 @@ When `input` is an `InputObject` or dictionary, you have full control over the c
| `tools` | `list[dict] \| None` | Array of function tools available to the model |
| `tool_choice` | `str \| dict \| None` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/python/tools) for details |
| `tags` | `list[str] \| None` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `enable_compression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. |
- | `compression_rate` | `float` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. |
+ | `compression_model` | `str` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `compression_configuration` | `dict` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `rate` (0.0-1.0, default 0.8) and `semantic_preservation_threshold` (0-100). |

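The constraints listed for `compression_model` and `compression_configuration` can be checked on the client before a request is built. The sketch below is one possible way to do that. `build_compression_fields` and `COMPRESSION_MODELS` are hypothetical names, not part of the SDK, and rejecting a configuration for non-`agentic` models is an assumption drawn from the "only available for `agentic`" note; the API may simply ignore it.

```python
from typing import Optional

# Model names as listed in the parameter table.
COMPRESSION_MODELS = {"agentic", "claude", "opencode", "cursor", "customer"}


def build_compression_fields(
    model: str,
    rate: Optional[float] = None,
    semantic_preservation_threshold: Optional[int] = None,
) -> dict:
    """Return the compression keys for an input dictionary, validated locally.

    Hypothetical helper for illustration only.
    """
    if model not in COMPRESSION_MODELS:
        raise ValueError(f"unknown compression model: {model!r}")

    config = {}
    if rate is not None:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0.0 and 1.0")
        config["rate"] = rate
    if semantic_preservation_threshold is not None:
        if not 0 <= semantic_preservation_threshold <= 100:
            raise ValueError("semantic_preservation_threshold must be between 0 and 100")
        config["semantic_preservation_threshold"] = semantic_preservation_threshold

    fields = {"compression_model": model}
    if config:
        # Assumption: treat configuration on a non-agentic model as a caller error.
        if model != "agentic":
            raise ValueError("compression_configuration is only available for 'agentic'")
        fields["compression_configuration"] = config
    return fields


input_object = {
    "messages": [{"role": "user", "content": "Your prompt here"}],
    **build_compression_fields("agentic", rate=0.8),
}
print(input_object)
```

The returned dictionary can be merged into the input dictionary passed to `edgee.send`, as the last lines show.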
**Example with Dictionary Input:**

@@ -206,8 +206,8 @@ response = edgee.send(
"messages": [
{"role": "user", "content": "Analyze this long document with lots of context..."}
],
- "enable_compression": True,
- "compression_rate": 0.8
+ "compression_model": "agentic",
+ "compression_configuration": {"rate": 0.8}
}
)

4 changes: 2 additions & 2 deletions sdk/python/stream.mdx
@@ -40,8 +40,8 @@ When `input` is an `InputObject` or dictionary, you have full control over the c
| `tools` | `list[dict] \| None` | Array of function tools available to the model |
| `tool_choice` | `str \| dict \| None` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/python/tools) for details |
| `tags` | `list[str] \| None` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `enable_compression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. |
- | `compression_rate` | `float` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. |
+ | `compression_model` | `str` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `compression_configuration` | `dict` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `rate` (0.0-1.0, default 0.8) and `semantic_preservation_threshold` (0-100). |

For details about `Message` type, see the [Send Method documentation](/sdk/python/send#message-object).
For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/python/tools).
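Input dictionaries written against the earlier `enable_compression` and `compression_rate` fields need to be rewritten to the new keys. A minimal sketch of that rewrite follows, mirroring the substitution made in the examples of this change. `migrate_compression_fields` is a hypothetical helper, and mapping `enable_compression: True` to the `"agentic"` model is an assumption based on those examples. No replacement for `enable_compression: False` appears in this change, so the helper simply drops the legacy keys in that case, which may not disable compression if the API key settings enable it.

```python
LEGACY_KEYS = ("enable_compression", "compression_rate")


def migrate_compression_fields(legacy_input: dict) -> dict:
    """Return a copy of an input dictionary using the new compression keys.

    Hypothetical helper for illustration only.
    """
    # Copy everything except the two legacy keys.
    migrated = {k: v for k, v in legacy_input.items() if k not in LEGACY_KEYS}

    if legacy_input.get("enable_compression"):
        # Assumption: the old boolean switch corresponds to the "agentic" model.
        migrated["compression_model"] = "agentic"
        if "compression_rate" in legacy_input:
            migrated["compression_configuration"] = {
                "rate": legacy_input["compression_rate"]
            }
    return migrated


old_input = {
    "messages": [{"role": "user", "content": "Your prompt here"}],
    "enable_compression": True,
    "compression_rate": 0.8,
}
new_input = migrate_compression_fields(old_input)
print(new_input)
```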
8 changes: 4 additions & 4 deletions sdk/rust/send.mdx
@@ -57,8 +57,8 @@ When `input` is an `InputObject`, you have full control over the conversation:
| `tools` | `Option<Vec<Tool>>` | Array of function tools available to the model |
| `tool_choice` | `Option<serde_json::Value>` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/rust/tools) for details |
| `tags` | `Option<Vec<String>>` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `enable_compression` | `Option<bool>` | Enable token compression for this request (overrides console settings). If not set, uses the configuration from your API key or organization settings |
- | `compression_rate` | `Option<f64>` | Target compression rate (0.0-1.0, default 0.75). Only used if compression is enabled. Higher values attempt more aggressive compression |
+ | `compression_model` | `Option<String>` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `compression_configuration` | `Option<CompressionConfiguration>` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `rate` (0.0-1.0, default 0.8) and `semantic_preservation_threshold` (0-100). |

**Example with InputObject:**

@@ -217,8 +217,8 @@ Token compression metrics (when compression is applied):
let input = InputObject::new(vec![
Message::user("Analyze this long document with lots of context...")
])
- .with_compression(true)
- .with_compression_rate(0.8); // Target 80% compression
+ .with_compression_model("agentic")
+ .with_compression_configuration(CompressionConfiguration { rate: Some(0.8), semantic_preservation_threshold: None });

let response = client.send("gpt-5.2", input).await?;
println!("{}", response.text().unwrap_or(""));
4 changes: 2 additions & 2 deletions sdk/rust/stream.mdx
@@ -55,8 +55,8 @@ When `input` is a `Vec<Message>` or `InputObject`, you have full control over th
| `tools` | `Option<Vec<Tool>>` | Array of function tools available to the model |
| `tool_choice` | `Option<serde_json::Value>` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/rust/tools) for details |
| `tags` | `Option<Vec<String>>` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `enable_compression` | `Option<bool>` | Enable token compression for this request (overrides console settings). If not set, uses the configuration from your API key or organization settings |
- | `compression_rate` | `Option<f64>` | Target compression rate (0.0-1.0, default 0.75). Only used if compression is enabled. Higher values attempt more aggressive compression |
+ | `compression_model` | `Option<String>` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `compression_configuration` | `Option<CompressionConfiguration>` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `rate` (0.0-1.0, default 0.8) and `semantic_preservation_threshold` (0-100). |

For details about `Message` type, see the [Send Method documentation](/sdk/rust/send#message-object).
For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/rust/tools).
8 changes: 4 additions & 4 deletions sdk/typescript/send.mdx
@@ -45,8 +45,8 @@ When `input` is an `InputObject`, you have full control over the conversation:
| `tools` | `Tool[]` | Array of function tools available to the model |
| `tool_choice` | `ToolChoice` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/typescript/tools) for details |
| `tags` | `string[]` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `enable_compression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. |
- | `compression_rate` | `number` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. |
+ | `compression_model` | `string` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `compression_configuration` | `object` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `rate` (0.0-1.0, default 0.8) and `semantic_preservation_threshold` (0-100). |

**Example with InputObject:**

@@ -211,8 +211,8 @@ const response = await edgee.send({
messages: [
{ role: 'user', content: 'Analyze this long document with lots of context...' }
],
- enable_compression: true,
- compression_rate: 0.8
+ compression_model: "agentic",
+ compression_configuration: { rate: 0.8 }
}
});

4 changes: 2 additions & 2 deletions sdk/typescript/stream.mdx
@@ -45,8 +45,8 @@ When `input` is an `InputObject`, you have full control over the conversation:
| `tools` | `Tool[]` | Array of function tools available to the model |
| `tool_choice` | `ToolChoice` | Controls which tool (if any) the model should call. See [Tools documentation](/sdk/typescript/tools) for details |
| `tags` | `string[]` | Optional tags to categorize and label the request for analytics and filtering. Can also be sent via the `x-edgee-tags` header (comma-separated) |
- | `enable_compression` | `bool` | Enable token compression for this request. If `true`, the request will be compressed to the compression rate specified in the API key settings. If `false`, the request will not be compressed. |
- | `compression_rate` | `number` | The compression rate to use for this request. If `enable_compression` is `true`, this value will be used to compress the request. The value should be between 0.0 and 1.0. The default value is 0.75. |
+ | `compression_model` | `string` | Compression model for this request: `"agentic"`, `"claude"`, `"opencode"`, `"cursor"`, or `"customer"`. Each model is a bundle of compression strategies. Overrides API key settings when present. |
+ | `compression_configuration` | `object` | Configuration for the compression model. Currently only available for `agentic`. Contains optional `rate` (0.0-1.0, default 0.8) and `semantic_preservation_threshold` (0-100). |

For details about `Message` type, see the [Send Method documentation](/sdk/typescript/send#message-object).
For details about `Tool` and `ToolChoice` types, see the [Tools documentation](/sdk/typescript/tools).