Merged
@@ -306,6 +306,9 @@ partial void ProcessOpenaiChatCompletionsResponseContent(
/// <param name="reasoningEffort">
/// Constrains effort on reasoning for reasoning models. Currently supported values are none, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Setting it to none disables reasoning entirely, if the model supports it.
/// </param>
/// <param name="promptCacheKey">
/// A key that identifies the prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests that supply the same key.
/// </param>
/// <param name="cancellationToken">The token to cancel the operation with</param>
/// <exception cref="global::System.InvalidOperationException"></exception>
public async global::System.Threading.Tasks.Task<string> OpenaiChatCompletionsAsync(
@@ -333,6 +336,7 @@ partial void ProcessOpenaiChatCompletionsResponseContent(
bool? logprobs = default,
global::DeepInfra.StreamOptions? streamOptions = default,
global::DeepInfra.OpenAIChatCompletionsInReasoningEffort? reasoningEffort = default,
string? promptCacheKey = default,
global::System.Threading.CancellationToken cancellationToken = default)
{
var __request = new global::DeepInfra.OpenAIChatCompletionsIn
@@ -358,6 +362,7 @@ partial void ProcessOpenaiChatCompletionsResponseContent(
Logprobs = logprobs,
StreamOptions = streamOptions,
ReasoningEffort = reasoningEffort,
PromptCacheKey = promptCacheKey,
};

return await OpenaiChatCompletionsAsync(
@@ -306,6 +306,9 @@ partial void ProcessOpenaiChatCompletions2ResponseContent(
/// <param name="reasoningEffort">
/// Constrains effort on reasoning for reasoning models. Currently supported values are none, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Setting it to none disables reasoning entirely, if the model supports it.
/// </param>
/// <param name="promptCacheKey">
/// A key that identifies the prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests that supply the same key.
/// </param>
/// <param name="cancellationToken">The token to cancel the operation with</param>
/// <exception cref="global::System.InvalidOperationException"></exception>
public async global::System.Threading.Tasks.Task<string> OpenaiChatCompletions2Async(
@@ -333,6 +336,7 @@ partial void ProcessOpenaiChatCompletions2ResponseContent(
bool? logprobs = default,
global::DeepInfra.StreamOptions? streamOptions = default,
global::DeepInfra.OpenAIChatCompletionsInReasoningEffort? reasoningEffort = default,
string? promptCacheKey = default,
global::System.Threading.CancellationToken cancellationToken = default)
{
var __request = new global::DeepInfra.OpenAIChatCompletionsIn
@@ -358,6 +362,7 @@ partial void ProcessOpenaiChatCompletions2ResponseContent(
Logprobs = logprobs,
StreamOptions = streamOptions,
ReasoningEffort = reasoningEffort,
PromptCacheKey = promptCacheKey,
};

return await OpenaiChatCompletions2Async(
@@ -96,6 +96,9 @@ public partial interface IDeepInfraClient
/// <param name="reasoningEffort">
/// Constrains effort on reasoning for reasoning models. Currently supported values are none, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Setting it to none disables reasoning entirely, if the model supports it.
/// </param>
/// <param name="promptCacheKey">
/// A key that identifies the prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests that supply the same key.
/// </param>
/// <param name="cancellationToken">The token to cancel the operation with</param>
/// <exception cref="global::System.InvalidOperationException"></exception>
global::System.Threading.Tasks.Task<string> OpenaiChatCompletionsAsync(
@@ -123,6 +126,7 @@ public partial interface IDeepInfraClient
bool? logprobs = default,
global::DeepInfra.StreamOptions? streamOptions = default,
global::DeepInfra.OpenAIChatCompletionsInReasoningEffort? reasoningEffort = default,
string? promptCacheKey = default,
global::System.Threading.CancellationToken cancellationToken = default);
}
}
@@ -96,6 +96,9 @@ public partial interface IDeepInfraClient
/// <param name="reasoningEffort">
/// Constrains effort on reasoning for reasoning models. Currently supported values are none, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Setting it to none disables reasoning entirely, if the model supports it.
/// </param>
/// <param name="promptCacheKey">
/// A key that identifies the prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests that supply the same key.
/// </param>
/// <param name="cancellationToken">The token to cancel the operation with</param>
/// <exception cref="global::System.InvalidOperationException"></exception>
global::System.Threading.Tasks.Task<string> OpenaiChatCompletions2Async(
@@ -123,6 +126,7 @@ public partial interface IDeepInfraClient
bool? logprobs = default,
global::DeepInfra.StreamOptions? streamOptions = default,
global::DeepInfra.OpenAIChatCompletionsInReasoningEffort? reasoningEffort = default,
string? promptCacheKey = default,
global::System.Threading.CancellationToken cancellationToken = default);
}
}
@@ -153,6 +153,12 @@ public sealed partial class OpenAIChatCompletionsIn
[global::System.Text.Json.Serialization.JsonConverter(typeof(global::DeepInfra.JsonConverters.OpenAIChatCompletionsInReasoningEffortJsonConverter))]
public global::DeepInfra.OpenAIChatCompletionsInReasoningEffort? ReasoningEffort { get; set; }

/// <summary>
/// A key that identifies the prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests that supply the same key.
/// </summary>
[global::System.Text.Json.Serialization.JsonPropertyName("prompt_cache_key")]
public string? PromptCacheKey { get; set; }

/// <summary>
/// Additional properties that are not explicitly defined in the schema
/// </summary>
@@ -232,6 +238,9 @@ public sealed partial class OpenAIChatCompletionsIn
/// <param name="reasoningEffort">
/// Constrains effort on reasoning for reasoning models. Currently supported values are none, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Setting it to none disables reasoning entirely, if the model supports it.
/// </param>
/// <param name="promptCacheKey">
/// A key that identifies the prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests that supply the same key.
/// </param>
#if NET7_0_OR_GREATER
[global::System.Diagnostics.CodeAnalysis.SetsRequiredMembers]
#endif
@@ -256,7 +265,8 @@ public OpenAIChatCompletionsIn(
int? seed,
bool? logprobs,
global::DeepInfra.StreamOptions? streamOptions,
global::DeepInfra.OpenAIChatCompletionsInReasoningEffort? reasoningEffort)
global::DeepInfra.OpenAIChatCompletionsInReasoningEffort? reasoningEffort,
string? promptCacheKey)
{
this.Model = model ?? throw new global::System.ArgumentNullException(nameof(model));
this.Messages = messages ?? throw new global::System.ArgumentNullException(nameof(messages));
@@ -279,6 +289,7 @@ public OpenAIChatCompletionsIn(
this.Logprobs = logprobs;
this.StreamOptions = streamOptions;
this.ReasoningEffort = reasoningEffort;
this.PromptCacheKey = promptCacheKey;
}

/// <summary>
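From the caller's side, the new parameter slots in alongside the existing optional ones. A minimal sketch, not taken from this PR: the model id, the `messages` value, and the cache-key string below are all hypothetical, and the `model`/`messages` parameter names are assumed from the flattened request shape.

```csharp
// Sketch only: assumes an IDeepInfraClient `client` and a prepared
// `messages` collection built elsewhere with the generated SDK types.
var reply = await client.OpenaiChatCompletionsAsync(
    model: "meta-llama/Meta-Llama-3-8B-Instruct", // hypothetical model id
    messages: messages,
    promptCacheKey: "tenant-42:system-prompt.v3", // hypothetical cache key
    cancellationToken: cancellationToken);
```

Reusing the same `promptCacheKey` string across requests that share a long common prefix is what lets the server reuse the cached prompt.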
src/libs/DeepInfra/openapi.yaml (5 additions & 0 deletions)
@@ -7287,6 +7287,11 @@ components:
type: string
description: 'Constrains effort on reasoning for reasoning models. Currently supported values are none, low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. Setting it to none disables reasoning entirely, if the model supports it.'
nullable: true
prompt_cache_key:
title: Prompt Cache Key
type: string
description: 'A key that identifies the prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests that supply the same key.'
nullable: true
Comment on lines +7290 to +7294
🛠️ Refactor suggestion

Add length/charset constraints to prevent abuse and collisions

Constrain the key to a sane length and a restricted character set; this avoids accidental PII, log injection, and oversized headers/bodies.

         prompt_cache_key:
           title: Prompt Cache Key
           type: string
-          description: 'A key to identify prompt cache for reuse across requests. If provided, the prompt will be cached and can be reused in subsequent requests with the same key.'
+          description: 'A key to identify the prompt cache for reuse across requests. Scoped to the authenticated account/team. Case-sensitive.'
+          minLength: 1
+          maxLength: 256
+          pattern: '^[A-Za-z0-9._:-]+$'
           nullable: true
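If the constraints suggested above were adopted, clients could mirror them before sending a request. A self-contained sketch of that check; the limits (1–256) and the character class come from the review suggestion, not from the merged spec:

```csharp
using System;
using System.Text.RegularExpressions;

static class PromptCacheKeyValidator
{
    // Mirrors the suggested OpenAPI constraints: minLength 1, maxLength 256,
    // pattern ^[A-Za-z0-9._:-]+$ (no spaces or control characters).
    private static readonly Regex Allowed = new Regex("^[A-Za-z0-9._:-]+$");

    public static bool IsValid(string? key) =>
        key is not null && key.Length >= 1 && key.Length <= 256 && Allowed.IsMatch(key);

    public static void Main()
    {
        Console.WriteLine(IsValid("tenant-42:system-prompt.v3")); // True
        Console.WriteLine(IsValid("has spaces"));                 // False
        Console.WriteLine(IsValid(""));                           // False
    }
}
```

Putting `-` last in the character class keeps it literal; the anchors reject any key with leading or trailing junk rather than matching a substring.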

OpenAICompletionsIn:
title: OpenAICompletionsIn
required: