Request Headers
Authorization token (required).
Request Body
Parameters for creating a streaming chat completion.
Properties
A list of messages exchanged in a chat conversation.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
Options for selecting the upstream provider of this completion.
Properties
Specifies whether to allow providers which collect data.
Variants
Whether to enforce Zero Data Retention (ZDR) policies when selecting providers.
Specifies the sorting strategy for provider selection.
Variants
Properties
Maximum price for prompt tokens.
Maximum price for completion tokens.
Maximum price for image generation.
Maximum price for audio generation.
Maximum price per request.
Preferred minimum throughput for the provider.
Preferred maximum latency for the provider.
Minimum throughput for the provider.
Maximum latency for the provider.
The Ensemble LLM to use for this completion. May be a unique ID or an inline definition.
Variants
An LLM to be used within an Ensemble or standalone with Chat Completions.
Properties
The full ID of the LLM to use.
For Vector Completions only, specifies the LLM's voting output mode. For "instruction", the assistant is instructed to output a key. For "json_schema", the assistant is constrained to output a valid key using a JSON schema. For "tool_call", the assistant is instructed to output a tool call to select the key.
Variants
For Vector Completions only, whether to use synthetic reasoning prior to voting. Works for any LLM, even those that do not have native reasoning capabilities.
For Vector Completions only, whether to use logprobs to make the vote probabilistic. This means that the LLM can vote for multiple keys based on their logprobabilities. Allows LLMs to express native uncertainty when voting.
A list of messages exchanged in a chat conversation. These will be prepended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
A list of messages exchanged in a chat conversation. These will be appended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
This setting aims to control the repetition of tokens based on how often they appear in the input. It tries to use less frequently those tokens that appear more in the input, proportional to how frequently they occur. Token penalty scales with the number of occurrences. Negative values will encourage token reuse.
Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
Values
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
This setting aims to control the presence of tokens in the output. It tries to encourage the model to use tokens that are less present in the input, proportional to their presence in the input. Token presence scales with the number of occurrences. Negative values will encourage more diverse token usage.
The assistant will stop when any of the provided strings are generated.
Variants
Generation will stop when this string is generated.
Generation will stop when any of these strings are generated.
Items
This setting influences the variety in the model’s responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
This setting limits the model’s choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model’s responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.
This sets the upper limit for the number of tokens the model can generate in response. It won’t produce more than this limit. The maximum value is the context length minus the prompt length.
Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. (The value changes depending on the confidence level of the most probable token.) If your Min-P is set to 0.1, that means it will only allow for tokens that are at least 1/10th as probable as the best possible option.
Options for selecting the upstream provider of this model.
Properties
Whether to allow fallback providers if the preferred provider is unavailable.
Whether to require that the provider supports all specified parameters.
An ordered list of provider names to use when selecting a provider for this model.
Items
A list of provider names to restrict selection to when selecting a provider for this model.
Items
A list of provider names to ignore when selecting a provider for this model.
Items
Specifies the quantizations to allow when selecting providers for this model.
Items
Options for controlling reasoning behavior of the model.
Properties
Enables or disables reasoning for supported models.
The maximum number of tokens to use for reasoning in a response.
Constrains effort on reasoning for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Variants
Controls the verbosity of the reasoning summary for supported reasoning models.
Variants
Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). Token penalty scales based on original token’s probability.
Consider only the top tokens with “sufficiently high” probabilities based on the probability of the most likely token. Think of it like a dynamic Top-P. A lower Top-A value focuses the choices based on the highest probability token but with a narrower scope. A higher Top-A value does not necessarily affect the creativity of the output, but rather refines the filtering process based on the maximum probability.
This limits the model’s choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, making the model to consider all choices.
Controls the verbosity and length of the model response. Lower values produce more concise responses, while higher values produce more detailed and comprehensive responses.
Variants
Fallback Ensemble LLMs to use if the primary Ensemble LLM fails.
Items
An LLM to be used within an Ensemble or standalone with Chat Completions.
Properties
The full ID of the LLM to use.
For Vector Completions only, specifies the LLM's voting output mode. For "instruction", the assistant is instructed to output a key. For "json_schema", the assistant is constrained to output a valid key using a JSON schema. For "tool_call", the assistant is instructed to output a tool call to select the key.
Variants
For Vector Completions only, whether to use synthetic reasoning prior to voting. Works for any LLM, even those that do not have native reasoning capabilities.
For Vector Completions only, whether to use logprobs to make the vote probabilistic. This means that the LLM can vote for multiple keys based on their logprobabilities. Allows LLMs to express native uncertainty when voting.
A list of messages exchanged in a chat conversation. These will be prepended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
A list of messages exchanged in a chat conversation. These will be appended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
This setting aims to control the repetition of tokens based on how often they appear in the input. It tries to use less frequently those tokens that appear more in the input, proportional to how frequently they occur. Token penalty scales with the number of occurrences. Negative values will encourage token reuse.
Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
Values
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
This setting aims to control the presence of tokens in the output. It tries to encourage the model to use tokens that are less present in the input, proportional to their presence in the input. Token presence scales with the number of occurrences. Negative values will encourage more diverse token usage.
The assistant will stop when any of the provided strings are generated.
Variants
Generation will stop when this string is generated.
Generation will stop when any of these strings are generated.
Items
This setting influences the variety in the model’s responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
This setting limits the model’s choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model’s responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.
This sets the upper limit for the number of tokens the model can generate in response. It won’t produce more than this limit. The maximum value is the context length minus the prompt length.
Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. (The value changes depending on the confidence level of the most probable token.) If your Min-P is set to 0.1, that means it will only allow for tokens that are at least 1/10th as probable as the best possible option.
Options for selecting the upstream provider of this model.
Properties
Whether to allow fallback providers if the preferred provider is unavailable.
Whether to require that the provider supports all specified parameters.
An ordered list of provider names to use when selecting a provider for this model.
Items
A list of provider names to restrict selection to when selecting a provider for this model.
Items
A list of provider names to ignore when selecting a provider for this model.
Items
Specifies the quantizations to allow when selecting providers for this model.
Items
Options for controlling reasoning behavior of the model.
Properties
Enables or disables reasoning for supported models.
The maximum number of tokens to use for reasoning in a response.
Constrains effort on reasoning for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Variants
Controls the verbosity of the reasoning summary for supported reasoning models.
Variants
Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). Token penalty scales based on original token’s probability.
Consider only the top tokens with “sufficiently high” probabilities based on the probability of the most likely token. Think of it like a dynamic Top-P. A lower Top-A value focuses the choices based on the highest probability token but with a narrower scope. A higher Top-A value does not necessarily affect the creativity of the output, but rather refines the filtering process based on the maximum probability.
This limits the model’s choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, making the model to consider all choices.
Controls the verbosity and length of the model response. Lower values produce more concise responses, while higher values produce more detailed and comprehensive responses.
Variants
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability.
The desired format of the model's response.
Variants
The response will be arbitrary text.
Properties
The response will be a JSON object.
Properties
The response will conform to the provided JSON schema.
Properties
A JSON schema definition for constraining model output.
Properties
The name of the JSON schema.
The description of the JSON schema.
The JSON schema definition.
Whether to enforce strict adherence to the JSON schema.
The response will conform to the provided grammar definition.
Properties
The grammar definition to constrain the response.
The response will be Python code.
Properties
If specified, upstream systems will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Specifies tool call behavior for the assistant.
Variants
Specify a function for the assistant to call.
Properties
Properties
The name of the function the assistant will call.
A list of tools that the assistant can call.
Items
A function tool that the assistant can call.
Properties
The definition of a function tool.
Properties
The name of the function.
The description of the function.
The JSON schema defining the parameters of the function.
Values
Items
A JSON value.
Values
A JSON value.
Whether to enforce strict adherence to the parameter schema.
Whether to allow the model to make multiple tool calls in parallel.
Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content.
Properties
Variants
Items
A part of the predicted content.
Properties
The maximum total time in milliseconds to spend on retries when a transient error occurs.
The maximum time in milliseconds to wait for the first chunk of a streaming response.
The maximum time in milliseconds to wait between subsequent chunks of a streaming response.
Whether to stream the response as a series of chunks.
Parameters for creating a unary chat completion.
Properties
A list of messages exchanged in a chat conversation.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
Options for selecting the upstream provider of this completion.
Properties
Specifies whether to allow providers which collect data.
Variants
Whether to enforce Zero Data Retention (ZDR) policies when selecting providers.
Specifies the sorting strategy for provider selection.
Variants
Properties
Maximum price for prompt tokens.
Maximum price for completion tokens.
Maximum price for image generation.
Maximum price for audio generation.
Maximum price per request.
Preferred minimum throughput for the provider.
Preferred maximum latency for the provider.
Minimum throughput for the provider.
Maximum latency for the provider.
The Ensemble LLM to use for this completion. May be a unique ID or an inline definition.
Variants
An LLM to be used within an Ensemble or standalone with Chat Completions.
Properties
The full ID of the LLM to use.
For Vector Completions only, specifies the LLM's voting output mode. For "instruction", the assistant is instructed to output a key. For "json_schema", the assistant is constrained to output a valid key using a JSON schema. For "tool_call", the assistant is instructed to output a tool call to select the key.
Variants
For Vector Completions only, whether to use synthetic reasoning prior to voting. Works for any LLM, even those that do not have native reasoning capabilities.
For Vector Completions only, whether to use logprobs to make the vote probabilistic. This means that the LLM can vote for multiple keys based on their logprobabilities. Allows LLMs to express native uncertainty when voting.
A list of messages exchanged in a chat conversation. These will be prepended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
A list of messages exchanged in a chat conversation. These will be appended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
This setting aims to control the repetition of tokens based on how often they appear in the input. It tries to use less frequently those tokens that appear more in the input, proportional to how frequently they occur. Token penalty scales with the number of occurrences. Negative values will encourage token reuse.
Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
Values
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
This setting aims to control the presence of tokens in the output. It tries to encourage the model to use tokens that are less present in the input, proportional to their presence in the input. Token presence scales with the number of occurrences. Negative values will encourage more diverse token usage.
The assistant will stop when any of the provided strings are generated.
Variants
Generation will stop when this string is generated.
Generation will stop when any of these strings are generated.
Items
This setting influences the variety in the model’s responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
This setting limits the model’s choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model’s responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.
This sets the upper limit for the number of tokens the model can generate in response. It won’t produce more than this limit. The maximum value is the context length minus the prompt length.
Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. (The value changes depending on the confidence level of the most probable token.) If your Min-P is set to 0.1, that means it will only allow for tokens that are at least 1/10th as probable as the best possible option.
Options for selecting the upstream provider of this model.
Properties
Whether to allow fallback providers if the preferred provider is unavailable.
Whether to require that the provider supports all specified parameters.
An ordered list of provider names to use when selecting a provider for this model.
Items
A list of provider names to restrict selection to when selecting a provider for this model.
Items
A list of provider names to ignore when selecting a provider for this model.
Items
Specifies the quantizations to allow when selecting providers for this model.
Items
Options for controlling reasoning behavior of the model.
Properties
Enables or disables reasoning for supported models.
The maximum number of tokens to use for reasoning in a response.
Constrains effort on reasoning for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Variants
Controls the verbosity of the reasoning summary for supported reasoning models.
Variants
Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). Token penalty scales based on original token’s probability.
Consider only the top tokens with “sufficiently high” probabilities based on the probability of the most likely token. Think of it like a dynamic Top-P. A lower Top-A value focuses the choices based on the highest probability token but with a narrower scope. A higher Top-A value does not necessarily affect the creativity of the output, but rather refines the filtering process based on the maximum probability.
This limits the model’s choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, making the model to consider all choices.
Controls the verbosity and length of the model response. Lower values produce more concise responses, while higher values produce more detailed and comprehensive responses.
Variants
Fallback Ensemble LLMs to use if the primary Ensemble LLM fails.
Items
An LLM to be used within an Ensemble or standalone with Chat Completions.
Properties
The full ID of the LLM to use.
For Vector Completions only, specifies the LLM's voting output mode. For "instruction", the assistant is instructed to output a key. For "json_schema", the assistant is constrained to output a valid key using a JSON schema. For "tool_call", the assistant is instructed to output a tool call to select the key.
Variants
For Vector Completions only, whether to use synthetic reasoning prior to voting. Works for any LLM, even those that do not have native reasoning capabilities.
For Vector Completions only, whether to use logprobs to make the vote probabilistic. This means that the LLM can vote for multiple keys based on their logprobabilities. Allows LLMs to express native uncertainty when voting.
A list of messages exchanged in a chat conversation. These will be prepended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
A list of messages exchanged in a chat conversation. These will be appended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
This setting aims to control the repetition of tokens based on how often they appear in the input. It tries to use less frequently those tokens that appear more in the input, proportional to how frequently they occur. Token penalty scales with the number of occurrences. Negative values will encourage token reuse.
Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
Values
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
This setting aims to control the presence of tokens in the output. It tries to encourage the model to use tokens that are less present in the input, proportional to their presence in the input. Token presence scales with the number of occurrences. Negative values will encourage more diverse token usage.
The assistant will stop when any of the provided strings are generated.
Variants
Generation will stop when this string is generated.
Generation will stop when any of these strings are generated.
Items
This setting influences the variety in the model’s responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
This setting limits the model’s choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model’s responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.
This sets the upper limit for the number of tokens the model can generate in response. It won’t produce more than this limit. The maximum value is the context length minus the prompt length.
Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. (The value changes depending on the confidence level of the most probable token.) If your Min-P is set to 0.1, that means it will only allow for tokens that are at least 1/10th as probable as the best possible option.
Options for selecting the upstream provider of this model.
Properties
Whether to allow fallback providers if the preferred provider is unavailable.
Whether to require that the provider supports all specified parameters.
An ordered list of provider names to use when selecting a provider for this model.
Items
A list of provider names to restrict selection to when selecting a provider for this model.
Items
A list of provider names to ignore when selecting a provider for this model.
Items
Specifies the quantizations to allow when selecting providers for this model.
Items
Options for controlling reasoning behavior of the model.
Properties
Enables or disables reasoning for supported models.
The maximum number of tokens to use for reasoning in a response.
Constrains effort on reasoning for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Variants
Controls the verbosity of the reasoning summary for supported reasoning models.
Variants
Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). Token penalty scales based on original token’s probability.
Consider only the top tokens with “sufficiently high” probabilities based on the probability of the most likely token. Think of it like a dynamic Top-P. A lower Top-A value focuses the choices based on the highest probability token but with a narrower scope. A higher Top-A value does not necessarily affect the creativity of the output, but rather refines the filtering process based on the maximum probability.
This limits the model’s choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, making the model to consider all choices.
Controls the verbosity and length of the model response. Lower values produce more concise responses, while higher values produce more detailed and comprehensive responses.
Variants
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability.
The desired format of the model's response.
Variants
The response will be arbitrary text.
Properties
The response will be a JSON object.
Properties
The response will conform to the provided JSON schema.
Properties
A JSON schema definition for constraining model output.
Properties
The name of the JSON schema.
The description of the JSON schema.
The JSON schema definition.
Whether to enforce strict adherence to the JSON schema.
The response will conform to the provided grammar definition.
Properties
The grammar definition to constrain the response.
The response will be Python code.
Properties
If specified, upstream systems will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Specifies tool call behavior for the assistant.
Variants
Specify a function for the assistant to call.
Properties
Properties
The name of the function the assistant will call.
A list of tools that the assistant can call.
Items
A function tool that the assistant can call.
Properties
The definition of a function tool.
Properties
The name of the function.
The description of the function.
The JSON schema defining the parameters of the function.
Values
Items
A JSON value.
Values
A JSON value.
Whether to enforce strict adherence to the parameter schema.
Whether to allow the model to make multiple tool calls in parallel.
Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time. This is most common when you are regenerating a file with only minor changes to most of the content.
Properties
Variants
Items
A part of the predicted content.
Properties
The maximum total time in milliseconds to spend on retries when a transient error occurs.
The maximum time in milliseconds to wait for the first chunk of a streaming response.
The maximum time in milliseconds to wait between subsequent chunks of a streaming response.
Whether to stream the response as a series of chunks.
Response Body
The unique identifier of the chat completion.
The unique identifier of the upstream chat completion.
The list of choices in this chat completion.
Items
A choice in a unary chat completion response.
Properties
A message generated by the assistant.
Properties
The content of the message.
The refusal message, if any.
The role of the message author.
Variants
The tool calls made by the assistant, if any.
Items
A function tool call made by the assistant.
Properties
The unique identifier of the function tool.
Properties
The name of the function.
The arguments passed to the function.
The reasoning provided by the assistant, if any.
The images generated by the assistant, if any.
Items
Properties
Properties
The Base64 URL of the generated image.
The reason why the assistant ceased to generate further tokens.
Variants
The index of the choice in the list of choices.
The log probabilities of the tokens generated by the model.
Properties
The log probabilities of the tokens in the content.
Items
The token which was selected by the sampler for this position as well as the logprobabilities of the top options.
Properties
The token string which was selected by the sampler.
The byte representation of the token which was selected by the sampler.
Items
The log probability of the token which was selected by the sampler.
The log probabilities of the top tokens for this position.
Items
The log probability of a token in the list of top tokens.
Properties
The token string.
The byte representation of the token.
Items
The log probability of the token.
The log probabilities of the tokens in the refusal.
Items
The token which was selected by the sampler for this position as well as the logprobabilities of the top options.
Properties
The token string which was selected by the sampler.
The byte representation of the token which was selected by the sampler.
Items
The log probability of the token which was selected by the sampler.
The log probabilities of the top tokens for this position.
Items
The log probability of a token in the list of top tokens.
Properties
The token string.
The byte representation of the token.
Items
The log probability of the token.
The Unix timestamp (in seconds) when the chat completion was created.
The unique identifier of the Ensemble LLM used for this chat completion.
The upstream model used for this chat completion.
Token and cost usage statistics for the completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the prompt.
The total number of tokens used in the prompt or generated in the completion.
Detailed breakdown of generated completion tokens.
Properties
The number of accepted prediction tokens in the completion.
The number of generated audio tokens in the completion.
The number of generated reasoning tokens in the completion.
The number of rejected prediction tokens in the completion.
Detailed breakdown of prompt tokens.
Properties
The number of audio tokens in the prompt.
The number of cached tokens in the prompt.
The number of prompt tokens written to cache.
The number of video tokens in the prompt.
The cost in credits incurred for this completion.
Detailed breakdown of upstream costs incurred.
Properties
The cost incurred upstream.
The cost incurred by upstream's upstream.
The total cost in credits incurred including upstream costs.
The cost multiplier applied to upstream costs for computing ObjectiveAI costs.
Whether the completion used a BYOK (Bring Your Own Key) API Key.
The provider used for this chat completion.
Response Body (Streaming)
The unique identifier of the chat completion.
The unique identifier of the upstream chat completion.
The list of choices in this chunk.
Items
A choice in a streaming chat completion response.
Properties
A delta in a streaming chat completion response.
Properties
The content added in this delta.
The refusal message added in this delta.
The role of the message author.
Variants
Tool calls made in this delta.
Items
A function tool call made by the assistant.
Properties
The index of the tool call in the sequence of tool calls.
The unique identifier of the function tool.
Properties
The name of the function.
The arguments passed to the function.
The reasoning added in this delta.
Images added in this delta.
Items
Properties
Properties
The Base64 URL of the generated image.
The reason why the assistant ceased to generate further tokens.
Variants
The index of the choice in the list of choices.
The log probabilities of the tokens generated by the model.
Properties
The log probabilities of the tokens in the content.
Items
The token which was selected by the sampler for this position as well as the logprobabilities of the top options.
Properties
The token string which was selected by the sampler.
The byte representation of the token which was selected by the sampler.
Items
The log probability of the token which was selected by the sampler.
The log probabilities of the top tokens for this position.
Items
The log probability of a token in the list of top tokens.
Properties
The token string.
The byte representation of the token.
Items
The log probability of the token.
The log probabilities of the tokens in the refusal.
Items
The token which was selected by the sampler for this position as well as the logprobabilities of the top options.
Properties
The token string which was selected by the sampler.
The byte representation of the token which was selected by the sampler.
Items
The log probability of the token which was selected by the sampler.
The log probabilities of the top tokens for this position.
Items
The log probability of a token in the list of top tokens.
Properties
The token string.
The byte representation of the token.
Items
The log probability of the token.
The Unix timestamp (in seconds) when the chat completion was created.
The unique identifier of the Ensemble LLM used for this chat completion.
The upstream model used for this chat completion.
Token and cost usage statistics for the completion.
Properties
The number of tokens generated in the completion.
The number of tokens in the prompt.
The total number of tokens used in the prompt or generated in the completion.
Detailed breakdown of generated completion tokens.
Properties
The number of accepted prediction tokens in the completion.
The number of generated audio tokens in the completion.
The number of generated reasoning tokens in the completion.
The number of rejected prediction tokens in the completion.
Detailed breakdown of prompt tokens.
Properties
The number of audio tokens in the prompt.
The number of cached tokens in the prompt.
The number of prompt tokens written to cache.
The number of video tokens in the prompt.
The cost in credits incurred for this completion.
Detailed breakdown of upstream costs incurred.
Properties
The cost incurred upstream.
The cost incurred by upstream's upstream.
The total cost in credits incurred including upstream costs.
The cost multiplier applied to upstream costs for computing ObjectiveAI costs.
Whether the completion used a BYOK (Bring Your Own Key) API Key.
The provider used for this chat completion.