Request Path
The ID of the Ensemble.
Response Body
The unique identifier for the Ensemble.
The list of LLMs that make up the ensemble.
Items
An LLM to be used within an Ensemble, including its unique identifier, optional fallbacks, and count.
Properties
The full ID of the LLM to use.
For Vector Completions only, specifies the LLM's voting output mode. For "instruction", the assistant is instructed to output a key. For "json_schema", the assistant is constrained to output a valid key using a JSON schema. For "tool_call", the assistant is instructed to output a tool call to select the key.
Variants
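As a rough illustration of the "json_schema" mode described above, the sketch below builds a JSON schema that only admits one of the allowed keys. The property name `key` and the helper `key_schema` are hypothetical, chosen for this example; the schema the service actually sends is not documented here.

```python
import json

def key_schema(keys):
    """Build a JSON schema whose only valid values are the given vote keys.

    The property name "key" is an assumption made for illustration.
    """
    return {
        "type": "object",
        "properties": {"key": {"type": "string", "enum": list(keys)}},
        "required": ["key"],
        "additionalProperties": False,
    }

schema = key_schema(["A", "B", "C"])
schema_text = json.dumps(schema, indent=2)
```

With a schema like this, a model constrained by structured output cannot return a key outside the enumerated set, which is the difference from the "instruction" mode, where the key is only requested in the prompt.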
For Vector Completions only, whether to use synthetic reasoning prior to voting. Works for any LLM, even those that do not have native reasoning capabilities.
For Vector Completions only, whether to use logprobs to make the vote probabilistic. This means that the LLM can vote for multiple keys, weighted by their log probabilities. Allows LLMs to express native uncertainty when voting.
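One way to picture a probabilistic vote is to turn the log probabilities of the candidate keys into a normalized distribution. The sketch below assumes the logprobs for each key are already available as a dictionary; the function name and the normalization step are illustrative assumptions, not the service's documented algorithm.

```python
import math

def votes_from_logprobs(key_logprobs):
    """Convert per-key log probabilities into vote weights that sum to 1."""
    # Subtract the largest logprob before exponentiating for numerical stability.
    peak = max(key_logprobs.values())
    weights = {key: math.exp(lp - peak) for key, lp in key_logprobs.items()}
    total = sum(weights.values())
    return {key: weight / total for key, weight in weights.items()}

votes = votes_from_logprobs({
    "A": math.log(0.6),
    "B": math.log(0.3),
    "C": math.log(0.1),
})
```

Instead of a single vote for "A", the LLM contributes a fractional vote to every key, so its uncertainty carries through to the aggregate.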
A list of messages exchanged in a chat conversation. These will be prepended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
A list of messages exchanged in a chat conversation. These will be appended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
This setting controls the repetition of tokens based on how often they appear in the input. Tokens that appear more often in the input are penalized more, in proportion to how frequently they occur, so the model uses them less. The token penalty scales with the number of occurrences. Negative values will encourage token reuse.
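A minimal sketch of a count-scaled penalty, assuming the common formulation in which the penalty times the occurrence count is subtracted from each token's logit. The function name and the string tokens are illustrative; individual providers may implement the penalty differently.

```python
from collections import Counter

def apply_frequency_penalty(logits, seen_tokens, penalty):
    """Lower each token's logit by penalty times its number of occurrences."""
    counts = Counter(seen_tokens)
    # Counter returns 0 for tokens that have not been seen, so they are unchanged.
    return {token: value - penalty * counts[token] for token, value in logits.items()}

logits = {"the": 1.0, "cat": 1.0}
penalized = apply_frequency_penalty(logits, ["the", "the", "the"], 0.5)
encouraged = apply_frequency_penalty(logits, ["the", "the", "the"], -0.5)
```

A positive penalty pushes the repeated token "the" below the unseen token "cat", while a negative penalty raises it, which is the token reuse the description mentions.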
Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
Values
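The effect of adding a bias to the logits before sampling can be sketched as follows. The token IDs and logit values are made up for illustration, and `softmax` is a local helper rather than part of any API.

```python
import math

def softmax(logits):
    """Turn a token-to-logit mapping into probabilities."""
    peak = max(logits.values())
    exps = {token: math.exp(value - peak) for token, value in logits.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

def apply_logit_bias(logits, bias):
    """Add the per-token bias to the logits; tokens without a bias are unchanged."""
    return {token: value + bias.get(token, 0.0) for token, value in logits.items()}

# Hypothetical token IDs and logits.
logits = {101: 2.0, 102: 1.0, 103: 0.5}
baseline = softmax(logits)
banned = softmax(apply_logit_bias(logits, {101: -100.0}))
forced = softmax(apply_logit_bias(logits, {103: 100.0}))
```

A bias of -100 drives the probability of token 101 to effectively zero, and a bias of 100 gives token 103 effectively all of the probability mass.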
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
This setting controls how readily the model reuses tokens that are already present in the input. Positive values make the model less likely to repeat any token that has already appeared, which encourages more diverse token usage. Unlike the frequency penalty, the token penalty does not scale with the number of occurrences. Negative values will encourage token reuse.
The assistant will stop when any of the provided strings are generated.
Variants
Generation will stop when this string is generated.
Generation will stop when any of these strings are generated.
Items
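The two stop variants (a single string or a list of strings) can be handled uniformly, as in this sketch. It assumes the stop string itself is excluded from the returned text, which is a common convention but is not stated above; the function name is illustrative.

```python
from typing import List, Optional, Union

def truncate_at_stop(text: str, stop: Optional[Union[str, List[str]]]) -> str:
    """Cut text at the earliest occurrence of any stop string."""
    if stop is None:
        return text
    stops = [stop] if isinstance(stop, str) else stop
    cut = len(text)
    for s in stops:
        if not s:
            continue  # ignore empty stop strings
        index = text.find(s)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]

single = truncate_at_stop("Answer: 42\nNext question", "\n")
multiple = truncate_at_stop("step one; step two. done", [".", ";"])
```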
This setting influences the variety in the model’s responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
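The behavior described above is usually obtained by dividing the logits by the temperature before applying softmax. The sketch below assumes that formulation and treats a temperature of 0 as a greedy pick of the highest logit; the logit values are made up.

```python
import math

def apply_temperature(logits, temperature):
    """Return token probabilities after temperature scaling."""
    if temperature == 0:
        # Greedy decoding: all probability mass goes to the highest logit.
        best = logits.index(max(logits))
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
sharp = apply_temperature(logits, 0.5)
neutral = apply_temperature(logits, 1.0)
flat = apply_temperature(logits, 2.0)
greedy = apply_temperature(logits, 0)
```

Lower temperatures concentrate probability on the top token, and higher temperatures spread it across the alternatives.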
This setting limits the model’s choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model’s responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.
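A minimal sketch of this nucleus-style filtering, assuming the token probabilities are already known: tokens are taken in descending order of probability until their cumulative probability reaches P, and the survivors are renormalized. The names and example probabilities are illustrative.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose probabilities add up to at least p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
narrow = top_p_filter(probs, 0.7)
full = top_p_filter(probs, 1.0)
```

The number of surviving tokens depends on the shape of the distribution, which is why the description compares it to a dynamic Top-K.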
This sets the upper limit for the number of tokens the model can generate in response. It won’t produce more than this limit. The maximum value is the context length minus the prompt length.
Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. (The value changes depending on the confidence level of the most probable token.) If your Min-P is set to 0.1, that means it will only allow for tokens that are at least 1/10th as probable as the best possible option.
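The 1/10th rule above can be written as a threshold relative to the most likely token. This is a sketch with made-up probabilities; the renormalization step is an assumption about how the surviving tokens are then sampled.

```python
def min_p_filter(probs, min_p):
    """Drop tokens whose probability is below min_p times the top probability."""
    threshold = min_p * max(probs.values())
    kept = {token: prob for token, prob in probs.items() if prob >= threshold}
    total = sum(kept.values())
    return {token: prob / total for token, prob in kept.items()}

# With a top probability of 0.6 and Min-P of 0.1, the threshold is about 0.06.
confident = min_p_filter({"a": 0.6, "b": 0.3, "c": 0.05, "d": 0.05}, 0.1)
# With a top probability of 0.3, the threshold falls to about 0.03 and more tokens survive.
uncertain = min_p_filter({"a": 0.3, "b": 0.3, "c": 0.2, "d": 0.2}, 0.1)
```

Because the threshold follows the top token's probability, the filter is strict when the model is confident and permissive when it is not.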
Options for selecting the upstream provider of this model.
Properties
Whether to allow fallback providers if the preferred provider is unavailable.
Whether to require that the provider supports all specified parameters.
An ordered list of provider names to use when selecting a provider for this model.
Items
A list of provider names to restrict selection to when selecting a provider for this model.
Items
A list of provider names to ignore when selecting a provider for this model.
Items
Specifies the quantizations to allow when selecting providers for this model.
Items
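How the ordered list, the restriction list, and the ignore list might combine can be sketched as below. The parameter names `order`, `only`, and `ignore`, the provider names, and the precedence between the lists are all assumptions made for illustration; the actual selection logic is not described here.

```python
def candidate_providers(available, order=None, only=None, ignore=None):
    """Filter and rank providers under the assumed semantics of the three lists."""
    pool = [
        name for name in available
        if (only is None or name in only) and (ignore is None or name not in ignore)
    ]
    if order:
        rank = {name: position for position, name in enumerate(order)}
        # Providers named in `order` come first; the sort is stable, so the
        # rest keep their original relative order.
        pool.sort(key=lambda name: rank.get(name, len(rank)))
    return pool

available = ["alpha", "beta", "gamma", "delta"]  # hypothetical provider names
ranked = candidate_providers(available, order=["gamma", "alpha"], ignore=["delta"])
restricted = candidate_providers(available, only=["beta", "delta"], ignore=["delta"])
```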
Options for controlling reasoning behavior of the model.
Properties
Enables or disables reasoning for supported models.
The maximum number of tokens to use for reasoning in a response.
Constrains effort on reasoning for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Variants
Controls the verbosity of the reasoning summary for supported reasoning models.
Variants
Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). The token penalty scales based on the original token’s probability.
Consider only the top tokens with “sufficiently high” probabilities based on the probability of the most likely token. Think of it like a dynamic Top-P. A lower Top-A value focuses the choices based on the highest probability token but with a narrower scope. A higher Top-A value does not necessarily affect the creativity of the output, but rather refines the filtering process based on the maximum probability.
This limits the model’s choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, so the model considers all choices.
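A minimal sketch of Top-K filtering over known token probabilities. Treating a value of 0 or less as "disabled" is an assumption made for this example, as are the names and probabilities.

```python
def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalize."""
    if k <= 0:
        # Assumed convention: a non-positive k disables the filter.
        return dict(probs)
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(prob for _, prob in ranked)
    return {token: prob / total for token, prob in ranked}

probs = {"a": 0.6, "b": 0.3, "c": 0.1}
greedy = top_k_filter(probs, 1)
top_two = top_k_filter(probs, 2)
disabled = top_k_filter(probs, 0)
```

With K set to 1 only the most likely token remains, which is why that setting yields predictable results.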
Controls the verbosity and length of the model response. Lower values produce more concise responses, while higher values produce more detailed and comprehensive responses.
Variants
The unique identifier for the Ensemble LLM.
A count greater than one effectively means that there are multiple instances of this LLM in an ensemble.
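To make the effect of the count concrete, the sketch below expands a list of LLM entries into one slot per instance. The field names `id` and `count` and the example IDs are hypothetical, since field names are not shown above.

```python
def expand_by_count(llms):
    """Repeat each LLM's identifier once per counted instance."""
    instances = []
    for llm in llms:
        # Assumed default: an entry without a count contributes one instance.
        instances.extend([llm["id"]] * llm.get("count", 1))
    return instances

instances = expand_by_count([
    {"id": "llm-a", "count": 3},
    {"id": "llm-b"},
])
```

In this example the ensemble behaves as if it had four members, three of which share the same configuration.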
A list of fallback LLMs to use if the primary LLM fails.
Items
An LLM to be used within an Ensemble or standalone with Chat Completions, including its unique identifier.
Properties
The full ID of the LLM to use.
For Vector Completions only, specifies the LLM's voting output mode. For "instruction", the assistant is instructed to output a key. For "json_schema", the assistant is constrained to output a valid key using a JSON schema. For "tool_call", the assistant is instructed to output a tool call to select the key.
Variants
For Vector Completions only, whether to use synthetic reasoning prior to voting. Works for any LLM, even those that do not have native reasoning capabilities.
For Vector Completions only, whether to use logprobs to make the vote probabilistic. This means that the LLM can vote for multiple keys, weighted by their log probabilities. Allows LLMs to express native uncertainty when voting.
A list of messages exchanged in a chat conversation. These will be prepended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
A list of messages exchanged in a chat conversation. These will be appended to every prompt sent to this LLM. Useful for setting context or influencing behavior.
Items
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
Properties
Simple content.
Variants
Plain text content.
An array of simple content parts.
Items
A simple content part.
Properties
The text content.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by an end user, containing prompts or additional context information.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
Messages sent by tools in response to tool calls made by the assistant.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
The ID of the tool call that this message is responding to.
Messages sent by the model in response to user messages.
Properties
Rich content.
Variants
Plain text content.
An array of rich content parts.
Items
A text rich content part.
Properties
The text content.
An image rich content part.
Properties
The URL of the image and its optional detail level.
Properties
Either a URL of the image or the base64 encoded image data.
Specifies the detail level of the image.
Variants
An audio rich content part.
Properties
The audio data and its format.
Properties
Base64 encoded audio data.
The format of the encoded audio data.
Variants
A video rich content part.
Properties
Variants
Properties
URL of the video.
A file rich content part.
Properties
The file to be used as input, either as base64 data, an uploaded file ID, or a URL.
Properties
The base64 encoded file data, used when passing the file to the model as a string.
The ID of an uploaded file to use as input.
The name of the file, used when passing the file to the model as a string.
The URL of the file, used when passing the file to the model as a URL.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
The refusal message by the assistant.
Tool calls made by the assistant.
Items
A function tool call made by the assistant.
Properties
The unique identifier for the tool call.
The name and arguments of the function called.
Properties
The name of the function called.
The arguments passed to the function.
The reasoning provided by the assistant.
This setting controls the repetition of tokens based on how often they appear in the input. Tokens that appear more often in the input are penalized more, in proportion to how frequently they occur, so the model uses them less. The token penalty scales with the number of occurrences. Negative values will encourage token reuse.
Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
Values
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
This setting controls how readily the model reuses tokens that are already present in the input. Positive values make the model less likely to repeat any token that has already appeared, which encourages more diverse token usage. Unlike the frequency penalty, the token penalty does not scale with the number of occurrences. Negative values will encourage token reuse.
The assistant will stop when any of the provided strings are generated.
Variants
Generation will stop when this string is generated.
Generation will stop when any of these strings are generated.
Items
This setting influences the variety in the model’s responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.
This setting limits the model’s choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model’s responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.
This sets the upper limit for the number of tokens the model can generate in response. It won’t produce more than this limit. The maximum value is the context length minus the prompt length.
Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. (The value changes depending on the confidence level of the most probable token.) If your Min-P is set to 0.1, that means it will only allow for tokens that are at least 1/10th as probable as the best possible option.
Options for selecting the upstream provider of this model.
Properties
Whether to allow fallback providers if the preferred provider is unavailable.
Whether to require that the provider supports all specified parameters.
An ordered list of provider names to use when selecting a provider for this model.
Items
A list of provider names to restrict selection to when selecting a provider for this model.
Items
A list of provider names to ignore when selecting a provider for this model.
Items
Specifies the quantizations to allow when selecting providers for this model.
Items
Options for controlling reasoning behavior of the model.
Properties
Enables or disables reasoning for supported models.
The maximum number of tokens to use for reasoning in a response.
Constrains effort on reasoning for supported reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Variants
Controls the verbosity of the reasoning summary for supported reasoning models.
Variants
Helps to reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often with run-on sentences that lack small words). The token penalty scales based on the original token’s probability.
Consider only the top tokens with “sufficiently high” probabilities based on the probability of the most likely token. Think of it like a dynamic Top-P. A lower Top-A value focuses the choices based on the highest probability token but with a narrower scope. A higher Top-A value does not necessarily affect the creativity of the output, but rather refines the filtering process based on the maximum probability.
This limits the model’s choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, so the model considers all choices.
Controls the verbosity and length of the model response. Lower values produce more concise responses, while higher values produce more detailed and comprehensive responses.
Variants
The unique identifier for the Ensemble LLM.
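The fallback list described above suggests a try-in-order pattern: call the primary LLM, and on failure move to each fallback in turn. The sketch below is a client-side illustration under that assumption; `call_with_fallbacks`, the `call` callback, and the LLM IDs are hypothetical, and the service's real retry behavior is not documented here.

```python
from typing import Callable, Sequence

def call_with_fallbacks(primary: str, fallbacks: Sequence[str],
                        call: Callable[[str], str]) -> str:
    """Try the primary LLM first, then each fallback in order."""
    last_error = None
    for llm_id in [primary, *fallbacks]:
        try:
            return call(llm_id)
        except Exception as exc:
            last_error = exc  # remember the failure and try the next LLM
    raise RuntimeError("primary and all fallback LLMs failed") from last_error

def flaky_call(llm_id: str) -> str:
    """Stand-in for a real request: the primary always fails."""
    if llm_id == "primary-llm":
        raise ValueError("primary unavailable")
    return "response from " + llm_id

result = call_with_fallbacks("primary-llm", ["backup-1", "backup-2"], flaky_call)
```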
The Unix timestamp (in seconds) when the Ensemble was created.