Sends a prompt to an OpenAI-compatible completion model and returns a completion. Supports all open source models for the text-generation, chat, audio-text-to-text, image-text-to-text, and video-text-to-text tasks, as well as the closed source providers openai, anthropic, mistral, cohere, and google. To send a request to a closed source provider, prefix the model ID with the provider name, e.g. openai/gpt-4.
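Open source model IDs such as Qwen/Qwen3-1.7B also contain a slash, so a client that wants to tell the two cases apart cannot simply split on it. The sketch below is one way to do this on the client side, assuming the provider prefix is matched exactly and in lowercase; the helper name `split_provider` is hypothetical and not part of the API.

```python
# Closed source providers listed in the endpoint description.
CLOSED_SOURCE_PROVIDERS = {"openai", "anthropic", "mistral", "cohere", "google"}


def split_provider(model_id):
    """Return (provider, model_name) for a provider-prefixed ID,
    or (None, model_id) when the prefix is not a known closed source provider."""
    prefix, sep, rest = model_id.partition("/")
    if sep and prefix in CLOSED_SOURCE_PROVIDERS:
        return prefix, rest
    # Anything else is treated as an open source model ID and left untouched.
    return None, model_id
```

With this helper, `split_provider("openai/gpt-4")` yields the provider `openai` and the model name `gpt-4`, while `split_provider("Qwen/Qwen3-1.7B")` yields no provider and the unchanged ID.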
Token for authentication
Provider key for running requests against closed source models. Required for the anthropic and cohere providers; optional for the other providers, where it removes rate limits.
The ID of the model to run (e.g., Qwen/Qwen3-1.7B, openai/gpt-4)
Conversation messages (OpenAI chat format)
Maximum number of tokens to generate
Sampling temperature
Whether to stream responses
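The parameters above can be assembled into a request roughly as follows. This is a minimal sketch, not the service's client: the field names `model`, `messages`, `max_tokens`, `temperature`, and `stream` are assumed from the OpenAI chat format, the bearer `Authorization` scheme is an assumption, and the `X-Provider-Key` header name is a hypothetical placeholder, since the description does not say how the provider key is transmitted.

```python
# Providers for which the description says a provider key is required.
KEY_REQUIRED_PROVIDERS = {"anthropic", "cohere"}


def build_request(token, model, messages, max_tokens=None, temperature=None,
                  stream=False, provider_key=None):
    """Build (headers, payload) for a chat completion request."""
    # The text before the first "/" is the provider prefix, if any.
    provider = model.split("/", 1)[0]
    if provider in KEY_REQUIRED_PROVIDERS and not provider_key:
        raise ValueError(f"a provider key is required for {provider} models")

    headers = {
        "Authorization": f"Bearer {token}",  # assumed bearer scheme
        "Content-Type": "application/json",
    }
    if provider_key:
        headers["X-Provider-Key"] = provider_key  # hypothetical header name

    payload = {"model": model, "messages": messages, "stream": stream}
    # Optional sampling parameters are sent only when explicitly set.
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if temperature is not None:
        payload["temperature"] = temperature
    return headers, payload
```

The returned pair can be passed to any HTTP client, for example `requests.post(url, headers=headers, json=payload)`, where `url` is the endpoint address taken from the service's own documentation.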