POST /models/v2/openai/v1/chat/completions
Chat Completions
curl --request POST \
  --url https://api.bytez.com/models/v2/openai/v1/chat/completions \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "messages": [
    {
      "role": "system",
      "content": "<string>"
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "stream": false,
  "top_p": 0.9,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "logprobs": true,
  "top_logprobs": 5
}
'
{
  "id": "<string>",
  "object": "chat.completion",
  "created": 1735689600,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "<string>"
      },
      "finish_reason": "stop"
    }
  ]
}
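The curl request above can be reproduced in Python with only the standard library. The model ID and message content below are illustrative, and `API_KEY` is a placeholder for your real Bytez token; a minimal sketch:

```python
import json
import urllib.request

API_KEY = "YOUR_BYTEZ_KEY"  # placeholder; supply your real token

# Request body mirroring the curl example; only model and messages are required.
payload = {
    "model": "Qwen/Qwen3-1.7B",  # example model ID from the docs
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    "max_tokens": 256,   # default per the docs
    "temperature": 0.7,  # default per the docs
    "stream": False,
}

req = urllib.request.Request(
    "https://api.bytez.com/models/v2/openai/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Authorization": API_KEY, "Content-Type": "application/json"},
    method="POST",
)

# Sending the request requires a valid key and network access:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])

print(req.get_method(), req.full_url)
```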

Headers

Authorization
string
required

Token for authentication

Body

application/json
model
string
required

The ID of the model to run (e.g., Qwen/Qwen3-1.7B, openai/gpt-4)

messages
object[]
required

Conversation messages (OpenAI chat format)

max_tokens
integer
default:256

Maximum number of tokens to generate

temperature
number
default:0.7

Sampling temperature

stream
boolean
default:false

Whether to stream responses
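When `stream` is `true`, OpenAI-compatible endpoints typically deliver incremental chunks as server-sent events (`data:` lines, terminated by `data: [DONE]`); whether Bytez uses this exact framing is an assumption here. A minimal parser over a canned stream, for illustration:

```python
import json

# Canned server-sent-events text standing in for a streamed HTTP body;
# real chunks arrive incrementally over the connection.
raw_stream = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo!"}}]}\n\n'
    "data: [DONE]\n\n"
)

def collect_stream(text: str) -> str:
    """Concatenate the content deltas from an SSE chat-completion stream."""
    parts = []
    for line in text.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator / keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel marking the end of the stream
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content", ""))
    return "".join(parts)

print(collect_stream(raw_stream))  # → Hello!
```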

top_p
number

Nucleus sampling parameter

presence_penalty
number

Penalize new tokens based on whether they appear in the text so far

frequency_penalty
number

Penalize new tokens based on their existing frequency in the text so far

logprobs
boolean

Whether to return log probabilities of output tokens (if supported)

top_logprobs
integer

Number of most likely tokens to return at each position (if logprobs is true)

Response

Successful model completion

id
string

Unique ID for this completion

object
string

Type of returned object (usually chat.completion)

created
integer

Unix timestamp of completion

choices
object[]

Generated completions
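Given the schema above, extracting the generated text is a matter of indexing into `choices`. The sample response below is fabricated to match the documented fields:

```python
# Sample response shaped like the documented schema (all values illustrative).
response = {
    "id": "cmpl-abc123",
    "object": "chat.completion",
    "created": 1735689600,
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
}

# The first choice holds the generated message.
choice = response["choices"][0]
text = choice["message"]["content"]
print(choice["finish_reason"], "->", text)  # → stop -> Hello! How can I help?
```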