Multimodal
Chat + Video
Analyze video using the LLaVA-NeXT-Video-7B-hf model.
POST /models/v2/llava-hf/LLaVA-NeXT-Video-7B-hf
curl --request POST \
  --url https://api.bytez.com/models/v2/llava-hf/LLaVA-NeXT-Video-7B-hf \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "system",
        "content": [
          {
            "type": "text",
            "text": "You are a helpful assistant."
          }
        ]
      },
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Why is this video funny?"
          },
          {
            "type": "video",
            "url": "https://example.com/path-to-video.mp4"
          }
        ]
      }
    ]
  }'

{
  "output": [
    "<string>"
  ]
}
Body: application/json
Response: 200 - application/json
Successful response
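The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client. The helper names (`build_payload`, `build_request`, `extract_output`, `analyze_video`) are hypothetical. The `extra_headers` parameter is an assumption for passing whatever credentials the service may require, since the curl example shows no authentication header. The 120-second timeout is an arbitrary choice.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.bytez.com/models/v2/llava-hf/LLaVA-NeXT-Video-7B-hf"


def build_payload(question, video_url, system_prompt="You are a helpful assistant."):
    """Return the request body in the same shape as the curl example."""
    return {
        "messages": [
            {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "video", "url": video_url},
                ],
            },
        ]
    }


def build_request(payload, extra_headers=None):
    """Wrap the payload in a POST request object without sending it."""
    headers = {"Content-Type": "application/json"}
    # Assumption: any credentials are supplied by the caller;
    # the curl example shows no authentication header.
    headers.update(extra_headers or {})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def extract_output(response_text):
    """Return the list stored under "output" in a 200 response body."""
    body = json.loads(response_text)
    output = body.get("output")
    if not isinstance(output, list):
        raise ValueError("response has no 'output' list")
    return output


def analyze_video(question, video_url, extra_headers=None, timeout=120):
    """Send the request and return the parsed output (performs network I/O)."""
    request = build_request(build_payload(question, video_url), extra_headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_output(response.read().decode("utf-8"))
```

Calling `analyze_video("Why is this video funny?", "https://example.com/path-to-video.mp4")` would send the same body as the curl example and return the list under `output`. Keeping payload construction and response parsing in separate functions lets both be checked without network access.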