Vision
Use chat models with images as input to generate text-based responses.
Chat + Vision (Image as Input)
The Bytez API enables multimodal capabilities, allowing chat models to process both text and images. This allows models to describe, compare, and analyze images alongside user queries.
Explore vision-based chat models by providing image inputs along with text prompts. Below are examples using both the REST API and JavaScript SDK.
Text + Image
This example sends an image with a text prompt to generate a description.
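A minimal sketch of such a request from JavaScript might look like the following. The content-part shape (`type: "text"` and `type: "image"` with a `url` field), the helper names `buildImageMessages` and `describeImage`, and the example URL are assumptions for illustration and are not confirmed by this page. `model` stands for whatever object the SDK returns for a vision-capable chat model.

```javascript
// Builds a single-turn chat input pairing one text prompt with one image.
// ASSUMPTION: the { type, text } / { type, url } content parts are an
// illustrative shape; check the API reference for the exact field names.
function buildImageMessages(prompt, imageUrl) {
  return [
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        { type: "image", url: imageUrl },
      ],
    },
  ];
}

// Sends the prompt and image to a model object that exposes run().
// The result is returned untouched because its shape is not specified here.
async function describeImage(model, prompt, imageUrl) {
  try {
    return await model.run(buildImageMessages(prompt, imageUrl));
  } catch (err) {
    console.error("Vision request failed:", err);
    throw err;
  }
}
```

A call could then look like `await describeImage(model, "Describe this image.", "https://example.com/cat.jpg")`, where the URL is a placeholder.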
Text + Multiple Images
Compare and analyze multiple images using a single query.
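One way to express a multi-image query is to place several image parts after the text prompt in a single user message. The sketch below assumes the same illustrative content-part shape (`type: "image"` with a `url` field); the helper name `buildMultiImageMessages` and the URLs in the usage note are hypothetical.

```javascript
// Builds a single user message containing one prompt and several images.
// ASSUMPTION: content parts use { type: "text", text } and
// { type: "image", url }; the real field names may differ.
function buildMultiImageMessages(prompt, imageUrls) {
  if (!Array.isArray(imageUrls) || imageUrls.length === 0) {
    throw new Error("imageUrls must be a non-empty array");
  }
  // Keep the caller's image order so the prompt can refer to
  // "the first image", "the second image", and so on.
  const imageParts = imageUrls.map((url) => ({ type: "image", url }));
  return [
    {
      role: "user",
      content: [{ type: "text", text: prompt }, ...imageParts],
    },
  ];
}
```

For example, `buildMultiImageMessages("What differs between these two images?", ["https://example.com/a.jpg", "https://example.com/b.jpg"])` yields one message with three content parts, which would then be passed to the model in the same way as a single-image input.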
Streaming
Streaming allows you to receive model outputs incrementally as soon as they are available, which is ideal for tasks like real-time responses or large outputs.
How Streaming Works
To enable streaming, pass true as the third argument to the model.run() function. The model will return a stream that you can read incrementally.
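The call described above can be sketched as a small wrapper. Only the third argument (`true` to stream) comes from the text; treating the second argument as an optional parameters object is an assumption, and `runWithStreaming` is a hypothetical helper name.

```javascript
// Requests a streamed response from a model object that exposes run().
// ASSUMPTION: the second argument is an optional parameters object;
// only the third argument (true = stream) is described in the text.
async function runWithStreaming(model, input, params = {}) {
  const stream = await model.run(input, params, true);
  return stream; // read incrementally by the caller
}
```

Keeping the streaming flag inside one helper means the rest of the code only has to deal with the returned stream.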
Node.js Example
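A minimal Node.js sketch, assuming the streamed result is a Web ReadableStream of UTF-8 bytes: it converts the stream with `Readable.fromWeb()` and consumes it through `data` events. The function name `readStreamInNode` and the `onChunk` callback are hypothetical, introduced only for illustration.

```javascript
// Reads a Web ReadableStream to completion in Node.js and returns the text.
// ASSUMPTION: the stream yields UTF-8 bytes (Uint8Array chunks).
// onChunk is a hypothetical callback invoked with each decoded chunk.
async function readStreamInNode(
  webStream,
  onChunk = (text) => process.stdout.write(text)
) {
  // Dynamic import works from both CommonJS and ES module files.
  const { Readable } = await import("node:stream");
  try {
    const nodeStream = Readable.fromWeb(webStream);
    nodeStream.setEncoding("utf8"); // emit strings instead of Buffers
    let fullText = "";
    await new Promise((resolve, reject) => {
      nodeStream.on("data", (chunk) => {
        fullText += chunk;
        onChunk(chunk);
      });
      nodeStream.on("end", resolve);
      nodeStream.on("error", reject);
    });
    return fullText;
  } catch (err) {
    console.error("Streaming failed:", err);
    throw err;
  }
}
```

With a streaming call in place, usage could look like `await readStreamInNode(await model.run(input, params, true))`, where `params` is the assumed second argument.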
Browser Example
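A minimal browser-side sketch, again assuming the streamed result is a Web ReadableStream of UTF-8 bytes: it reads chunks with `getReader()` and decodes them with `TextDecoder`. The function name `readStreamInBrowser` and the `onChunk` callback are hypothetical.

```javascript
// Reads a Web ReadableStream chunk by chunk and returns the decoded text.
// ASSUMPTION: the stream yields UTF-8 bytes (Uint8Array chunks).
// onChunk is a hypothetical callback, e.g. one that appends text to the page.
async function readStreamInBrowser(stream, onChunk = () => {}) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let fullText = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true buffers multi-byte characters split across chunks.
      const piece = decoder.decode(value, { stream: true });
      fullText += piece;
      onChunk(piece);
    }
    fullText += decoder.decode(); // flush any bytes still buffered
    return fullText;
  } catch (err) {
    console.error("Streaming failed:", err);
    throw err;
  } finally {
    reader.releaseLock();
  }
}
```

Passing `{ stream: true }` to `decode()` matters when a multi-byte character is split between two chunks; without it, the partial bytes would be decoded as replacement characters.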
Key Points
Node.js: Convert the Web Stream using Readable.fromWeb() for compatibility.
Browser: Use getReader() and TextDecoder to process the stream.
Error Handling: Both methods use try…catch to handle potential errors.
Data Handling: Data chunks are processed as they arrive via data events or .read() calls.
Explore Specialized Models
Object Detection: Models trained to detect objects in images.
Fill Mask: Models designed to fill in missing parts of an image.
Image Classification: Models optimized for classifying images into categories.
Image-to-Text: Models that generate textual descriptions for images.