Chat
Vision
Use chat + vision models (image-as-input) to generate text-based responses.
Describe Images
Send text and image inputs to vision-enabled chat models to generate descriptions, compare images, and analyze visual content.
Quickstart
Describe an Image
Send an image with a text prompt to generate a description.
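As a rough sketch of what such a request might carry, the snippet below builds a single user message that pairs a text prompt with one image. The helper name `buildDescribeRequest`, the field names (`role`, `content`, `type`, `text`, `url`), and the example URL are all assumptions for illustration; the actual payload shape depends on the API or SDK you use.

```javascript
// Hypothetical helper: the message shape below is an assumption,
// not a confirmed API format. Adapt the field names to your SDK.
function buildDescribeRequest(prompt, imageUrl) {
  if (!prompt || !imageUrl) {
    throw new Error("Both a prompt and an image URL are required");
  }
  return [
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        { type: "image", url: imageUrl },
      ],
    },
  ];
}

const messages = buildDescribeRequest(
  "Describe this image.",
  "https://example.com/cat.jpg" // placeholder URL
);
console.log(JSON.stringify(messages, null, 2));
```

Keeping the text and image in one message lets the model read the prompt as an instruction about that specific image.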
Compare Two Images
Ask the model to compare multiple images.
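A comparison request could plausibly look like a single message with one text part followed by several image parts. This is only a sketch: `buildCompareRequest`, the field names, and the URLs are hypothetical and are not taken from any specific SDK.

```javascript
// Hypothetical helper: one text part followed by one image part per URL.
// The message shape is an assumption made for illustration.
function buildCompareRequest(prompt, imageUrls) {
  if (!Array.isArray(imageUrls) || imageUrls.length < 2) {
    throw new Error("At least two image URLs are needed for a comparison");
  }
  return [
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        ...imageUrls.map((url) => ({ type: "image", url })),
      ],
    },
  ];
}

const compareMessages = buildCompareRequest(
  "What are the differences between these two images?",
  ["https://example.com/a.jpg", "https://example.com/b.jpg"] // placeholder URLs
);
console.log(JSON.stringify(compareMessages, null, 2));
```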
Streaming
Get real-time responses when analyzing an image.
Node.js Version (Using Readable Stream)
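A minimal sketch of the Node.js approach, assuming the model response arrives as a Web `ReadableStream` of UTF-8 bytes (for example, the body of a `fetch` response) and that you are on Node.js 18 or later. The names `readWithNodeStream` and `fakeResponseStream` are hypothetical, and the fake stream only stands in for a real response so the example runs on its own.

```javascript
const { Readable } = require("node:stream");

// Converts a Web ReadableStream to a Node.js Readable and collects its text.
function readWithNodeStream(webStream) {
  return new Promise((resolve, reject) => {
    let text = "";
    try {
      const nodeStream = Readable.fromWeb(webStream);
      // Decode as UTF-8 so multi-byte characters split across chunks stay intact.
      nodeStream.setEncoding("utf8");
      nodeStream.on("data", (chunk) => {
        process.stdout.write(chunk); // print each chunk as it arrives
        text += chunk;
      });
      nodeStream.on("end", () => resolve(text));
      nodeStream.on("error", reject);
    } catch (err) {
      reject(err);
    }
  });
}

// Stand-in for a streamed response (placeholder data, not real model output).
function fakeResponseStream(parts) {
  const encoder = new TextEncoder();
  return new ReadableStream({
    start(controller) {
      for (const part of parts) controller.enqueue(encoder.encode(part));
      controller.close();
    },
  });
}

async function main() {
  try {
    const stream = fakeResponseStream(["A cat ", "sitting on ", "a sofa."]);
    const text = await readWithNodeStream(stream);
    console.log("\nFull text:", text);
  } catch (err) {
    console.error("Stream failed:", err);
  }
}

main();
```

In real use, you would pass the streamed response body in place of `fakeResponseStream(...)`.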
Browser Version (Using getReader())
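A minimal sketch of the browser approach, assuming the response is a Web `ReadableStream` of UTF-8 bytes such as `response.body` from `fetch`. The function name `readWithReader` and the locally built demo stream are hypothetical; the demo stream only stands in for a real response so the example is self-contained.

```javascript
// Reads a Web ReadableStream chunk by chunk using getReader() and TextDecoder.
async function readWithReader(webStream, onChunk) {
  const reader = webStream.getReader();
  const decoder = new TextDecoder("utf-8");
  let text = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters intact across chunk boundaries.
      const piece = decoder.decode(value, { stream: true });
      text += piece;
      if (onChunk) onChunk(piece);
    }
    text += decoder.decode(); // flush any bytes still buffered in the decoder
    return text;
  } catch (err) {
    console.error("Stream failed:", err);
    throw err;
  } finally {
    reader.releaseLock();
  }
}

// Stand-in for a streamed response (placeholder data, not real model output).
const demoStream = new ReadableStream({
  start(controller) {
    const encoder = new TextEncoder();
    controller.enqueue(encoder.encode("Two dogs "));
    controller.enqueue(encoder.encode("playing in a park."));
    controller.close();
  },
});

readWithReader(demoStream, (piece) => console.log("chunk:", piece))
  .then((text) => console.log("Full text:", text))
  .catch(() => {}); // the error was already logged inside readWithReader
```

The `onChunk` callback is where a page would typically append each piece of text to the UI as it arrives.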
Key Points
Node.js: Convert the Web Stream using Readable.fromWeb() for compatibility.
Browser: Use getReader() and TextDecoder to process the stream.
Error Handling: Both methods use try...catch to handle potential errors.
Data Handling: Data chunks are processed as they arrive, via data events (Node.js) or .read() calls (browser).
Explore Specialized Models
Object Detection: Models trained to detect objects in images.
Fill Mask: Models designed to fill in missing parts of an image.
Image Classification: Models optimized for classifying images into categories.
Image-to-Text: Models that generate textual descriptions for images.