Describe Images

Send text and image inputs to vision-enabled chat models to generate descriptions, compare images, and analyze visual content.

Quickstart

Describe an Image

Send an image with a text prompt to generate a description.

import Bytez from "bytez.js";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const textInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "What is this image?" },
      { type: "image", url: "https://hips.hearstapps.com/hmg-prod/images/how-to-keep-ducks-call-ducks-1615457181.jpg?crop=0.670xw:1.00xh;0.157xw,0&resize=980:*" }
    ]
  }
];

const { error, output } = await model.run(textInput);

if (error) {
  console.error("Error running the model:", error);
} else {
  console.log(output);
}

Compare Two Images

Ask the model to compare multiple images.

import Bytez from "bytez.js";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const multiImageInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "Compare these images." },
      { type: "image", url: "https://example.com/path-to-image1.jpg" },
      { type: "image", url: "https://example.com/path-to-image2.jpg" }
    ]
  }
];

const { error, output } = await model.run(multiImageInput);

if (error) {
  console.error("Error running the model:", error);
} else {
  console.log(output);
}

Streaming

Get real-time responses when analyzing an image.

javascript
const stream = await model.run(textInput, params, true);

Node.js Version (Using Readable Stream)

javascript
import Bytez from "bytez.js";
import { Readable } from "stream";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const multiImageInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "Compare these images." },
      { type: "image", url: "https://example.com/path-to-image1.jpg" },
      { type: "image", url: "https://example.com/path-to-image2.jpg" }
    ]
  }
];

const params = { max_new_tokens: 100 };

// Stream response
const stream = await model.run(multiImageInput, params, true);

try {
  const readableStream = Readable.fromWeb(stream); // Convert Web Stream to Node.js Readable Stream
  for await (const chunk of readableStream) {
    console.log(chunk.toString()); // Log each chunk
  }
  console.log("Streaming ended.");
} catch (error) {
  console.error("Streaming error:", error);
}

Browser Version (Using `getReader()`)

javascript
import Bytez from "bytez.js";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const multiImageInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "Compare these images." },
      { type: "image", url: "https://example.com/path-to-image1.jpg" },
      { type: "image", url: "https://example.com/path-to-image2.jpg" }
    ]
  }
];

const params = { max_new_tokens: 100 };

// Stream response
const stream = await model.run(multiImageInput, params, true);

try {
  const reader = stream.getReader(); // Get a reader for the Web Stream

  while (true) {
    const { done, value } = await reader.read(); // Read chunk-by-chunk
    if (done) break; // Exit when the stream ends
    console.log(new TextDecoder().decode(value)); // Convert Uint8Array to string
  }

  console.log("Streaming ended.");
} catch (error) {
  console.error("Streaming error:", error);
}

Key Points

Node.js: Convert the Web Stream using Readable.fromWeb() for compatibility.
Browser: Use getReader() and TextDecoder to process the stream.
Error Handling: Both methods use try…catch to handle potential errors.
Data Handling: Data chunks are processed as they arrive via data events or .read() calls.

Explore Specialized Models

Object Detection: Models trained to detect objects in images.
Fill Mask: Models designed to fill in missing parts of an image.
Image Classification: Models optimized for classifying images into categories.
Image-to-Text: Models that generate textual descriptions for images.

Model API

Get Started

Capabilities

Integrations

Vision

Describe Images

Quickstart

Describe an Image

Compare Two Images

Streaming

Node.js Version (Using Readable Stream)

Browser Version (Using `getReader()`)

Key Points

Explore Specialized Models

Model API

Get Started

Capabilities

Integrations

​Describe Images

​Quickstart

​Describe an Image

​Compare Two Images

​Streaming

​Node.js Version (Using Readable Stream)

​Browser Version (Using getReader())

​Key Points

​Explore Specialized Models

Describe Images

Quickstart

Describe an Image

Compare Two Images

Streaming

Node.js Version (Using Readable Stream)

Browser Version (Using `getReader()`)

Key Points

Explore Specialized Models