Describe Images

Send text and image inputs to vision-enabled chat models to generate descriptions, compare images, and analyze visual content.

Quickstart

Describe an Image

Send an image with a text prompt to generate a description.

import Bytez from "bytez.js";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const textInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "What is this image?" },
      { type: "image", url: "https://hips.hearstapps.com/hmg-prod/images/how-to-keep-ducks-call-ducks-1615457181.jpg?crop=0.670xw:1.00xh;0.157xw,0&resize=980:*" }
    ]
  }
];

const { error, output } = await model.run(textInput);

if (error) {
  console.error("Error running the model:", error);
} else {
  console.log(output);
}
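
If you would rather handle failures with exceptions than check `error` after every call, the result can be wrapped. The following is a minimal sketch, assuming only the `{ error, output }` return shape shown above; `runOrThrow` and `fakeModel` are hypothetical names used for illustration and are not part of bytez.js.

```javascript
// Hypothetical helper (not part of bytez.js): runs a model and throws
// if the result carries an error, otherwise returns the output.
async function runOrThrow(model, input) {
  const { error, output } = await model.run(input);
  if (error) {
    const detail = typeof error === "string" ? error : JSON.stringify(error);
    throw new Error(`Model run failed: ${detail}`);
  }
  return output;
}

// Stand-in object so the sketch runs without a key or network access.
// In real use, pass the object returned by client.model(...).
const fakeModel = {
  run: async () => ({ error: null, output: "placeholder output" })
};

runOrThrow(fakeModel, [])
  .then((output) => console.log(output))
  .catch((err) => console.error(err.message));
```

With a real model, `runOrThrow(model, textInput)` would slot in where `model.run(textInput)` is called above.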


Compare Two Images

Add one image entry per image to the user message's content array, then ask the model to compare them.

import Bytez from "bytez.js";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const multiImageInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "Compare these images." },
      { type: "image", url: "https://example.com/path-to-image1.jpg" },
      { type: "image", url: "https://example.com/path-to-image2.jpg" }
    ]
  }
];

const { error, output } = await model.run(multiImageInput);

if (error) {
  console.error("Error running the model:", error);
} else {
  console.log(output);
}
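
When the number of images varies, the message array can be built programmatically. This is a small sketch that reproduces the message shape used in the examples above; `buildImageMessages` is a hypothetical helper, not a bytez.js function, and the default system prompt is simply the one from this page.

```javascript
// Hypothetical helper (not part of bytez.js): builds a system + user message
// pair with one text entry followed by one image entry per URL.
function buildImageMessages(prompt, imageUrls, systemText = "You are a helpful assistant.") {
  if (!Array.isArray(imageUrls) || imageUrls.length === 0) {
    throw new Error("Provide at least one image URL.");
  }
  return [
    { role: "system", content: [{ type: "text", text: systemText }] },
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        ...imageUrls.map((url) => ({ type: "image", url }))
      ]
    }
  ];
}

const messages = buildImageMessages("Compare these images.", [
  "https://example.com/path-to-image1.jpg",
  "https://example.com/path-to-image2.jpg"
]);
console.log(JSON.stringify(messages, null, 2));
```

The resulting array can be passed to `model.run()` in place of a hand-written input.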

Streaming

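Pass true as the third argument to model.run() to receive the response as a stream, so output can be processed as it is generated instead of after the full response is ready.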

const params = { max_new_tokens: 100 };
const stream = await model.run(textInput, params, true); // Third argument enables streaming

Node.js Version (Using Readable Stream)

import Bytez from "bytez.js";
import { Readable } from "stream";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const multiImageInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "Compare these images." },
      { type: "image", url: "https://example.com/path-to-image1.jpg" },
      { type: "image", url: "https://example.com/path-to-image2.jpg" }
    ]
  }
];

const params = { max_new_tokens: 100 };

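try {
  // Request the stream inside try so failures from the request are caught too
  const stream = await model.run(multiImageInput, params, true);

  const readableStream = Readable.fromWeb(stream); // Convert Web Stream to Node.js Readable Stream
  readableStream.setEncoding("utf8"); // Decode as UTF-8 so characters split across chunks stay intact

  for await (const chunk of readableStream) {
    process.stdout.write(chunk); // Print each chunk without inserting extra newlines
  }
  console.log("\nStreaming ended.");
} catch (error) {
  console.error("Streaming error:", error);
}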

Browser Version (Using getReader())

import Bytez from "bytez.js";

const client = new Bytez("YOUR_BYTEZ_KEY_HERE");
const model = client.model("meta-llama/Llama-3.2-11B-Vision-Instruct");

const multiImageInput = [
  {
    role: "system",
    content: [{ type: "text", text: "You are a helpful assistant." }]
  },
  {
    role: "user",
    content: [
      { type: "text", text: "Compare these images." },
      { type: "image", url: "https://example.com/path-to-image1.jpg" },
      { type: "image", url: "https://example.com/path-to-image2.jpg" }
    ]
  }
];

const params = { max_new_tokens: 100 };

try {
  // Request the stream inside try so failures from the request are caught too
  const stream = await model.run(multiImageInput, params, true);

  const reader = stream.getReader(); // Get a reader for the Web Stream
  const decoder = new TextDecoder(); // Reuse one decoder so characters split across chunks decode correctly

  while (true) {
    const { done, value } = await reader.read(); // Read chunk-by-chunk
    if (done) break; // Exit when the stream ends
    console.log(decoder.decode(value, { stream: true })); // Convert Uint8Array to string
  }

  console.log("Streaming ended.");
} catch (error) {
  console.error("Streaming error:", error);
}

Key Points

  • Node.js: Convert the Web Stream using Readable.fromWeb() for compatibility.
  • Browser: Use getReader() and TextDecoder to process the stream.
  • Error Handling: Both methods use try…catch to handle potential errors.
  • Data Handling: Chunks are processed as they arrive, via for await iteration in Node.js or reader.read() calls in the browser.
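
To keep the full response as well as display it incrementally, the chunks can be accumulated while they are read. The sketch below assumes the stream is a Web ReadableStream of UTF-8 bytes, as the examples above imply; `readStreamAsText` is a hypothetical helper, not part of bytez.js. The demo feeds it a locally constructed stream, so it runs in browsers and in Node.js 18+ without a key.

```javascript
// Hypothetical helper (not part of bytez.js): reads a Web ReadableStream of
// bytes to the end, calls onChunk for each decoded piece, and returns the full text.
async function readStreamAsText(stream, onChunk = () => {}) {
  const reader = stream.getReader();
  const decoder = new TextDecoder(); // One decoder for the whole stream
  let text = "";
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true buffers incomplete multi-byte sequences until the next chunk
      const piece = decoder.decode(value, { stream: true });
      text += piece;
      onChunk(piece);
    }
    const tail = decoder.decode(); // Flush anything still buffered
    if (tail) {
      text += tail;
      onChunk(tail);
    }
  } finally {
    reader.releaseLock();
  }
  return text;
}

// Demo stream that deliberately splits the two-byte character "é" across chunks.
const bytes = new TextEncoder().encode("héllo");
const demoStream = new ReadableStream({
  start(controller) {
    controller.enqueue(bytes.slice(0, 2));
    controller.enqueue(bytes.slice(2));
    controller.close();
  }
});

readStreamAsText(demoStream).then((text) => console.log(text)); // prints "héllo"
```

With a real model, the stream returned by `model.run(input, params, true)` would take the place of `demoStream`.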
