Chat + Audio (Audio as Input)

The Bytez API lets chat models process audio files alongside text. This enables tasks such as transcription, voice command recognition, answering questions about sounds, and conversational AI with spoken input.

The examples below use the REST API and the JavaScript SDK.

Code

Text + Audio

This example sends an audio file along with a text prompt for analysis.
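
A minimal sketch of how such an input might be assembled, assuming the chat input is a list of messages whose content mixes text and audio parts. The helper name buildAudioMessages, the part shapes ({ type: 'text', text } and { type: 'audio', url }), and the sample URL are assumptions made for illustration, not confirmed details of the Bytez API; check the API reference for the exact schema.

```javascript
// Hypothetical helper: builds a chat input that pairs a text prompt with an audio file.
// ASSUMPTION: the message and content-part shapes below are illustrative, not a confirmed schema.
function buildAudioMessages(prompt, audioUrl) {
  if (typeof prompt !== 'string' || prompt.length === 0) {
    throw new TypeError('prompt must be a non-empty string');
  }
  // new URL() throws a TypeError if audioUrl is not a valid absolute URL
  const url = new URL(audioUrl).href;

  return [
    {
      role: 'user',
      content: [
        { type: 'text', text: prompt },
        { type: 'audio', url },
      ],
    },
  ];
}

// Placeholder URL for illustration only
const messages = buildAudioMessages(
  'Transcribe this recording.',
  'https://example.com/sample.wav'
);
console.log(JSON.stringify(messages, null, 2));
```

Under the same assumption, the resulting array would be passed as the first argument to model.run().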

Streaming

Streaming allows you to receive model outputs incrementally as soon as they are available, which is ideal for tasks like real-time responses or large outputs.

How Streaming Works

To enable streaming, pass true as the third argument to model.run(). The call then resolves to a Web Stream that you can read chunk by chunk instead of waiting for the complete output.

javascript
const stream = await model.run(textInput, params, true);

Node.js Example

javascript
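const { Readable } = require('stream');

// Run this inside an async function: CommonJS modules do not support top-level await
const stream = await model.run(textInput, params, true);

try {
  const readableStream = Readable.fromWeb(stream); // Convert Web Stream to Node.js Readable Stream
  readableStream.setEncoding('utf8'); // Decode multi-byte characters correctly even when split across chunks
  for await (const chunk of readableStream) {
    console.log(chunk); // Each chunk arrives as a decoded string
  }
} catch (error) {
  console.error(error); // Handle errors
}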

Browser Example

javascript
const stream = await model.run(textInput, params, true);

try {
  const reader = stream.getReader(); // Get a reader for the Web Stream
  const decoder = new TextDecoder(); // Reuse one decoder for the whole stream

  while (true) {
    const { done, value } = await reader.read(); // Read the stream chunk-by-chunk
    if (done) break; // Exit when the stream ends
    // { stream: true } keeps multi-byte characters intact when they span two chunks
    console.log(decoder.decode(value, { stream: true })); // Convert Uint8Array to string
  }
} catch (error) {
  console.error(error); // Handle errors
}

Key Points

  • Node.js: Convert the Web Stream using Readable.fromWeb() for compatibility.
  • Browser: Use getReader() and TextDecoder to process the stream.
  • Error Handling: Both examples wrap the read loop in try...catch to handle potential errors.
  • Data Handling: Data chunks are processed as they arrive, via for await...of in Node.js or reader.read() calls in the browser.
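
The points above can be combined into one helper that works in both environments, since Node.js 18 and later also expose getReader() and TextDecoder for Web Streams. This is a sketch, not part of the Bytez SDK: readStreamAsText and onChunk are hypothetical names, and the stream is assumed to yield Uint8Array chunks of UTF-8 text, as in the browser example. The demo feeds the helper a locally built stream in place of a real model.run() call.

```javascript
// Hypothetical helper (not part of the Bytez SDK): reads a Web ReadableStream
// of UTF-8 bytes, calls onChunk for each decoded piece, and resolves to the full text.
async function readStreamAsText(stream, onChunk = () => {}) {
  const reader = stream.getReader();
  const decoder = new TextDecoder(); // one decoder for the whole stream
  let text = '';
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // { stream: true } buffers incomplete multi-byte sequences until the next chunk
      const piece = decoder.decode(value, { stream: true });
      if (piece) onChunk(piece);
      text += piece;
    }
    const tail = decoder.decode(); // flush anything still buffered
    if (tail) onChunk(tail);
    return text + tail;
  } finally {
    reader.releaseLock(); // let other code read or cancel the stream
  }
}

// Local stand-in for a streamed response; "é" (2 bytes) is split across two chunks.
const bytes = new TextEncoder().encode('café ok');
const demoStream = new ReadableStream({
  start(controller) {
    controller.enqueue(bytes.slice(0, 4)); // "caf" plus the first byte of "é"
    controller.enqueue(bytes.slice(4));    // second byte of "é", then " ok"
    controller.close();
  },
});

// demoResult is a Promise that resolves to the complete decoded text
const demoResult = readStreamAsText(demoStream, (piece) => console.log(piece));
```

Sharing a single decoder across reads matters for non-ASCII output: decoding each chunk independently can turn a character whose bytes straddle two chunks into replacement characters.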

Explore Specialized Models

You might also be interested in pretrained models for tasks like: