Automatic Speech Recognition

Convert spoken language into written text for transcription services, voice assistants, and accessibility features.

Send an audio file to an ASR model to generate text.

import Bytez from 'bytez.js'

// Replace "BYTEZ_KEY" with your own API key
const sdk = new Bytez("BYTEZ_KEY");
// Select an ASR model by its ID
const model = sdk.model("facebook/data2vec-audio-base-960h");

// Create the model before running it
await model.create();

// URL of the audio file to transcribe
const audio = "https://huggingface.co/datasets/huggingfacejs/tasks/resolve/main/automatic-speech-recognition/input.flac";
const { error, output } = await model.run({ url: audio });

console.log({ error, output });
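
The example above returns an `{ error, output }` pair. As a minimal sketch of one way to handle it, the helper below throws when `error` is set and returns `output` otherwise. `unwrapResult` is a hypothetical name, not part of bytez.js, and the sketch assumes `error` is empty (null or undefined) on success. The sample values are stand-ins, not real model output.

```javascript
// Hypothetical helper (not part of bytez.js): convert an { error, output }
// pair into either a returned value or a thrown Error.
// Assumption: `error` is null/undefined when the request succeeded.
function unwrapResult({ error, output }) {
  if (error) {
    throw new Error(`ASR request failed: ${String(error)}`);
  }
  return output;
}

// Stand-in result for illustration; a real one would come from model.run(...).
const transcript = unwrapResult({ error: null, output: "hello world" });
console.log(transcript);
```

In practice you would pass the result of `await model.run({ url: audio })` straight into the helper and wrap the call in `try`/`catch`.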

You can send the audio via a URL or a base64 data URL.

We recommend a URL for better performance, as base64 encoding increases the payload size by roughly one third.
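
To make the size trade-off concrete, here is a minimal Node.js sketch that builds a base64 data URL from raw audio bytes and computes the encoded length. `toDataUrl` and `base64Length` are hypothetical helpers, not part of bytez.js, and the `audio/flac` MIME type and the zero-filled sample bytes are assumptions for illustration. The sketch stops short of sending the data URL, since the request field for base64 input is not shown here.

```javascript
// Hypothetical helper (not part of bytez.js): build a base64 data URL
// from raw bytes. Buffer is a Node.js global, so no import is needed.
function toDataUrl(bytes, mimeType) {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Base64 encodes every 3 input bytes as 4 output characters (with padding),
// so the encoded length is 4 * ceil(n / 3).
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Stand-in for real audio data; in practice, read the bytes from a file.
const sampleBytes = new Uint8Array(300);
const dataUrl = toDataUrl(sampleBytes, "audio/flac");

console.log(dataUrl.slice(0, 23));             // "data:audio/flac;base64,"
console.log(base64Length(sampleBytes.length)); // 400
```

For the 300-byte sample, the base64 body is 400 characters, which is the one-third overhead that makes a plain URL the lighter option.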
