Multi-input
Visual Question Answering
Answer a question about an image using the Vilt_fine_tune_2000 model.
POST
Authorizations
Provide your API key in the Authorization header, in the form Key your-key-here.
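As a minimal sketch of the header described above, the Python helper below builds the Authorization value in the Key your-key-here form. The function name build_headers is hypothetical, and the Content-Type entry is an assumption based on the application/json body type listed for this endpoint.

```python
def build_headers(api_key: str) -> dict:
    """Build request headers for this endpoint (illustrative helper, not part of any SDK)."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Documented scheme: the literal word "Key", a space, then the API key.
        "Authorization": f"Key {api_key}",
        # Assumption: the body is sent as JSON, matching the documented body type.
        "Content-Type": "application/json",
    }
```

Pass the returned dictionary as the headers argument of whichever HTTP client you use.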
Body
application/json
URL of the image.
Example:
"https://ocean.si.edu/sites/default/files/styles/3_2_largest/public/2023-11/Screen_Shot_2018-04-16_at_1_42_56_PM.png.webp?itok=Icvi-ek9"
The question to answer.
Example:
"What kind of animal is this?"
Base64-encoded image data.
Example:
"/9j/4AAQSkZJRgABAQAAAQABAAD..."
Response
200 - application/json
Successful response with the answer.
The answer to the question.
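The request body and response handling can be sketched in Python as follows. This page does not show the JSON parameter names, so image_url, question, image_base64, and answer are hypothetical placeholders; substitute the names from the full reference. It is also not stated whether the image URL and the Base64 data are alternatives or may be sent together, so the sketch includes whichever is supplied. No network call is made, because the endpoint URL is not given here.

```python
import base64
import json
from typing import Optional


def build_body(question: str,
               image_url: Optional[str] = None,
               image_bytes: Optional[bytes] = None) -> str:
    """Serialize a request body. All field names are hypothetical placeholders."""
    if image_url is None and image_bytes is None:
        raise ValueError("provide image_url or image_bytes")
    body = {"question": question}  # hypothetical field name
    if image_url is not None:
        body["image_url"] = image_url  # hypothetical field name
    if image_bytes is not None:
        # Raw image bytes are Base64-encoded into an ASCII string.
        body["image_base64"] = base64.b64encode(image_bytes).decode("ascii")  # hypothetical field name
    return json.dumps(body)


def extract_answer(response_text: str, field: str = "answer") -> str:
    """Read the answer from a 200 response. The default field name is an assumption."""
    data = json.loads(response_text)
    if field not in data:
        raise KeyError(f"response has no '{field}' field")
    return data[field]
```

A JPEG file begins with the bytes FF D8 FF, which is why Base64-encoded JPEG data starts with /9j/ as in the example above.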