Text-to-Speech

Use text-to-speech to generate audio from text input.

POST /v1/audio/speech

Request

curl https://api.voxvey.com/v1/audio/speech \
  -H "Authorization: Bearer $VOXVEY_TOKEN" \
  -H "Content-Type: application/json" \
  -o speech.mp3 \
  -d '{
    "model": "openai/gpt-4o-mini-tts",
    "input": "Welcome to Voxvey.",
    "voice": "alloy"
  }'

Required fields

Field   Type    Notes
model   string  Provider-prefixed speech model ID
input   string  Text to render as speech
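
As a rough sketch, a request that carries only the two required fields could be wrapped in a small shell helper. The function names json_escape, speech_payload, and speak are hypothetical and not part of the API. The escaping covers only backslashes and double quotes, so text containing newlines or other control characters would need a real JSON tool. This page does not say what happens when voice is omitted, so treat that behavior as unknown.

```shell
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
# Newlines and other control characters are NOT handled here.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build a body holding only the required fields.
# $1 = provider-prefixed model ID, $2 = text to render as speech
speech_payload() {
  printf '{"model":"%s","input":"%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Hypothetical helper: send the request and save the audio to a file.
# $1 = model ID, $2 = text, $3 = output path
# --fail makes curl exit non-zero on an HTTP error instead of saving the
# error response as if it were audio.
speak() {
  curl --fail --silent --show-error https://api.voxvey.com/v1/audio/speech \
    -H "Authorization: Bearer $VOXVEY_TOKEN" \
    -H "Content-Type: application/json" \
    -o "$3" \
    -d "$(speech_payload "$1" "$2")"
}
```

With VOXVEY_TOKEN exported, a call such as speak "openai/gpt-4o-mini-tts" "Welcome to Voxvey." speech.mp3 would mirror the request shown earlier, minus the voice field.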

Notes

  • OpenAI-compatible speech models return raw audio bytes in the response body, which is why the example request writes the output to a file with -o.
  • xai/grok-tts is supported for xAI text-to-speech.
  • Gemini TTS models are adapted by the gateway when available.
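
Because a successful response is audio bytes rather than JSON, a client may want to confirm what it received before treating the file as audio. The sketch below is one way to do that with curl's content_type write-out variable. It assumes the gateway labels audio responses with an audio/* Content-Type, which this page does not state. The function names is_audio_type and fetch_speech are hypothetical.

```shell
# Succeeds when a Content-Type value names an audio media type.
is_audio_type() {
  case "$1" in
    audio/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: $1 = output path, $2 = JSON request body.
# Saves the response body to the output path and checks the reported type.
# Assumption: successful responses carry an audio/* Content-Type.
fetch_speech() {
  ctype=$(curl --silent --show-error https://api.voxvey.com/v1/audio/speech \
    -H "Authorization: Bearer $VOXVEY_TOKEN" \
    -H "Content-Type: application/json" \
    -o "$1" -w '%{content_type}' \
    -d "$2") || return 1
  if is_audio_type "$ctype"; then
    return 0
  fi
  # The saved file then holds whatever the server sent, likely an error body.
  echo "unexpected content type: $ctype (see $1)" >&2
  return 1
}
```

For example, fetch_speech speech.mp3 '{"model": "xai/grok-tts", "input": "Welcome to Voxvey."}' would save the response and return non-zero if it does not look like audio.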