Google TTS Studio

Configure a Gemini text-to-speech request, preview the output, and reuse the same payload through the public API.

Model default: gemini-3.1-flash-tts-preview

API: /api/tts

Request

Single or multi-speaker TTS with the same payload used by the API.

Model

Primary voice

Style prompt

Script

Uses the same validated payload as POST /api/tts.

API snippets

Use the same request shape from another project or from a terminal.

curl

curl -X POST https://your-domain.vercel.app/api/tts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TTS_API_TOKEN" \
  -d '{"mode":"single","model":"gemini-3.1-flash-tts-preview","prompt":"Warm, polished, and slightly upbeat. Keep the pacing clear and natural.","text":"Say in a warm tone: \"Hello, welcome to the demo.\"","voiceName":"Kore","speakers":[],"responseFormat":"wav"}' \
  --output out.wav

fetch

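// A relative URL works for same-origin calls; use the full deployment URL elsewhere.
const response = await fetch('/api/tts', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_TTS_API_TOKEN',
  },
  body: JSON.stringify({
    "mode": "single",
    "model": "gemini-3.1-flash-tts-preview",
    "prompt": "Warm, polished, and slightly upbeat. Keep the pacing clear and natural.",
    "text": "Say in a warm tone: \"Hello, welcome to the demo.\"",
    "voiceName": "Kore",
    "speakers": [],
    "responseFormat": "wav"
  }),
});

// Stop early on an HTTP error instead of treating the error body as audio.
if (!response.ok) {
  throw new Error(`TTS request failed with status ${response.status}`);
}

// The default response is binary WAV audio; wrap it in an object URL for playback or download.
const audioBlob = await response.blob();
const url = URL.createObjectURL(audioBlob);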
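
When the call comes from another project rather than the same origin, the relative `/api/tts` path has no origin to resolve against. The sketch below shows one way to build the same request against an absolute deployment URL. The helper names `buildTtsRequest` and `requestSpeech` are hypothetical, and `baseUrl` and `token` are placeholders for your own deployment and `TTS_API_TOKEN` value.

```javascript
// Hypothetical helper: builds the URL and fetch options for POST /api/tts.
function buildTtsRequest(baseUrl, token, payload) {
  return {
    // Resolve the route against the deployment origin, e.g. https://your-domain.vercel.app
    url: new URL('/api/tts', baseUrl).toString(),
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(payload),
    },
  };
}

// Hypothetical helper: sends the request and returns the audio bytes.
async function requestSpeech(baseUrl, token, payload) {
  const { url, init } = buildTtsRequest(baseUrl, token, payload);
  const response = await fetch(url, init);
  if (!response.ok) {
    // The error body format is not documented on this page, so surface it as plain text.
    throw new Error(`TTS request failed: ${response.status} ${await response.text()}`);
  }
  // With "responseFormat": "wav" the body is binary audio.
  return new Uint8Array(await response.arrayBuffer());
}
```

In Node, the returned bytes can be written to disk with `fs.promises.writeFile('out.wav', bytes)`, which mirrors the `--output out.wav` flag in the curl example.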

Binary / JSON modes

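The API returns binary WAV audio by default. It returns JSON instead when the request asks for it, as in the example below, which sets both the `format=json` query parameter and `"responseFormat": "json"`.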

json mode

POST /api/tts?format=json

{
  "mode": "single",
  "model": "gemini-3.1-flash-tts-preview",
  "text": "...",
  "responseFormat": "json"
}
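
Because one route can answer with either binary audio or JSON, a client may want to branch on the response type. The following is a minimal sketch that assumes the server labels JSON responses with an `application/json` Content-Type header, which this page does not confirm. The function names are hypothetical, and the JSON fields are returned untouched because their shape is not documented here.

```javascript
// Hypothetical helper: true when a Content-Type header value denotes JSON.
function isJsonContentType(contentType) {
  // Header values may carry parameters, e.g. "application/json; charset=utf-8".
  return (
    typeof contentType === 'string' &&
    contentType.split(';')[0].trim().toLowerCase() === 'application/json'
  );
}

// Hypothetical helper: reads a fetch Response as parsed JSON or as an audio Blob.
async function readTtsResponse(response) {
  if (isJsonContentType(response.headers.get('content-type'))) {
    // Assumption: JSON mode sets an application/json Content-Type.
    return { kind: 'json', data: await response.json() };
  }
  // Anything else is treated as binary audio.
  return { kind: 'audio', blob: await response.blob() };
}
```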

Set `TTS_API_TOKEN` before exposing the API publicly. Same-origin browser calls still work without extra CORS config.
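
For illustration, a server-side bearer-token check could look like the sketch below. This is not the app's actual implementation: `isAuthorized` is a hypothetical name, and the choice to leave the route open when no token is configured is an assumption inferred from the note above.

```javascript
// Hypothetical check of an Authorization header against the configured token.
function isAuthorized(authorizationHeader, expectedToken) {
  // Assumption: with no token configured, the route is left open.
  if (!expectedToken) return true;
  if (typeof authorizationHeader !== 'string') return false;
  // Accept the "Bearer <token>" form used in the curl and fetch snippets.
  const match = /^Bearer\s+(.+)$/i.exec(authorizationHeader.trim());
  return match !== null && match[1] === expectedToken;
}
```

A route handler would call it with the incoming header and `process.env.TTS_API_TOKEN`, and respond with a 401 status when it returns `false`. Production code would likely prefer a constant-time comparison such as Node's `crypto.timingSafeEqual` over `===`.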

Output

Preview, download, and inspect the last successful generation.

No audio yet. Generate once to see the waveform preview and the downloadable WAV output.

Payload preview

What the API will receive when you press Generate.

{
  "mode": "single",
  "model": "gemini-3.1-flash-tts-preview",
  "prompt": "Warm, polished, and slightly upbeat. Keep the pacing clear and natural.",
  "text": "Say in a warm tone: \"Hello, welcome to the demo.\"",
  "voiceName": "Kore",
  "speakers": [],
  "responseFormat": "wav"
}
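
The validation rules applied to this payload are not listed on this page. As a rough sketch of what a client-side pre-check might look like, the function below uses only field names and values that appear in the examples here. `validateTtsPayload` and its specific rules are assumptions, not the API's real validator.

```javascript
// Response formats that appear in the examples on this page.
const RESPONSE_FORMATS = ['wav', 'json'];

// Hypothetical pre-check: returns a list of problems, empty when the payload looks usable.
function validateTtsPayload(payload) {
  if (typeof payload !== 'object' || payload === null) {
    return ['payload must be an object'];
  }
  const errors = [];
  if (typeof payload.mode !== 'string' || payload.mode === '') {
    errors.push('mode must be a non-empty string');
  }
  if (typeof payload.text !== 'string' || payload.text.trim() === '') {
    errors.push('text must be a non-empty string');
  }
  // speakers and responseFormat are treated as optional, since the JSON-mode example omits speakers.
  if (payload.speakers !== undefined && !Array.isArray(payload.speakers)) {
    errors.push('speakers must be an array');
  }
  if (
    payload.responseFormat !== undefined &&
    !RESPONSE_FORMATS.includes(payload.responseFormat)
  ) {
    errors.push('responseFormat must be "wav" or "json"');
  }
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a form show every issue at once.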

Server action + API route