Gemini 2.5 Flash
Google Gemini 2.5 Flash TTS — fast, natural multilingual synthesis.
Fast, natural multilingual Gemini synthesis — a single voice speaks any language with consistent rhythm. A good balance of speed and quality for everyday narration.
| Property | Value |
|---|---|
| Endpoint | POST /api/v1/tts/synthesize |
| Model ID (ttsModel) | gemini-2-5-flash |
| Provider | Google Gemini (paid access) |
| Character Limit | 3,000 |
| Cost | 700 / 1K chars |
Shared conventions
Auth, the voice→provider rule, rate limits, and the full error table live on the Speech API page. Below is only what's specific to Gemini 2.5 Flash.
Model-specific notes
- Steer delivery with `style_instructions` (free-form, e.g. "warm, encouraging tone").
- Gemini requests pass through a fair-use queue (Pro/Max get priority); when it's full you get `503 SERVER_BUSY`.
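
Because a full queue surfaces as `503 SERVER_BUSY`, a client will usually want to wait and try again rather than fail outright. The sketch below shows one common approach, exponential backoff with jitter. It is not an official client: `send` is a hypothetical callable that performs the request and returns `(status_code, body)`, and the attempt count and delays are illustrative assumptions, not documented limits.

```python
import random
import time


def synthesize_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-503 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning
    (status_code, body). Only 503 is retried; every other status is
    returned to the caller unchanged.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            # Double the wait on each retry and add jitter so that many
            # clients do not hit the queue again at the same moment.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. Other errors such as 429 are deliberately left to the caller, since their handling is defined on the shared Speech API page.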
When `style_instructions` applies
Custom instructions are used only when both `style_instructions_enabled` is true and the instruction is at least 30 characters. Shorter instructions are ignored and the voice's default delivery is used.
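
The rule above can be checked client-side before sending, so that an instruction that would be silently ignored is caught early. This is a minimal sketch: the helper name is hypothetical, and it assumes the server counts characters the same way Python's `len` does (whether leading or trailing whitespace counts is not stated here).

```python
from typing import Optional

MIN_STYLE_INSTRUCTION_CHARS = 30  # threshold stated in the rule above


def style_instructions_apply(enabled: bool, instructions: Optional[str]) -> bool:
    """Return True if the custom delivery instruction would take effect."""
    if not enabled or instructions is None:
        return False
    return len(instructions) >= MIN_STYLE_INSTRUCTION_CHARS
```

Note that a short instruction such as "warm, encouraging tone" is only 22 characters, so on its own it falls under the threshold and the voice's default delivery would be used.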
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Text to synthesize (max 3,000 chars) |
| voice | string | Yes | A Gemini voice ID from GET /api/v1/tts/voices |
| ttsModel | string | Yes | Must be "gemini-2-5-flash" |
| style_instructions | string | No | Free-form delivery instruction (≥ 30 chars to take effect) |
| style_instructions_enabled | boolean | No | Activate style_instructions · Default: false |
| speed | number | No | Speaking rate · Default: 1.0 |
| title | string | No | Title for the saved project |
{
"text": "Your audio summary is ready. Tap to listen.",
"voice": "voice-uuid-gemini",
"ttsModel": "gemini-2-5-flash"
}

Response (200 OK)
{
"url": "https://cdn.sonnalabs.app/sonna/api-ephemeral/tts/paid/user123/jkl012.mp3",
"remainingCredits": 101230,
"projectCreated": true
}

Errors (400 TEXT_TOO_LONG, 403 PAID_ACCESS_REQUIRED, 409, 429, 503 SERVER_BUSY when the queue is full) follow the shared Speech API table.
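
To tie the request body and endpoint together, here is a rough sketch of building the call with Python's standard library. Several parts are assumptions: `BASE_URL` is a placeholder host, `build_request` is a hypothetical helper, and authentication headers are passed in by the caller because the auth scheme is defined on the Speech API page rather than here. The local length check also assumes the 3,000-character limit is counted like Python's `len`.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real one
MODEL_ID = "gemini-2-5-flash"
MAX_CHARS = 3000
OPTIONAL_FIELDS = {"style_instructions", "style_instructions_enabled", "speed", "title"}


def build_request(text, voice, auth_headers, **optional):
    """Build (but do not send) a POST request for the synthesize endpoint."""
    if len(text) > MAX_CHARS:
        # Fail locally instead of waiting for a 400 TEXT_TOO_LONG response.
        raise ValueError(f"text is {len(text)} chars; the limit is {MAX_CHARS}")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise TypeError(f"unsupported parameters: {sorted(unknown)}")

    payload = {"text": text, "voice": voice, "ttsModel": MODEL_ID}
    payload.update(optional)
    headers = {"Content-Type": "application/json", **auth_headers}
    return urllib.request.Request(
        BASE_URL + "/api/v1/tts/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

The returned object can be sent with `urllib.request.urlopen(req)`. On a 200 response the JSON body carries `url`, `remainingCredits`, and `projectCreated` as shown above, while error statuses are raised by `urlopen` as `urllib.error.HTTPError`.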