Gemini 2.5 Pro
Google Gemini 2.5 Pro TTS — highest fidelity with advanced style instructions.
The highest-fidelity Gemini voice — richer emotional range and stronger adherence to style instructions. Best for premium narration and audiobooks where nuance matters.
| Property | Value |
|---|---|
| Endpoint | `POST /api/v1/tts/synthesize` |
| Model ID (`ttsModel`) | `gemini-2-5-pro` |
| Provider | Google Gemini (paid access) |
| Character Limit | 3,000 |
| Cost | 1,050 / 1K chars |
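As a rough illustration of the cost and character-limit rows, the sketch below estimates the charge for a single request. It assumes the listed rate is prorated linearly per character and that Python's `len()` counts characters the same way the service does; neither is confirmed on this page, and any rounding the service applies is not modeled. The name `estimate_cost` is hypothetical.

```python
# Values taken from the table above.
RATE_PER_1K_CHARS = 1050   # listed cost per 1,000 characters
CHARACTER_LIMIT = 3000     # maximum characters per request


def estimate_cost(text: str) -> float:
    """Estimate the cost of one request, before any server-side rounding.

    Assumption: cost scales linearly with character count.
    """
    if len(text) > CHARACTER_LIMIT:
        raise ValueError(
            f"text is {len(text)} chars; the limit is {CHARACTER_LIMIT}"
        )
    return len(text) * RATE_PER_1K_CHARS / 1000


if __name__ == "__main__":
    sample = "Chapter one. The world had changed."
    print(len(sample), "chars ->", estimate_cost(sample))
```

Comparing the estimate with the `remainingCredits` value returned by a real request is the reliable way to confirm how the service actually meters usage.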
Shared conventions
Auth, the voice→provider rule, rate limits, and the full error table live on the Speech API page. Below is only what's specific to Gemini 2.5 Pro.
Model-specific notes
- Follows `style_instructions` more precisely than Flash; ideal for directed, audiobook-style delivery.
- Gemini requests pass through a fair-use queue (Pro/Max get priority); when it's full you get `503 SERVER_BUSY`.
When `style_instructions` applies
Custom instructions are used only when both `style_instructions_enabled` is `true` and the instruction is at least 30 characters long. Shorter instructions are ignored.
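The rule above can be checked on the client before sending a request, so a too-short instruction is caught instead of being silently ignored. This is a minimal sketch: the helper name `style_instructions_take_effect` is hypothetical, and it assumes the 30-character threshold is measured on the raw string, which this page does not specify.

```python
MIN_STYLE_INSTRUCTION_CHARS = 30  # threshold stated in the rule above


def style_instructions_take_effect(body: dict) -> bool:
    """Return True if the request body's style instruction would be applied.

    Both conditions must hold: the flag is true and the instruction
    is at least 30 characters long.
    """
    enabled = body.get("style_instructions_enabled", False) is True
    instruction = body.get("style_instructions") or ""
    return enabled and len(instruction) >= MIN_STYLE_INSTRUCTION_CHARS


if __name__ == "__main__":
    body = {
        "style_instructions": "Speak slowly.",  # 13 chars: too short
        "style_instructions_enabled": True,
    }
    print(style_instructions_take_effect(body))
```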
Request body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `text` | string | Yes | Text to synthesize (max 3,000 chars) |
| `voice` | string | Yes | A Gemini voice ID from `GET /api/v1/tts/voices` |
| `ttsModel` | string | Yes | Must be `"gemini-2-5-pro"` |
| `style_instructions` | string | No | Free-form delivery instruction (≥ 30 chars to take effect) |
| `style_instructions_enabled` | boolean | No | Activates `style_instructions`. Default: `false` |
| `speed` | number | No | Speaking rate. Default: `1.0` |
| `title` | string | No | Title for the saved project |
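The parameters in the table can be assembled with a small helper that enforces the documented constraints before anything is sent. This is a sketch, not an official client: `build_request_body` is a hypothetical name, and turning on `style_instructions_enabled` whenever an instruction is supplied is a convenience choice of this example.

```python
from typing import Optional

MODEL_ID = "gemini-2-5-pro"
MAX_TEXT_CHARS = 3000


def build_request_body(
    text: str,
    voice: str,
    style_instructions: Optional[str] = None,
    speed: Optional[float] = None,
    title: Optional[str] = None,
) -> dict:
    """Build a JSON-serializable body for POST /api/v1/tts/synthesize."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if not voice:
        raise ValueError("voice is required")

    body = {"text": text, "voice": voice, "ttsModel": MODEL_ID}
    # Optional fields are included only when the caller sets them,
    # so the documented defaults apply otherwise.
    if style_instructions is not None:
        body["style_instructions"] = style_instructions
        body["style_instructions_enabled"] = True
    if speed is not None:
        body["speed"] = speed
    if title is not None:
        body["title"] = title
    return body
```

Passing the result through `json.dumps` yields a request body with the same shape as the sample that follows.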
{
"text": "Chapter one. The world had changed — and nobody noticed at first.",
"voice": "voice-uuid-gemini-pro",
"ttsModel": "gemini-2-5-pro",
"style_instructions": "Narrate like a classic audiobook — calm, measured, with subtle dramatic pauses.",
"style_instructions_enabled": true
}
Response (200 OK)
{
"url": "https://cdn.sonnalabs.app/sonna/api-ephemeral/tts/paid/user123/mno345.mp3",
"remainingCredits": 100195,
"projectCreated": true
}
Errors (400 TEXT_TOO_LONG, 403 PAID_ACCESS_REQUIRED, 409, 429, 503 SERVER_BUSY when the queue is full) follow the shared Speech API table.
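Because `503 SERVER_BUSY` signals a full queue, not a bad request, a client may reasonably retry it after a pause. The sketch below shows one way to do that with exponential backoff. The transport is injected as a hypothetical `send(body)` callable returning `(status_code, parsed_json)`, since the base URL and auth scheme live on the Speech API page; the attempt count and delays are illustrative choices, not documented recommendations. All other error statuses are raised immediately.

```python
import time
from typing import Callable, Tuple

SERVER_BUSY = 503  # fair-use queue is full


def synthesize_with_retry(
    send: Callable[[dict], Tuple[int, dict]],
    body: dict,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send(body), retrying only on 503 with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        status, payload = send(body)
        if status == 200:
            return payload
        is_last_attempt = attempt == max_attempts - 1
        if status != SERVER_BUSY or is_last_attempt:
            raise RuntimeError(f"synthesize failed with HTTP {status}: {payload}")
        # Wait 1x, 2x, 4x ... base_delay before the next attempt.
        sleep(base_delay * 2 ** attempt)

    raise AssertionError("unreachable")
```

Injecting `send` and `sleep` keeps the retry logic testable without network access; in real use, `send` would wrap whichever HTTP client issues the `POST /api/v1/tts/synthesize` call.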