Sonna

Gemini 2.5 Pro

Google Gemini 2.5 Pro TTS — highest fidelity with advanced style instructions.

The highest-fidelity Gemini voice — richer emotional range and stronger adherence to style instructions. Best for premium narration and audiobooks where nuance matters.

Endpoint: POST /api/v1/tts/synthesize
Model ID (ttsModel): gemini-2-5-pro
Provider: Google Gemini (paid access)
Character Limit: 3,000
Cost: 1,050 / 1K chars
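
As a rough illustration of the limit and cost figures above, the sketch below estimates the cost of a request before sending it. It rests on assumptions that this page does not confirm: that the cost unit is credits (suggested by the remainingCredits response field), that billing is prorated linearly per character, and that the API counts characters the same way Python's len() does. The helper name estimate_cost is hypothetical.

```python
CHAR_LIMIT = 3_000         # per-request character limit listed above
COST_PER_1K_CHARS = 1_050  # "1,050 / 1K chars"; unit assumed to be credits


def estimate_cost(text: str) -> float:
    """Estimate the cost of one request, ASSUMING linear per-character proration."""
    if len(text) > CHAR_LIMIT:
        # The API is documented to reject this with 400 TEXT_TOO_LONG.
        raise ValueError(f"text is {len(text)} chars; limit is {CHAR_LIMIT}")
    return len(text) * COST_PER_1K_CHARS / 1000
```

Under these assumptions a request at the full 3,000-character limit would cost about 3,150. Actual rounding and minimum charges may differ.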

Shared conventions

Auth, the voice→provider rule, rate limits, and the full error table live on the Speech API page. Below is only what's specific to Gemini 2.5 Pro.

Model-specific notes

  • Follows style_instructions more precisely than Flash — ideal for directed, audiobook-style delivery.
  • Gemini requests pass through a fair-use queue (Pro/Max get priority); when it's full you get 503 SERVER_BUSY.
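
Because a full queue returns 503 SERVER_BUSY, a client will usually want to retry. The sketch below shows one possible approach, exponential backoff. The retry count and delays are arbitrary illustrative values rather than documented recommendations, and send is a hypothetical callable that performs one request and returns (status_code, parsed_body).

```python
import time
from typing import Callable, Tuple


def synthesize_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns something other than 503, or attempts run out."""
    status, body = 0, {}
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body  # success or a non-retryable error
        if attempt < max_attempts - 1:
            # Wait 1s, 2s, 4s, ... before trying the queue again.
            sleep(base_delay * 2 ** attempt)
    return status, body  # still 503 after the final attempt
```

Only 503 is retried here. Whether other statuses such as 429 should be retried depends on the shared Speech API rules, which this page does not spell out.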

When `style_instructions` applies

Custom instructions are used only when both style_instructions_enabled is true and the instruction is at least 30 characters. Shorter instructions are ignored.
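
The rule above can be checked on the client before sending, so a short instruction is not silently dropped. This is a minimal sketch: the helper name style_instructions_apply is hypothetical, and it assumes the server counts characters the way Python's len() does, including whitespace.

```python
MIN_STYLE_CHARS = 30  # threshold stated above


def style_instructions_apply(payload: dict) -> bool:
    """Return True if the request's style_instructions would take effect."""
    enabled = payload.get("style_instructions_enabled", False) is True
    instruction = payload.get("style_instructions") or ""
    return enabled and len(instruction) >= MIN_STYLE_CHARS
```

For example, a payload with the flag enabled but the instruction "Speak slowly." (13 characters) would return False, so the instruction would be ignored.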

Request body

Parameter | Type | Required | Description
text | string | Yes | Text to synthesize (max 3,000 chars)
voice | string | Yes | A Gemini voice ID from GET /api/v1/tts/voices
ttsModel | string | Yes | Must be "gemini-2-5-pro"
style_instructions | string | No | Free-form delivery instruction (≥ 30 chars to take effect)
style_instructions_enabled | boolean | No | Activate style_instructions · Default: false
speed | number | No | Speaking rate · Default: 1.0
title | string | No | Title for the saved project

{
  "text": "Chapter one. The world had changed — and nobody noticed at first.",
  "voice": "voice-uuid-gemini-pro",
  "ttsModel": "gemini-2-5-pro",
  "style_instructions": "Narrate like a classic audiobook — calm, measured, with subtle dramatic pauses.",
  "style_instructions_enabled": true
}
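
A request like the one above could be assembled with Python's standard library as sketched below. Several details are assumptions, since auth is documented on the Speech API page rather than here: BASE_URL is a placeholder host, the Bearer token header is a guess at the auth scheme, and build_request is a hypothetical helper name.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.com"  # placeholder; substitute the real API host


def build_request(
    text: str,
    voice: str,
    api_key: str,
    style_instructions: Optional[str] = None,
    speed: Optional[float] = None,
    title: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a synthesize request for gemini-2-5-pro."""
    payload = {"text": text, "voice": voice, "ttsModel": "gemini-2-5-pro"}
    if style_instructions is not None:
        payload["style_instructions"] = style_instructions
        payload["style_instructions_enabled"] = True  # required for it to apply
    if speed is not None:
        payload["speed"] = speed
    if title is not None:
        payload["title"] = title
    return urllib.request.Request(
        BASE_URL + "/api/v1/tts/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # ASSUMED auth scheme
        },
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(req). Optional fields are omitted when unset so the server-side defaults apply.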

Response (200 OK)

{
  "url": "https://cdn.sonnalabs.app/sonna/api-ephemeral/tts/paid/user123/mno345.mp3",
  "remainingCredits": 100195,
  "projectCreated": true
}
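
On the client side, the response body can be read into a small typed structure. The sketch below uses only the three fields shown above. The SynthesisResult class and parse_response function are hypothetical names, and the code assumes all three fields are always present on a 200 response.

```python
import json
from dataclasses import dataclass


@dataclass
class SynthesisResult:
    url: str                # location of the generated audio file
    remaining_credits: int  # credit balance after this request
    project_created: bool   # whether a saved project was created


def parse_response(raw: str) -> SynthesisResult:
    """Parse a 200 OK response body into a SynthesisResult."""
    data = json.loads(raw)
    return SynthesisResult(
        url=data["url"],
        remaining_credits=data["remainingCredits"],
        project_created=data["projectCreated"],
    )
```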

Errors (400 TEXT_TOO_LONG, 403 PAID_ACCESS_REQUIRED, 409, 429, 503 SERVER_BUSY when the queue is full) follow the shared Speech API table.
