Sonna

Gemini 2.5 Flash

Google Gemini 2.5 Flash TTS provides fast, natural multilingual synthesis: a single voice speaks any language with consistent rhythm. It offers a good balance of speed and quality for everyday narration.

Endpoint: POST /api/v1/tts/synthesize
Model ID (ttsModel): gemini-2-5-flash
Provider: Google Gemini (paid access)
Character Limit: 3,000
Cost: 700 / 1K chars
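
As a rough illustration of the limit and cost figures above, the following sketch estimates the charge for a piece of text. It assumes the cost is counted in credits and scales linearly with character count; the page does not state the unit or how partial thousands are rounded, so treat the result as an estimate only. The names `estimate_cost`, `COST_PER_1K_CHARS`, and `CHARACTER_LIMIT` are hypothetical.

```python
COST_PER_1K_CHARS = 700  # from the table above; unit assumed to be credits
CHARACTER_LIMIT = 3000   # from the table above


def estimate_cost(text: str) -> float:
    """Linear cost estimate; real billing may round differently."""
    if len(text) > CHARACTER_LIMIT:
        raise ValueError(
            f"text is {len(text)} chars; the limit is {CHARACTER_LIMIT}"
        )
    return len(text) * COST_PER_1K_CHARS / 1000
```

Under these assumptions, a request at the full 3,000-character limit would be estimated at 2,100.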

Shared conventions

Auth, the voice→provider rule, rate limits, and the full error table live on the Speech API page. Below is only what's specific to Gemini 2.5 Flash.

Model-specific notes

  • Steer delivery with style_instructions (free-form, e.g. "warm, encouraging tone").
  • Gemini requests pass through a fair-use queue (Pro/Max get priority); when it's full you get 503 SERVER_BUSY.

When `style_instructions` applies

Custom instructions are used only when both style_instructions_enabled is true and the instruction is at least 30 characters. Shorter instructions are ignored and the voice's default delivery is used.
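
The rule above can be mirrored client-side to catch instructions that would be silently ignored. This is a minimal sketch, not the server's implementation: the helper name `style_takes_effect` is hypothetical, and it assumes the 30-character threshold is applied to the raw string length (whether the server trims whitespace first is not stated here).

```python
from typing import Optional

MIN_STYLE_LENGTH = 30  # threshold stated above


def style_takes_effect(
    style_instructions: Optional[str],
    style_instructions_enabled: bool = False,
) -> bool:
    """Return True when a custom instruction should be applied.

    Assumption: length is measured on the raw string, without trimming.
    """
    if not style_instructions_enabled or style_instructions is None:
        return False
    return len(style_instructions) >= MIN_STYLE_LENGTH
```

Note that a short instruction such as "warm, encouraging tone" (22 characters) falls under the threshold, so by this rule it would need to be extended, for example to "warm, encouraging tone with a relaxed pace", before it takes effect.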

Request body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to synthesize (max 3,000 chars) |
| voice | string | Yes | A Gemini voice ID from GET /api/v1/tts/voices |
| ttsModel | string | Yes | Must be "gemini-2-5-flash" |
| style_instructions | string | No | Free-form delivery instruction (≥ 30 chars to take effect) |
| style_instructions_enabled | boolean | No | Activate style_instructions · Default: false |
| speed | number | No | Speaking rate · Default: 1.0 |
| title | string | No | Title for the saved project |

{
  "text": "Your audio summary is ready. Tap to listen.",
  "voice": "voice-uuid-gemini",
  "ttsModel": "gemini-2-5-flash"
}
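
One way to assemble the request body above in Python, using only the standard library, might look like the sketch below. Several details are assumptions rather than documented facts: the base URL is a placeholder, the `Authorization: Bearer` header is a guess (authentication is described on the Speech API page), and `build_synthesize_request` is a hypothetical helper name. The function only builds the request; it does not send it.

```python
import json
import urllib.request

CHARACTER_LIMIT = 3000  # documented limit for this model


def build_synthesize_request(base_url, api_key, text, voice, **optional):
    """Return an unsent Request for POST /api/v1/tts/synthesize."""
    if len(text) > CHARACTER_LIMIT:
        raise ValueError(
            f"text is {len(text)} chars; the limit is {CHARACTER_LIMIT}"
        )
    body = {"text": text, "voice": voice, "ttsModel": "gemini-2-5-flash"}
    # Optional fields from the parameter table: style_instructions,
    # style_instructions_enabled, speed, title.
    body.update(optional)
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/tts/synthesize",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; check the Speech API page.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Usage (placeholder host and key; uncomment the last lines to send):
req = build_synthesize_request(
    "https://api.example.invalid",
    "YOUR_API_KEY",
    "Your audio summary is ready. Tap to listen.",
    "voice-uuid-gemini",
)
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```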

Response (200 OK)

{
  "url": "https://cdn.sonnalabs.app/sonna/api-ephemeral/tts/paid/user123/jkl012.mp3",
  "remainingCredits": 101230,
  "projectCreated": true
}

Errors (400 TEXT_TOO_LONG, 403 PAID_ACCESS_REQUIRED, 409, 429, 503 SERVER_BUSY when the queue is full) follow the shared Speech API table.
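
Because 503 SERVER_BUSY signals a full queue rather than a bad request, a client may want to retry it with increasing delays while failing fast on the other errors. The sketch below is one possible approach under stated assumptions: it treats 429 and 503 as retryable (the retry policy for 429 is not described on this page), uses exponential backoff with arbitrary defaults, and takes a hypothetical `send` callable that returns a `(status_code, parsed_body)` tuple so that it stays independent of any HTTP library.

```python
import time

RETRYABLE_STATUSES = {429, 503}  # assumption: both are safe to retry


def synthesize_with_retry(send, payload, max_attempts=4,
                          base_delay=1.0, sleep=time.sleep):
    """Call send(payload) until it returns 200 or retries are exhausted.

    `send` is a caller-supplied function returning (status, body).
    """
    for attempt in range(max_attempts):
        status, body = send(payload)
        if status == 200:
            return body
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or last_attempt:
            raise RuntimeError(f"synthesize failed with HTTP {status}")
        # Exponential backoff: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
```

On success the returned body is the parsed 200 response, so the audio location can be read from its `url` field. Errors such as 400 TEXT_TOO_LONG and 403 PAID_ACCESS_REQUIRED are raised immediately, since retrying the same request would not change the outcome.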
