Gemini 2.5 Flash

Fast, natural multilingual synthesis from Gemini: a single voice speaks any language with consistent rhythm. A good balance of speed and quality for everyday narration.


Endpoint	`POST /api/v1/tts/synthesize`
Model ID (`ttsModel`)	`gemini-2-5-flash`
Provider	Google Gemini (paid access)
Character Limit	3,000
Cost	700 credits / 1K chars
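
As a rough illustration of the limit and cost figures above, the sketch below estimates what a request might cost before it is sent. It assumes the cost figure is in credits (the unit reported by `remainingCredits`) and that billing scales linearly per character and rounds up; the actual rounding rule is not stated on this page. The function name `estimate_credits` is hypothetical.

```python
COST_PER_1K_CHARS = 700  # from the table above; unit assumed to be credits
CHARACTER_LIMIT = 3000   # per-request character limit from the table above


def estimate_credits(text: str) -> int:
    """Estimate the cost of one request.

    Assumption: linear per-character billing, rounded up to a whole credit.
    """
    n = len(text)
    if n > CHARACTER_LIMIT:
        raise ValueError(f"text is {n} chars; the limit is {CHARACTER_LIMIT}")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (n * COST_PER_1K_CHARS + 999) // 1000
```

Under these assumptions, a request at the full 3,000-character limit would come to 2,100 credits.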

Shared conventions

Auth, the voice→provider rule, rate limits, and the full error table live on the Speech API page. Below is only what's specific to Gemini 2.5 Flash.

Model-specific notes

Steer delivery with `style_instructions` (free-form, e.g. "Speak in a warm, encouraging tone.").
Gemini requests pass through a fair-use queue (Pro/Max get priority); when the queue is full, the request fails with `503 SERVER_BUSY`.

When `style_instructions` applies

Custom instructions are used only when both `style_instructions_enabled` is `true` and the instruction is at least 30 characters long. Shorter instructions are ignored, and the voice's default delivery is used.
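
The rule above can be mirrored client-side to catch instructions that would be silently ignored. This is a minimal sketch, not the server's implementation: the helper name `style_instructions_apply` is hypothetical, and it assumes the length is counted on the raw string (whether the server trims whitespace first is not stated here).

```python
MIN_STYLE_INSTRUCTION_CHARS = 30  # threshold stated in the rule above


def style_instructions_apply(payload: dict) -> bool:
    """Return True if the request's style instructions should take effect."""
    # The flag defaults to false when omitted.
    enabled = payload.get("style_instructions_enabled", False) is True
    instruction = payload.get("style_instructions") or ""
    return enabled and len(instruction) >= MIN_STYLE_INSTRUCTION_CHARS
```

A client could log a warning when this returns `False` for a payload that includes `style_instructions`, since the request would still succeed but fall back to the voice's default delivery.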

Request body

Parameter	Type	Required	Description
`text`	string	Yes	Text to synthesize (max 3,000 chars)
`voice`	string	Yes	A Gemini voice ID from `GET /api/v1/tts/voices`
`ttsModel`	string	Yes	Must be `"gemini-2-5-flash"`
`style_instructions`	string	No	Free-form delivery instruction (≥ 30 chars to take effect)
`style_instructions_enabled`	boolean	No	Activate `style_instructions` · Default: `false`
`speed`	number	No	Speaking rate · Default: `1.0`
`title`	string	No	Title for the saved project

{
  "text": "Your audio summary is ready. Tap to listen.",
  "voice": "voice-uuid-gemini",
  "ttsModel": "gemini-2-5-flash"
}
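
For illustration, here is one way the minimal request above might be assembled in Python using only the standard library. The base URL `https://api.example.com` is a placeholder, and `build_request` is a hypothetical helper; authentication headers are described on the Speech API page, so the caller passes them in rather than this sketch guessing at them.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder; substitute the real API host
MAX_TEXT_CHARS = 3000                 # limit from the request body table


def build_request(text: str, voice: str, auth_headers: dict,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a synthesize request for Gemini 2.5 Flash."""
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError("text exceeds the 3,000-character limit")
    body = {"text": text, "voice": voice, "ttsModel": "gemini-2-5-flash"}
    return urllib.request.Request(
        base_url + "/api/v1/tts/synthesize",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", **auth_headers},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; checking the length locally first avoids a round trip that would end in `400 TEXT_TOO_LONG`.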

Response (200 OK)

{
  "url": "https://cdn.sonnalabs.app/sonna/api-ephemeral/tts/paid/user123/jkl012.mp3",
  "remainingCredits": 101230,
  "projectCreated": true
}

Errors (400 TEXT_TOO_LONG, 403 PAID_ACCESS_REQUIRED, 409, 429, 503 SERVER_BUSY when the queue is full) follow the shared Speech API table.
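
Because `503 SERVER_BUSY` signals a full queue rather than a bad request, a client may reasonably retry it after a pause. The sketch below shows one possible approach using exponential backoff; the retry counts and delays are illustrative choices, not documented recommendations. `send` is a hypothetical callable supplied by the caller that performs one request and returns a `(status, body)` tuple.

```python
import time


def synthesize_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-503 status or attempts run out.

    The delay doubles after each busy response: base_delay, 2x, 4x, ...
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body  # success or a non-retryable error
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body  # still busy after the final attempt
```

Only 503 is retried here. The other statuses listed above are returned to the caller unchanged, since their handling is defined in the shared Speech API table.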
