Text to Speech

AI text-to-speech: turn text into natural AI voiceovers with ElevenLabs, Google Cloud, and Gemini voices in 60+ languages.

Sonna's Text to Speech (TTS) studio lets you convert scripts, articles, and book chapters into high-quality spoken audio using natural-sounding AI voices.

Supported Narration Models

We integrate top-tier speech synthesis engines to provide a range of languages, latency profiles, and accents:

Provider	Model Name	Credit Cost (per 1K chars)	Max Script Length	Key Features
ElevenLabs	`Eleven v3`	2,100	5,000 chars	Maximum expression and emotional range.
ElevenLabs	`Multilingual v2`	2,100	10,000 chars	Exceptional pronunciation across 29+ languages.
ElevenLabs	`Flash v2.5`	1,050	40,000 chars	Ultra-fast generation, ideal for long scripts.
Google Cloud	`Neural2`	500	3,000 chars	Standard narration, clean and clear.
Google Cloud	`WaveNet`	500	3,000 chars	Balanced narration quality.
Gemini	`2.5 Flash`	700	3,000 chars	Fast, conversational speech tone.
Gemini	`2.5 Pro`	1,050	3,000 chars	Context-aware, natural emphasis.

Rates are shown per 1,000 characters for readability, but billing is per character (pro-rated). A 500-character script on Eleven v3 costs 1,050 credits, not 2,100. Minimum charge is 1 character.
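
As a rough illustration of the pro-rated billing described above, the sketch below estimates the credit cost of a single request. The helper name `estimate_credits` and the `RATE_PER_1K` dictionary are hypothetical and not part of any Sonna API. The sketch also assumes that "minimum charge is 1 character" means at least one character is always billed, and it does not round fractional credits, since the rounding rule is not stated here.

```python
# Credits per 1,000 characters, copied from the model table above.
RATE_PER_1K = {
    "Eleven v3": 2100,
    "Multilingual v2": 2100,
    "Flash v2.5": 1050,
    "Neural2": 500,
    "WaveNet": 500,
    "2.5 Flash": 700,
    "2.5 Pro": 1050,
}


def estimate_credits(model: str, char_count: int) -> float:
    """Hypothetical helper: pro-rated credit estimate for one request."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    # Assumption: the 1-character minimum means at least one character is billed.
    billable_chars = max(char_count, 1)
    # Billing is per character, so scale the per-1K rate down linearly.
    return billable_chars * RATE_PER_1K[model] / 1000


print(estimate_credits("Eleven v3", 500))  # 1050.0
```

The same 500-character script on `Neural2` would come to 250 credits under this estimate.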

Character Limits & Free Tier Restrictions

To protect backend resources, the per-request character limit depends on your plan:

Free Plan Users: Limited to 1,000 characters per request, regardless of the model chosen.
Pro & Max Plan Users: Can generate up to the maximum character limits listed in the model table above (e.g., up to 40,000 characters using Flash v2.5).
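
The plan rules above can be sketched as a small lookup that returns the effective per-request limit for a plan and model. The function name `max_chars_per_request`, the plan identifiers `"free"`, `"pro"`, and `"max"`, and the dictionary layout are all assumptions made for illustration; they are not taken from a Sonna API.

```python
# Maximum script length per model, copied from the model table above.
MODEL_MAX_CHARS = {
    "Eleven v3": 5000,
    "Multilingual v2": 10000,
    "Flash v2.5": 40000,
    "Neural2": 3000,
    "WaveNet": 3000,
    "2.5 Flash": 3000,
    "2.5 Pro": 3000,
}

# Free plan cap per request, as stated above.
FREE_PLAN_CAP = 1000


def max_chars_per_request(plan: str, model: str) -> int:
    """Hypothetical helper: effective character limit for one request."""
    model_cap = MODEL_MAX_CHARS[model]
    if plan == "free":
        # The free cap applies regardless of the model chosen.
        return min(FREE_PLAN_CAP, model_cap)
    if plan in ("pro", "max"):
        # Paid plans get the full per-model limit.
        return model_cap
    raise ValueError(f"unknown plan: {plan!r}")


print(max_chars_per_request("free", "Flash v2.5"))  # 1000
print(max_chars_per_request("pro", "Flash v2.5"))   # 40000
```

A client could run this check before submitting a script and split longer text into several requests when it exceeds the returned limit.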

Voice Library

Browse Sonna's curated library of ready-to-use voices, filterable by language, accent, gender, and use case (e.g. narrator, storytelling, energetic, professional). The library spans all three providers:

ElevenLabs — the most expressive voices, including premium professional voices.
Google Cloud — Neural2 and WaveNet, clean and reliable across many languages.
Gemini — natural, conversational multilingual voices.

Pick a voice from the dropdown in the Text to Speech studio, then choose a model that fits (see the table above). Save the ones you use most to your favorites.
