Overview

How the Sonna Text-to-Speech API is organized — the endpoint, voices, and conventions.

The Sonna API is a Text-to-Speech service. You send text and a voice, and the response comes back synchronously with the URL of the finished audio: no jobs, no polling.

| Entry point | Produces | Mode |
| --- | --- | --- |
| `POST /api/v1/tts/synthesize` | Speech (Text-to-Speech) | Synchronous: returns the audio URL directly |

Base URL for every request:

https://api.sonnalabs.app

App vs API

Image, Video, and Music generation are available inside the Sonna app/web, but are not exposed through the public developer API. The API surface is Text-to-Speech only.

The endpoint

Speech — `POST /api/v1/tts/synthesize`

Single-speaker synthesis. Pick a voice and a `ttsModel`. Two variants share the same auth and conventions:

- `POST /api/v1/tts/synthesize-dialogue`: multi-speaker dialogue (ElevenLabs v3).
- `POST /api/v1/tts/enhance`: auto-inserts expressive audio tags into your text.

The provider (ElevenLabs / Gemini / Google) is resolved from the voice, not the model. See Text-to-Speech Models for per-model details and the shared Speech API page for conventions.
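As a rough illustration of the single-speaker call described above, the sketch below builds (but does not send) a request with Python's standard library. The path, base URL, Bearer header, and the `ttsModel` name come from this page; the `text` and `voice` field names, the example values, and the helper name `build_synthesize_request` are assumptions, so check the Speech API page for the real request schema.

```python
import json
import urllib.request

BASE_URL = "https://api.sonnalabs.app"  # base URL given on this page


def build_synthesize_request(
    api_key: str, text: str, voice: str, tts_model: str
) -> urllib.request.Request:
    """Build, without sending, a single-speaker synthesis request."""
    # ASSUMPTION: "text" and "voice" are guessed field names.
    # "ttsModel" is the name this page uses for the model selector.
    payload = {"text": text, "voice": voice, "ttsModel": tts_model}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/tts/synthesize",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder voice and model identifiers, not real Sonna values.
    req = build_synthesize_request(
        "sona_sk_your_api_key_here", "Hello there", "example-voice", "example-model"
    )
    print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. Because the call is synchronous, the response should carry the audio URL directly, though the exact response fields are not described on this page.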

Supporting endpoints

| Endpoint | Purpose | Section |
| --- | --- | --- |
| `GET /api/v1/tts/voices` | List available voices | Models & Voices |
| `GET /api/v1/tts/models` | List Text-to-Speech models | Models & Voices |
| `GET /api/v1/user/credits` | Credit balance & plan | Models & Voices |
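The supporting endpoints are plain `GET` calls against the same base URL, so one small helper can cover all three. The sketch below is a minimal example under that assumption: the paths are copied from the table, while the short names (`voices`, `models`, `credits`) and the helper `build_get_request` are hypothetical conveniences. Response shapes are not described on this page, so none are assumed here.

```python
import urllib.request

BASE_URL = "https://api.sonnalabs.app"  # base URL given on this page

# Paths copied from the supporting-endpoints table; the dictionary keys
# are illustrative nicknames, not part of the API.
SUPPORTING_ENDPOINTS = {
    "voices": "/api/v1/tts/voices",
    "models": "/api/v1/tts/models",
    "credits": "/api/v1/user/credits",
}


def build_get_request(name: str, api_key: str) -> urllib.request.Request:
    """Build, without sending, an authenticated GET request."""
    path = SUPPORTING_ENDPOINTS[name]  # raises KeyError for unknown names
    return urllib.request.Request(
        url=BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

For example, `build_get_request("voices", key)` prepares the voice listing call. Listing voices first is a natural step, since the provider is resolved from the voice you pick.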

Authentication

Every request needs a developer API key (sona_sk_...):

Authorization: Bearer sona_sk_your_api_key_here

API keys are available to paying users, and Speech requests made with a key receive a 10% credit discount. See Authentication.
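To make the header format and the discount concrete, here is a small sketch. The `sona_sk_` prefix and the 10% figure come from this page. The helper names, the client-side prefix check, and the idea of a known "base" credit cost are assumptions, and the page does not say how Sonna rounds discounted amounts, so no rounding is applied.

```python
from decimal import Decimal

KEY_PREFIX = "sona_sk_"         # prefix shown in the Authorization example
API_DISCOUNT = Decimal("0.10")  # 10% credit discount on Speech


def auth_header(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a developer API key."""
    # ASSUMPTION: this local prefix check is only a convenience;
    # the server remains the authority on key validity.
    if not api_key.startswith(KEY_PREFIX):
        raise ValueError(f"expected a key starting with {KEY_PREFIX!r}")
    return {"Authorization": f"Bearer {api_key}"}


def discounted_credits(base_credits: Decimal) -> Decimal:
    """Apply the 10% discount to a hypothetical base credit cost."""
    return base_credits * (Decimal("1") - API_DISCOUNT)
```

Under these assumptions, a Speech request that would otherwise cost 100 credits costs 90 when made with an API key.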

Getting Started

Base URL, auth, versioning, and response conventions.

Speech API

Shared conventions: voice selection, rate limits, errors, billing.