Speech API

Shared conventions for Text-to-Speech: endpoint, auth, voice selection, rate limits, and errors.

All Text-to-Speech models share one endpoint and the same conventions. This page documents them once; each model page then covers only what is specific to that model (limits, cost, and its own parameters).

Endpoint

POST https://api.sonnalabs.app/api/v1/tts/synthesize
Authorization: Bearer sona_sk_your_api_key_here
Content-Type: application/json

Select the model with the ttsModel field and the voice with the voice field of the JSON request body. Synthesis is synchronous: the response returns the finished audio URL directly (no job_id, no polling).

{
  "url": "https://cdn.sonnalabs.app/sonna/api-ephemeral/tts/paid/user123/abc123.mp3",
  "remainingCredits": 99580,
  "projectCreated": true
}
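As a minimal sketch of the call above, the following Python builds the request and reads the two response fields shown in the sample. The ttsModel and voice fields and the headers come from this page; the text field name is inferred from the error table rather than stated outright, and the function names and the timeout value are hypothetical.

```python
import json
import urllib.request

SYNTHESIZE_URL = "https://api.sonnalabs.app/api/v1/tts/synthesize"


def build_synthesize_request(api_key, text, voice, tts_model):
    """Build (but do not send) the POST request described above."""
    # Assumption: the input text is sent in a field named "text".
    body = json.dumps({"text": text, "voice": voice, "ttsModel": tts_model})
    return urllib.request.Request(
        SYNTHESIZE_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_synthesize_response(raw):
    """Return (audio_url, remaining_credits) from the JSON response body."""
    payload = json.loads(raw)
    return payload["url"], payload["remainingCredits"]


def synthesize(api_key, text, voice, tts_model, timeout=60):
    """Send the request; synthesis is synchronous, so there is nothing to poll."""
    request = build_synthesize_request(api_key, text, voice, tts_model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_synthesize_response(response.read())
```

Calling synthesize(api_key, "Hello", voice_id, model_id) with a voice ID from GET /api/v1/tts/voices and a model identifier from the relevant model page would return the audio URL and the remaining credit balance.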

Output is temporary — download it

Files generated through the API are stored on a temporary prefix (sonna/api-ephemeral/…) and are automatically deleted after 24 hours. The API is a generation service, not file hosting — download the url and store it on your own infrastructure. Files created in the Sonna app/web stay in your Library; API output does not.
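Because the file disappears after 24 hours, a client would normally copy it right after synthesis. The sketch below is one way to do that with the Python standard library; the function name, the destination directory, and the choice to reuse the last URL path segment as the file name are assumptions, not part of the API.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_audio(url, dest_dir, timeout=60):
    """Copy the generated file into dest_dir and return the local path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Reuse the last path segment of the URL as the local file name.
    target = dest_dir / Path(urlparse(url).path).name
    with urllib.request.urlopen(url, timeout=timeout) as response, open(target, "wb") as out:
        # Stream in chunks so large audio files are not held in memory.
        shutil.copyfileobj(response, out)
    return target
```

From there the file can be moved to whatever storage you control, such as an object store or a local media directory.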

The voice determines the provider

The provider (ElevenLabs / Gemini / Google) is resolved from the voice, not from ttsModel. Pick a voice whose provider matches your chosen model; list the available voices with GET /api/v1/tts/voices. If you send a voice that belongs to a different provider than ttsModel, the voice's provider serves the request and ttsModel is ignored.
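To avoid a silent provider switch, a client can filter the voice list before sending a request. This page does not document the response shape of GET /api/v1/tts/voices, so the sketch below assumes a list of objects with hypothetical "id" and "provider" keys; adjust the key names to the real payload.

```python
def voices_for_provider(voices, provider):
    """Return the voice entries whose provider matches, ignoring case.

    Assumption: each entry is a dict with a "provider" key. The actual
    field names returned by GET /api/v1/tts/voices may differ.
    """
    wanted = provider.strip().lower()
    return [v for v in voices if str(v.get("provider", "")).lower() == wanted]


# Hypothetical sample data, for illustration only.
sample = [
    {"id": "voice-a", "provider": "ElevenLabs"},
    {"id": "voice-b", "provider": "Google"},
]
matching = voices_for_provider(sample, "google")
```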

Access

Who can use which provider

Google Cloud voices (Neural2 / Wavenet) work on every plan, including Free. ElevenLabs and Gemini require an active Pro/Max subscription or PAYG credits — Free-tier requests for them return 403 PAID_ACCESS_REQUIRED.

Rate limits & concurrency

  • 30 requests/minute per user.
  • One synthesis at a time per account — a second concurrent call returns 409 DUPLICATE_REQUEST.
  • Credits are auto-refunded on any failure.
  • Gemini requests also pass through a fair-use queue (Pro/Max get priority); when it's full you get 503 SERVER_BUSY.

See Rate Limits for details.
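The first two limits can also be respected on the client side so that most 429 and 409 responses never happen. The class below is a sketch, not an official client: it serializes calls within one process and keeps at most 30 of them in any 60-second window. The class name and the injectable clock and sleep parameters are hypothetical, and it cannot coordinate several processes that share one account.

```python
import threading
import time
from collections import deque


class SynthesisGate:
    """Run one call at a time and cap calls per sliding window."""

    def __init__(self, max_requests=30, window_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # start times of recent calls
        self._lock = threading.Lock()

    def _wait_for_slot(self):
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # Sleep until the oldest call leaves the window.
            self._sleep(self._window - (now - self._sent[0]))

    def run(self, call):
        # Holding the lock for the whole call keeps syntheses sequential.
        with self._lock:
            self._wait_for_slot()
            return call()
```

A shared gate = SynthesisGate() can then wrap each request, for example gate.run(lambda: send_request()), where send_request is whatever function performs the POST.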

Errors

Status | Code                 | Reason
400    | TEXT_TOO_LONG        | Text exceeds the model's limit (or your plan cap)
400    |                      | text or voice missing, or the voice ID is invalid
402    |                      | Insufficient credits
403    | PAID_ACCESS_REQUIRED | Free-tier account using an ElevenLabs or Gemini voice
409    | DUPLICATE_REQUEST    | Another synthesis is already in progress for your account
429    |                      | Rate limit exceeded
503    | SERVER_BUSY          | Gemini synthesis queue is full; retry shortly
503    | PROVIDER_BUSY        | Provider temporarily over capacity; retry after Retry-After
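One way to act on this table is a small policy function that decides whether a failed call is worth retrying. The sketch below treats 409, 429, and 503 as transient and everything else as final. Honoring Retry-After follows the PROVIDER_BUSY row; the exponential backoff, its base and cap, and the function name are assumptions rather than documented behavior.

```python
RETRYABLE_STATUSES = {409, 429, 503}


def retry_delay(status, attempt, retry_after=None, base=1.0, cap=30.0):
    """Return seconds to wait before retrying, or None if retrying is pointless.

    attempt is zero-based; retry_after is the raw Retry-After header, if any.
    """
    if status not in RETRYABLE_STATUSES:
        # 400, 402, and 403 need a changed request, more credits, or a plan change.
        return None
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Header may be an HTTP date; fall back to backoff.
    # Assumed policy: exponential backoff, capped.
    return min(cap, base * 2 ** attempt)
```

For example, retry_delay(503, 0, "7") yields 7.0 seconds, while retry_delay(402, 0) yields None because retrying cannot fix insufficient credits.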

Billing

Authenticating with an API key applies a 10% credit discount on speech. Speech is billed per character (pro-rated). Per-model rates are on each model page and in Credits & Pricing.
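As a rough illustration of the arithmetic, the sketch below estimates the charge for a piece of text. The 10% discount and per-character billing come from the paragraph above; the credits_per_char rate is a placeholder to be filled in from the model page, and counting characters with len() and leaving the result unrounded are assumptions about details this page does not specify.

```python
API_KEY_DISCOUNT = 0.10  # 10% credit discount for API-key requests


def estimate_speech_credits(text, credits_per_char, api_key_auth=True):
    """Estimate credits for synthesizing text at a given per-character rate.

    Assumption: one billed character equals one Python code point.
    """
    base = len(text) * credits_per_char
    return base * (1 - API_KEY_DISCOUNT) if api_key_auth else base
```

With a hypothetical rate of 0.5 credits per character, 1,000 characters would cost about 500 credits before the discount and about 450 with it.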

Models

Provider      | Models                                 | Access
ElevenLabs    | Eleven v3, Multilingual v2, Flash v2.5 | Paid
Google Gemini | 2.5 Flash, 2.5 Pro                     | Paid
Google Cloud  | Neural2, Wavenet                       | Free + Paid

Also available: Multi-Speaker Dialogue and Enhance Text.
