
Enhance Text (Audio Tags)

Auto-insert Eleven v3 audio tags into your text with an LLM, before synthesis.

Enhance rewrites your text to add Eleven v3 audio tags ([laughs], [whispers], [sighs], …) so narration sounds more expressive. It does not produce audio — it returns enhanced text that you then send to Synthesize with ttsModel: "eleven-v3".

What it is (and isn't)

Enhance is Sonna's own implementation, powered by Gemini Flash; ElevenLabs has no "enhance" API. It only inserts tags and never adds, removes, or changes your words. It is free (no credits) and synchronous.

Typical flow: /enhance → take the returned text → /synthesize with eleven-v3.


Enhance a single text

Endpoint

POST https://api.sonnalabs.app/api/v1/tts/enhance

Request

POST https://api.sonnalabs.app/api/v1/tts/enhance
Authorization: Bearer sona_sk_your_api_key_here
Content-Type: application/json

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to enhance (max 2,000 chars). |
| voiceId | string | No | A Sonna voice ID, used so the LLM picks tags that fit the voice's traits. |

{
  "text": "Are you serious? I can't believe you did that!",
  "voiceId": "db4e815d-00aa-43e6-99cf-0d9b4db9a07a"
}

Response (200 OK)

{
  "text": "[appalled] Are you serious? [sighs] I can't believe you did that!",
  "original": "Are you serious? I can't believe you did that!"
}
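
As a rough illustration of the request above, the following Python sketch builds and sends the call using only the standard library. The helper names (`build_enhance_request`, `enhance`), the timeout, and the client-side length check are assumptions made for this example, not part of the Sonna API. In particular, the server may count characters differently from Python's `len`.

```python
import json
import urllib.request

ENHANCE_URL = "https://api.sonnalabs.app/api/v1/tts/enhance"
MAX_CHARS = 2000  # documented Enhance limit; len() is an approximation of how the server counts


def build_enhance_request(api_key, text, voice_id=None):
    """Build (but do not send) a POST request for the enhance endpoint."""
    if not text:
        raise ValueError("text must be non-empty")
    if len(text) > MAX_CHARS:
        raise ValueError(f"text exceeds {MAX_CHARS} characters")
    payload = {"text": text}
    if voice_id is not None:
        payload["voiceId"] = voice_id  # optional parameter
    return urllib.request.Request(
        ENHANCE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def enhance(api_key, text, voice_id=None, timeout=30):
    """Send the request and return the decoded JSON response body."""
    req = build_enhance_request(api_key, text, voice_id)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `enhance(api_key, "Are you serious?")` would then be expected to return a dictionary with the `text` and `original` keys shown in the response above.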

Fail-soft

If enhancement fails, the endpoint still returns 200 with the original text and "fallback": true, so your pipeline never breaks. Send the returned text to synthesis either way.
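
One way to handle the fail-soft behaviour on the client is sketched below: read `text` regardless, and record whether the fallback was used so it can be logged. The function names are hypothetical, and the `text` key in the Synthesize payload is an assumption, since only `ttsModel` is named on this page.

```python
def text_for_synthesis(enhance_response):
    """Return (text, used_fallback) from a decoded Enhance response body."""
    # "fallback": true is only present when enhancement failed.
    used_fallback = enhance_response.get("fallback") is True
    return enhance_response["text"], used_fallback


def build_synthesize_payload(enhance_response):
    """Build a payload for the follow-up Synthesize call."""
    text, _ = text_for_synthesis(enhance_response)
    # "ttsModel" comes from this page; the "text" key is an assumed field name.
    return {"text": text, "ttsModel": "eleven-v3"}
```

Because the returned text is usable in both cases, the fallback flag is only needed for monitoring, not for branching the pipeline.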


Enhance a multi-speaker dialogue

This endpoint enhances every turn in a single LLM round-trip, so the tags reflect the conversational context. Pair it with Multi-Speaker Dialogue.

Endpoint

POST https://api.sonnalabs.app/api/v1/tts/enhance-dialogue

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| turns | object[] | Yes | Array of { id, text, voiceId? }. Total text across all turns ≤ 2,000 chars. id is echoed back so you can match turns. |

{
  "turns": [
    {
      "id": "1",
      "text": "Have you tried the new model?",
      "voiceId": "voice-a"
    },
    {
      "id": "2",
      "text": "Just got it! The clarity is amazing.",
      "voiceId": "voice-b"
    }
  ]
}

Response (200 OK)

{
  "turns": [
    { "id": "1", "text": "[excited] Have you tried the new model?" },
    { "id": "2", "text": "[amazed] Just got it! The clarity is amazing." }
  ]
}
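
Since the response turns carry only `id` and `text`, a client typically needs to join them back to the original turns to recover each `voiceId`. The sketch below shows one possible way to do that, along with a pre-flight check of the combined length. Both helper names are hypothetical, and keeping the original text for any `id` missing from the response is a defensive assumption rather than documented behaviour.

```python
MAX_TOTAL_CHARS = 2000  # documented limit on total text across all turns


def check_dialogue_length(turns):
    """Return the combined text length, raising if it exceeds the limit."""
    total = sum(len(turn["text"]) for turn in turns)
    if total > MAX_TOTAL_CHARS:
        raise ValueError(f"dialogue has {total} chars, limit is {MAX_TOTAL_CHARS}")
    return total


def merge_enhanced_turns(request_turns, response_turns):
    """Copy enhanced text onto the original turns, matched by id."""
    enhanced = {turn["id"]: turn["text"] for turn in response_turns}
    merged = []
    for turn in request_turns:
        # Assumption: fall back to the original text if an id is missing.
        merged.append({**turn, "text": enhanced.get(turn["id"], turn["text"])})
    return merged
```

The merged list keeps the request order and each turn's `voiceId`, which is the shape a multi-speaker synthesis call would most likely need.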

Errors

| Status | Code | Reason |
| --- | --- | --- |
| 400 | | text (or turns) missing/empty |
| 400 | TEXT_TOO_LONG | Text exceeds the 2,000-character Enhance limit |
| 401 | | Missing or invalid API key |
| 429 | | Rate limit exceeded (shares the 30/min speech limit) |
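
A client might map these statuses to actions and retry only on 429, roughly as in the sketch below. The action labels, the retry count, and the backoff delays are illustrative assumptions. The layout of the error response body is not described here, so the error code is passed in by the caller instead of being parsed.

```python
import time
import urllib.error


def classify_error(status, code=None):
    """Map a documented status (and optional error code) to a client action."""
    if status == 400 and code == "TEXT_TOO_LONG":
        return "shorten_text"
    if status == 400:
        return "fix_request"
    if status == 401:
        return "fix_credentials"
    if status == 429:
        return "retry_later"
    return "unknown"


def call_with_retry(send, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call send(), retrying with exponential backoff on HTTP 429 only."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            retryable = classify_error(err.code) == "retry_later"
            if not retryable or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 2s, then 4s by default
```

Here `send` is any zero-argument callable that performs the HTTP request, for example a lambda wrapping `urllib.request.urlopen`. Because the rate limit is shared with speech requests, longer delays may be appropriate in practice.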

No credits charged

Enhance does not deduct credits. Only the subsequent synthesis call costs credits.
