Enhance Text (Audio Tags)

Enhance rewrites your text to add Eleven v3 audio tags ([laughs], [whispers], [sighs], …) so narration sounds more expressive. It does not produce audio — it returns enhanced text that you then send to Synthesize with ttsModel: "eleven-v3".

What it is (and isn't)

Enhance is powered by Gemini Flash (Sonna's own implementation — there is no ElevenLabs "enhance" API). It only inserts tags; it never adds, removes, or changes your words. It is free (no credits) and synchronous.

Typical flow: /enhance → take the returned text → /synthesize with eleven-v3.

Enhance a single text

Endpoint

POST https://api.sonnalabs.app/api/v1/tts/enhance

Request

POST https://api.sonnalabs.app/api/v1/tts/enhance
Authorization: Bearer sona_sk_your_api_key_here
Content-Type: application/json

Parameter	Type	Required	Description
`text`	string	Yes	Text to enhance (max 2,000 chars).
`voiceId`	string	No	A Sonna voice ID — used so the LLM picks tags that fit the voice's traits.

{
  "text": "Are you serious? I can't believe you did that!",
  "voiceId": "db4e815d-00aa-43e6-99cf-0d9b4db9a07a"
}

Response (200 OK)

{
  "text": "[appalled] Are you serious? [sighs] I can't believe you did that!",
  "original": "Are you serious? I can't believe you did that!"
}
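The request above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_enhance_request` and `enhance` are hypothetical, and the local length check assumes the 2,000-character limit counts Python string characters, which the page does not specify.

```python
import json
import urllib.request

ENHANCE_URL = "https://api.sonnalabs.app/api/v1/tts/enhance"
MAX_ENHANCE_CHARS = 2000  # limit stated in the parameter table


def build_enhance_request(api_key, text, voice_id=None):
    """Build (but do not send) the POST request for /tts/enhance."""
    if not text:
        raise ValueError("text must be non-empty")
    # Assumption: the server counts characters the way len() does.
    if len(text) > MAX_ENHANCE_CHARS:
        raise ValueError("text exceeds the 2,000-character Enhance limit")
    payload = {"text": text}
    if voice_id is not None:
        payload["voiceId"] = voice_id  # optional parameter
    return urllib.request.Request(
        ENHANCE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def enhance(api_key, text, voice_id=None, timeout=30):
    """Send the request and return the parsed JSON response body."""
    req = build_enhance_request(api_key, text, voice_id)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the payload logic easy to check without a network call. Non-2xx responses surface as `urllib.error.HTTPError` from `urlopen`.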

Fail-soft

If enhancement fails, the endpoint still returns 200 with the original text and "fallback": true, so your pipeline never breaks. Send the returned text to synthesis either way.
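Because the fail-soft response has the same shape as a successful one, a caller only needs to read `text` and, optionally, note the flag. The sketch below assumes `fallback` is a top-level field of the response body and is absent (or not `true`) on success; the function name `text_for_synthesis` is hypothetical.

```python
def text_for_synthesis(enhance_response):
    """Return (text, used_fallback) from a parsed /enhance response body.

    The text is usable for synthesis in both cases; the flag only tells
    the caller whether audio tags were actually added.
    """
    # Assumption: "fallback" is omitted when enhancement succeeded.
    used_fallback = enhance_response.get("fallback") is True
    return enhance_response["text"], used_fallback
```

Logging the flag is a reasonable way to spot how often narration goes out without tags, without branching the pipeline itself.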

Enhance a multi-speaker dialogue

This endpoint enhances every turn in a single LLM round-trip, so the tags chosen for each turn take the surrounding conversation into account. Pair it with Multi-Speaker Dialogue.

Endpoint

POST https://api.sonnalabs.app/api/v1/tts/enhance-dialogue

Parameter	Type	Required	Description
`turns`	object[]	Yes	Array of `{ id, text, voiceId? }`. Total text across all turns ≤ 2,000 chars. `id` is echoed back so you can match turns.

{
  "turns": [
    {
      "id": "1",
      "text": "Have you tried the new model?",
      "voiceId": "voice-a"
    },
    {
      "id": "2",
      "text": "Just got it! The clarity is amazing.",
      "voiceId": "voice-b"
    }
  ]
}

Response (200 OK)

{
  "turns": [
    { "id": "1", "text": "[excited] Have you tried the new model?" },
    { "id": "2", "text": "[amazed] Just got it! The clarity is amazing." }
  ]
}
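The response carries only `id` and `text`, so a caller that wants to keep each turn's `voiceId` for later synthesis has to merge the enhanced text back by `id`. The following is a sketch under that reading; the helper names are hypothetical, and the local total-length check assumes the limit counts Python string characters.

```python
MAX_TOTAL_CHARS = 2000  # combined limit across all turns, per the table


def build_dialogue_payload(turns):
    """Validate turns and build the JSON body for /tts/enhance-dialogue."""
    if not turns:
        raise ValueError("turns must be a non-empty list")
    total = sum(len(t["text"]) for t in turns)
    if total > MAX_TOTAL_CHARS:
        raise ValueError(f"total text is {total} chars; limit is {MAX_TOTAL_CHARS}")
    payload_turns = []
    for t in turns:
        item = {"id": t["id"], "text": t["text"]}
        if t.get("voiceId"):
            item["voiceId"] = t["voiceId"]  # optional per turn
        payload_turns.append(item)
    return {"turns": payload_turns}


def merge_enhanced_turns(original_turns, response):
    """Copy enhanced text onto the original turns, matched by id.

    Turns missing from the response keep their original text.
    """
    enhanced = {t["id"]: t["text"] for t in response["turns"]}
    return [{**t, "text": enhanced.get(t["id"], t["text"])} for t in original_turns]
```

Matching by `id` rather than by position avoids relying on the response preserving turn order, which the page does not state.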

Errors

Status	Code	Reason
400	—	`text` (or `turns`) missing/empty
400	`TEXT_TOO_LONG`	Text exceeds the 2,000-character Enhance limit
401	—	Missing or invalid API key
429	—	Rate limit exceeded (shares the 30/min speech limit)
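One way to act on the table above is to map each status to a caller-side action. This is only a sketch: the action labels and the function name `classify_enhance_error` are invented for illustration, and because the page does not show the error body, the caller is assumed to extract the `TEXT_TOO_LONG` code itself and pass it in.

```python
def classify_enhance_error(status, code=None):
    """Map an Enhance error status (and optional error code) to an action."""
    if status == 400 and code == "TEXT_TOO_LONG":
        return "shorten_text"      # over the 2,000-character limit
    if status == 400:
        return "fix_request"       # text or turns missing/empty
    if status == 401:
        return "check_api_key"     # missing or invalid key
    if status == 429:
        return "retry_later"       # shared 30/min speech limit hit
    return "unexpected"            # status not listed in the table
```

Only the 429 case is worth retrying automatically; the 400 and 401 cases will fail the same way until the request or key is corrected.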

No credits charged

Enhance does not deduct credits. Only the subsequent synthesis call costs credits.
