Quick Start
Get started with Sonna in 5 minutes
This guide will walk you through creating your first voice profile and generating speech.
Prerequisites
Make sure you have installed Sonna and launched the app.
Step 1: Create a Voice Profile
Voice profiles are the foundation of Sonna. Each profile contains voice samples that the AI uses to clone the voice.
Click the Profiles tab in the sidebar
Click the + New Profile button
Fill in the details:
- Name: A descriptive name (e.g., "John Smith")
- Language: Select the primary language
- Description: Optional notes about the voice
Add a voice sample. You have two options:
Option A: Upload Audio
- Click Upload Sample
- Select an audio file (WAV, MP3, or M4A)
- Ideal length: 10-30 seconds of clear speech
Option B: Record Live
- Click Record Sample
- Speak clearly for 10-30 seconds
- Click stop when finished
Click Create Profile to save
For best results, use clean audio with minimal background noise and consistent speaking tone.
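If you want to check a sample's length before uploading it, the 10-30 second recommendation can be verified with a short script. The following is a minimal sketch, not part of Sonna: it assumes an uncompressed WAV file and uses only Python's standard `wave` module. The function names and the `MIN_SECONDS`/`MAX_SECONDS` constants are hypothetical helpers introduced for illustration.

```python
import wave

# Bounds taken from this guide's recommended sample length.
MIN_SECONDS = 10.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration is the frame count divided by frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_ideal_length(seconds: float) -> bool:
    """True when a sample falls inside the recommended 10-30 second range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

For example, `is_ideal_length(wav_duration_seconds("sample.wav"))` tells you whether a local file named `sample.wav` fits the range. MP3 and M4A files would need a different decoder, which this sketch does not cover.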
Step 2: Generate Speech
Now let's use your new voice profile to generate speech.
Click the Generate tab in the sidebar
Choose your newly created profile from the dropdown
Type or paste the text you want to generate:
Hello! This is my first voice generation with Sonna.
Paralinguistic tags like [laugh], [sigh], and [gasp] only work with Chatterbox Turbo. Qwen3-TTS, LuxTTS, Chatterbox Multilingual, and HumeAI TADA will read those tags literally instead of turning them into expressive sounds.
To insert supported tags, select Chatterbox Turbo and type / in the text input to open the tag inserter.
Click Generate and wait a few seconds
The first generation may take longer while the model initializes. Subsequent generations will be faster.
- Click Play to preview the audio
- Click Download to save the audio file
- The generation is also saved to your History
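When preparing text outside the app, it can help to catch paralinguistic tags that the selected engine would read aloud as plain text. The sketch below only encodes the rule from the note in this step: the tag names and engine names come from this guide, while `find_tags` and `tags_read_literally` are hypothetical helpers, not Sonna functions. The tag list here may not be exhaustive.

```python
import re

# Tags and engine name exactly as listed in this guide (assumed, not exhaustive).
KNOWN_TAGS = {"laugh", "sigh", "gasp"}
TAG_ENGINE = "Chatterbox Turbo"

# Matches a lowercase word wrapped in square brackets, e.g. [laugh].
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def find_tags(text: str) -> list[str]:
    """Return known paralinguistic tags in order of appearance."""
    return [name for name in TAG_PATTERN.findall(text) if name in KNOWN_TAGS]


def tags_read_literally(text: str, engine: str) -> list[str]:
    """Return the tags that the chosen engine would speak as plain text."""
    if engine == TAG_ENGINE:
        return []  # Chatterbox Turbo renders these as expressive sounds.
    return find_tags(text)
```

A non-empty result from `tags_read_literally` suggests either switching to Chatterbox Turbo or removing the tags before generating.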
Step 3: Build a Story (Optional)
The Stories Editor lets you create multi-voice narratives with a timeline-based interface.
Navigate to Stories and click + New Story
Click + Add Track to create tracks for different speakers
- Drag generated audio from your History
- Or generate new clips directly in the timeline
- Arrange clips on the timeline
- Trim clips by dragging edges
- Adjust timing and spacing
- Click Export to render the final audio
What's Next?
Voice Cloning Guide
Learn advanced techniques for high-quality voice cloning
API Integration
Integrate Sonna into your own applications
Stories Editor
Master the multi-track timeline editor
Remote Mode
Connect to a GPU server for faster generation
Tips for Success
Getting the Best Voice Quality
- Use 10-30 seconds of clear, consistent speech
- Avoid background noise and echo
- Multiple samples from the same speaker improve quality
- Match the speaking style you want to generate
Improving Generation Speed
- Use a CUDA-capable GPU for 5-10x faster generation
- Enable voice prompt caching for repeated generations
- Consider running the backend on a remote GPU server
Troubleshooting Common Issues
- Server won't start: Check if port 17493 is available
- Poor audio quality: Try adding more voice samples
- Slow generation: Verify GPU acceleration is enabled
- See the full Troubleshooting Guide for more
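To check whether port 17493 is available, you can try binding to it yourself. This is a minimal sketch using Python's standard `socket` module. It assumes the server listens on the local loopback address (`127.0.0.1`), which this guide does not state, and `port_is_free` is a hypothetical helper rather than anything Sonna provides.

```python
import socket

SONNA_PORT = 17493  # Port named in the troubleshooting tip above.


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError when another process holds the port.
            sock.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    state = "free" if port_is_free(SONNA_PORT) else "already in use"
    print(f"Port {SONNA_PORT} is {state}")
```

Run the script while Sonna is closed. If it reports the port as already in use, another process is holding it and the server will likely fail to start until that process exits.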