Rate Limits

To protect our servers and external AI providers from abuse and denial-of-service attempts, Sonna enforces strict rate limits on all API routes.

Rate Limit Architecture

Sonna uses a Redis-backed token bucket rate limiter.

Rate limits are calculated atomically via a Redis Lua script.
Counters are stored in Redis, so every running PM2 cluster node evaluates requests against the same shared state.
Limits are evaluated either by User ID (for authenticated endpoints) or by IP Address (for public auth routes).
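
To make the token bucket idea concrete, here is a minimal in-memory sketch. It is not Sonna's Redis Lua script: the class name, the `bucket_key` helper, and the key format are hypothetical, and the refill rate of 0.5 tokens per second in the usage note is only one way to model "30 requests per minute".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenBucket:
    """In-memory token bucket (illustrative; the real limiter runs in Redis)."""
    capacity: float           # maximum number of tokens the bucket can hold
    refill_per_second: float  # tokens added back per second
    tokens: float             # tokens currently available
    updated_at: float         # time of the last refill, in seconds

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        # Allow the request only if enough tokens remain.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def bucket_key(user_id: Optional[str], ip_address: str) -> str:
    # Hypothetical key naming: user ID when authenticated, otherwise client IP.
    if user_id:
        return f"ratelimit:user:{user_id}"
    return f"ratelimit:ip:{ip_address}"
```

For example, `TokenBucket(capacity=30, refill_per_second=0.5, tokens=30, updated_at=0.0)` admits a burst of 30 calls and then roughly one more every two seconds. In a multi-process deployment the read, refill, and decrement must happen as one atomic step, which is the role the Lua script plays.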

If you exceed a limit, the server responds with a 429 Too Many Requests status code.

Endpoint Limits

The pre-configured limits for each class of API endpoint are as follows:

Limit Class	Window	Request Limit	Applied to	Target Key
Speech Generation	1 minute	30 requests	TTS synthesis endpoints (`/api/v1/tts/synthesize`, `/api/v1/tts/synthesize-dialogue`)	User ID
Billing & Pricing	1 minute	10 requests	Verification, purchase restore, and pricing queries	User ID
Check Renewal	1 minute	4 requests	Plan renewal/expiry checking at startup	User ID
Authentication	5 minutes	10 requests	Login and token exchange endpoints (`/api/auth/google`, `/api/auth/mobile-login`)	IP Address
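
A client can use the values in the table above to pace its own requests. The sketch below is one possible approach: the `LimitClass` type and the dictionary keys are illustrative names, not part of the Sonna API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitClass:
    window_seconds: int
    max_requests: int

    def min_interval(self) -> float:
        # Spacing requests at least this far apart should stay within the quota.
        return self.window_seconds / self.max_requests


# Values copied from the table above; the keys are illustrative names.
LIMITS = {
    "speech_generation": LimitClass(window_seconds=60, max_requests=30),
    "billing_pricing": LimitClass(window_seconds=60, max_requests=10),
    "check_renewal": LimitClass(window_seconds=60, max_requests=4),
    "authentication": LimitClass(window_seconds=300, max_requests=10),
}
```

With these numbers, `LIMITS["speech_generation"].min_interval()` is 2.0 seconds and `LIMITS["authentication"].min_interval()` is 30.0 seconds.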

Rate Limit Headers

Sonna returns the conventional X-RateLimit-* headers on all rate-limited endpoints so that you can track your remaining quota programmatically:

HTTP/1.1 200 OK
X-RateLimit-Limit: 30
X-RateLimit-Remaining: 27
X-RateLimit-Reset: 1718000523

X-RateLimit-Limit: The total request quota allowed in the current window.
X-RateLimit-Remaining: The remaining number of requests allowed in the current window.
X-RateLimit-Reset: The Unix epoch timestamp (in seconds) indicating when the current window resets and the quota is replenished.
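
As a rough illustration, a client could read these three headers as follows. The function and class names are hypothetical, and the code assumes headers arrive as a plain string-to-string mapping.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int        # X-RateLimit-Limit
    remaining: int    # X-RateLimit-Remaining
    reset_epoch: int  # X-RateLimit-Reset, Unix time in seconds

    def seconds_until_reset(self, now: float) -> float:
        # Never negative, even if the reset time has already passed.
        return max(0.0, self.reset_epoch - now)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_epoch=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        # Headers missing or malformed: report "unknown" rather than guessing.
        return None
```

In practice you would pass `time.time()` as `now`; a client might pause until the reset time once `remaining` reaches zero instead of sending a request that is likely to be rejected.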

Handling Rate Limit Violations (429)

When a rate limit is exceeded, you will receive:

Response Header

HTTP/1.1 429 Too Many Requests
Retry-After: 15

Retry-After: The number of seconds you must wait before making another request.

Response Body

{
  "error": "Too many requests. Please wait."
}
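
The following sketch shows one way to interpret such a response on the client. The function name is hypothetical, and the 1-second fallback used when Retry-After is missing or malformed is an assumption, not documented Sonna behavior.

```python
import json
from typing import Mapping, Optional, Tuple


def describe_429(status: int, headers: Mapping[str, str], body: str) -> Optional[Tuple[float, str]]:
    """Return (seconds_to_wait, error_message) for a 429 response, else None."""
    if status != 429:
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        wait = float(lowered.get("retry-after", ""))
    except ValueError:
        wait = 1.0  # assumed fallback when the header is missing or malformed
    try:
        message = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        message = ""  # body was not a JSON object
    return max(0.0, wait), message
```

Called with the example response above, this returns `(15.0, "Too many requests. Please wait.")`.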

Implementation Advice

When your client receives an HTTP 429 response, always parse the Retry-After header and wait at least that many seconds before retrying. If requests continue to be rejected, retry with exponential backoff rather than at a fixed interval.
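
A minimal retry loop along these lines might look like the sketch below. Everything here is illustrative: `send` stands for whatever function performs your HTTP call and returns `(status, headers, body)`, and the base delay, cap, attempt count, and added random jitter are assumed defaults rather than values recommended by Sonna.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def backoff_delay(attempt: int, retry_after: Optional[float],
                  base: float = 1.0, cap: float = 60.0) -> float:
    # Exponential backoff: base * 2^attempt, capped, and never shorter than
    # the server's Retry-After value.
    exponential = min(cap, base * (2 ** attempt))
    return max(exponential, retry_after or 0.0)


def call_with_retries(send: Callable[[], Response],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep,
                      jitter: Callable[[], float] = random.random) -> Response:
    for attempt in range(max_attempts):
        status, headers, body = send()
        # Return on any non-429 response, or when attempts are exhausted.
        if status != 429 or attempt == max_attempts - 1:
            return status, headers, body
        raw = {k.lower(): v for k, v in headers.items()}.get("retry-after")
        try:
            retry_after = float(raw) if raw is not None else None
        except ValueError:
            retry_after = None
        # Up to one second of jitter spreads out clients that retry together.
        sleep(backoff_delay(attempt, retry_after) + jitter())
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` and `jitter` parameters are injectable so the loop can be tested without real delays. If the final attempt is still rejected, the 429 response is returned to the caller rather than raised.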