Kitta AI API documentation and playground
Text to Speech (HTTP)
Convert text to speech using the HTTP API.
Text to Speech API
Endpoint
POST /api/open/tts
Request Headers
// JSON Format
Content-Type: application/json
Authorization: Bearer YOUR_API_TOKEN // API Key

// MessagePack Format
Content-Type: application/msgpack
Authorization: Bearer YOUR_API_TOKEN // API Key
Request Parameters
{
"reference_id": string, // Required, voice model ID
"text": string, // Required, text to convert
"speed": number, // Optional, speech speed, range: 0.5-2.0, default: 1
"volume": number, // Optional, volume, range: -20-20, default: 0
"version": string, // Optional, TTS version. Available: "v1", "v2", "s1" (traditional), "v3-turbo", "v3-hd" (v3), default: "v1"
"format": string, // Optional, audio format. Available: "mp3", "wav", "pcm", default: "mp3"
"emotion": string, // Optional, emotion control (v3 only). Available: "happy","sad","angry","fearful","disgusted","surprised","calm","auto", default: "auto"
"language": string, // Optional, language enhancement (v3 only). Available: "auto","zh","en", default: "auto"
"cache": boolean // Optional, false returns audio stream, true returns audio URL, default: false
}
Version Notes:
- Legacy versions: v1, v2, s1 (basic text-to-speech functionality)
- V3 versions: v3-turbo, v3-hd (advanced features, including emotion control and language enhancement)
- The system automatically selects the matching version based on the model configuration, so specifying it manually is not required
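The parameter ranges and version rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API itself: `build_tts_payload` and its argument names are hypothetical, and rejecting `emotion` or `language` on legacy versions is this sketch's own choice, since the documentation only states that those fields are "v3 only".

```python
from typing import Any, Dict, Optional

# Allowed values copied from the request parameter list.
LEGACY_VERSIONS = {"v1", "v2", "s1"}
V3_VERSIONS = {"v3-turbo", "v3-hd"}
FORMATS = {"mp3", "wav", "pcm"}
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted",
            "surprised", "calm", "auto"}
LANGUAGES = {"auto", "zh", "en"}


def build_tts_payload(
    reference_id: str,
    text: str,
    speed: float = 1.0,
    volume: float = 0,
    version: str = "v1",
    audio_format: str = "mp3",
    emotion: Optional[str] = None,
    language: Optional[str] = None,
    cache: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments against the documented ranges."""
    if not reference_id or not text:
        raise ValueError("reference_id and text are required")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not -20 <= volume <= 20:
        raise ValueError("volume must be between -20 and 20")
    if version not in LEGACY_VERSIONS | V3_VERSIONS:
        raise ValueError(f"unknown version: {version}")
    if audio_format not in FORMATS:
        raise ValueError(f"unknown format: {audio_format}")

    payload: Dict[str, Any] = {
        "reference_id": reference_id,
        "text": text,
        "speed": speed,
        "volume": volume,
        "version": version,
        "format": audio_format,
        "cache": cache,
    }

    # emotion and language are documented as v3 only. Failing early on legacy
    # versions is an assumption of this sketch, not documented server behavior.
    optional_v3 = (("emotion", emotion, EMOTIONS),
                   ("language", language, LANGUAGES))
    for name, value, allowed in optional_v3:
        if value is None:
            continue
        if version not in V3_VERSIONS:
            raise ValueError(f"{name} requires a v3 version")
        if value not in allowed:
            raise ValueError(f"unknown {name}: {value}")
        payload[name] = value
    return payload
```

For example, `build_tts_payload("your_model_id", "Hello", version="v3-hd", emotion="calm")` returns a dictionary ready to be serialized as the request body, while `speed=3.0` raises `ValueError` before any network call is made.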
Response Data
// Success Response (cache=false) - 200
Content-Type: audio/mpeg
<Binary audio data>
// Success Response (cache=true) - 200
Content-Type: application/json
{
"success": boolean, // Whether successful
"audio_url": string, // Audio file URL
"format": string, // Audio format
"characters_used": number, // Characters used
"quota_remaining": number // Remaining API credits
}
// Error Response
{
"error": string // Error message
}
CURL Example
# JSON Format - Traditional version (using s1 version, recommended)
curl -X POST https://fishaudio.net/api/open/tts \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-d '{
"reference_id": "your_model_id",
"text": "Text content to convert",
"speed": 1.0,
"volume": 0,
"version": "s1",
"format": "mp3",
"cache": false
}' \
--output output.mp3
# JSON Format - V3 model (using HD version, supports emotion control and language enhancement)
curl -X POST https://fishaudio.net/api/open/tts \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-d '{
"reference_id": "your_model_id",
"text": "Text content to convert",
"speed": 1.0,
"volume": 0,
"version": "v3-hd",
"emotion": "calm",
"language": "zh",
"format": "mp3",
"cache": false
}' \
--output output.mp3
# MessagePack Format: send the same fields encoded as MessagePack, with Content-Type: application/msgpack
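The JSON requests above could be issued from Python roughly as follows. This is a minimal sketch using only the standard library: the helper names (`build_tts_request`, `parse_tts_response`, `synthesize`) are hypothetical, the host is copied from the curl examples, and treating any non-JSON success response as raw audio is an assumption based on the two documented response shapes.

```python
import json
import urllib.request
from typing import Any, Dict, Tuple

# Endpoint as it appears in the curl examples.
TTS_URL = "https://fishaudio.net/api/open/tts"


def build_tts_request(payload: Dict[str, Any],
                      api_token: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the TTS endpoint."""
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


def parse_tts_response(content_type: str, body: bytes) -> Tuple[str, Any]:
    """Return ("url", audio_url) for cache=true or ("audio", bytes) otherwise."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        data = json.loads(body.decode("utf-8"))
        if "error" in data:
            raise RuntimeError(data["error"])
        return "url", data["audio_url"]
    # Assumption: anything that is not JSON is the binary audio stream.
    return "audio", body


def synthesize(payload: Dict[str, Any], api_token: str,
               out_path: str = "output.mp3") -> str:
    """Send the request; save streamed audio to out_path or return the URL."""
    request = build_tts_request(payload, api_token)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx status codes.
    with urllib.request.urlopen(request) as response:
        content_type = response.headers.get("Content-Type", "")
        kind, value = parse_tts_response(content_type, response.read())
    if kind == "audio":
        with open(out_path, "wb") as fh:
            fh.write(value)
        return out_path
    return value
```

Calling `synthesize({"reference_id": "your_model_id", "text": "Text content to convert", "version": "s1"}, "YOUR_API_TOKEN")` would write `output.mp3`. With `"cache": true` in the payload it would return the audio URL instead.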
Status Code Description
200 OK - Request successful
400 Bad Request - Invalid request parameters
401 Unauthorized - Invalid API Token
403 Forbidden - Access forbidden
404 Not Found - Resource not found
413 Payload Too Large - Upload file too large
429 Too Many Requests - Rate limit exceeded or insufficient credits
500 Server Error - Internal server error
Error Response Format:
{
"error": string, // Error message
"details": string, // Detailed error message (optional)
"code": string // Error code (optional)
}
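A client might turn the status codes and error format above into a readable message and a retry decision. The sketch below is illustrative only: `describe_error` and `should_retry` are hypothetical names, and the retry policy is an assumption, since the documentation does not say which errors are transient.

```python
import json

# Reason phrases copied from the status code list.
STATUS_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    413: "Payload Too Large",
    429: "Too Many Requests",
    500: "Server Error",
}

# Assumption: only these are worth retrying. Note that 429 also covers
# insufficient credits, which a retry will not fix.
RETRYABLE_STATUSES = {429, 500}


def describe_error(status: int, body: bytes) -> str:
    """Combine the status code with the optional JSON error fields."""
    parts = [f"{status} {STATUS_MEANINGS.get(status, 'Unknown')}"]
    try:
        data = json.loads(body.decode("utf-8"))
    except ValueError:  # covers invalid UTF-8 and invalid JSON
        data = {}
    if not isinstance(data, dict):
        data = {}
    if data.get("error"):
        parts.append(str(data["error"]))
    if data.get("code"):
        parts.append(f"code={data['code']}")
    if data.get("details"):
        parts.append(f"details={data['details']}")
    return " | ".join(parts)


def should_retry(status: int) -> bool:
    """Return True for statuses this sketch treats as transient."""
    return status in RETRYABLE_STATUSES
```

Because `details` and `code` are optional, the helper includes them only when they are present, and it falls back to the bare status line when the body is not valid JSON.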