Text to Speech (HTTP)

Convert text to speech using the HTTP API.

Text to Speech API

Endpoint

POST /api/open/tts

Request Headers

// JSON Format
Content-Type: application/json
Authorization: Bearer YOUR_API_TOKEN  // API Key

// MessagePack Format
Content-Type: application/msgpack
Authorization: Bearer YOUR_API_TOKEN  // API Key

Request Parameters

{
  "reference_id": string,  // Required, voice model ID
  "text": string,          // Required, text to convert
  "speed": number,         // Optional, speech speed, range: 0.5 to 2.0, default: 1
  "volume": number,        // Optional, volume, range: -20 to 20, default: 0
  "version": string,       // Optional, TTS version. Available: "v1", "v2", "s1" (legacy), "v3-turbo", "v3-hd" (v3), default: "v1"
  "format": string,        // Optional, audio format. Available: "mp3", "wav", "pcm", default: "mp3"
  "emotion": string,       // Optional, emotion control (v3 only). Available: "happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm", "auto", default: "auto"
  "language": string,      // Optional, language enhancement (v3 only). Available: "auto", "zh", "en", default: "auto"
  "cache": boolean         // Optional, false returns audio stream, true returns audio URL, default: false
}
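
The parameter table above can be checked on the client before a request is sent. The following is a minimal sketch in Python, not part of the official API: the function name `build_tts_payload` and the `audio_format` argument are illustrative, and treating the documented ranges as inclusive is an assumption.

```python
from typing import Any, Dict

# Values copied from the parameter list above.
ALLOWED_VERSIONS = {"v1", "v2", "s1", "v3-turbo", "v3-hd"}
ALLOWED_FORMATS = {"mp3", "wav", "pcm"}


def build_tts_payload(
    reference_id: str,
    text: str,
    speed: float = 1.0,
    volume: float = 0,
    version: str = "v1",
    audio_format: str = "mp3",
    cache: bool = False,
) -> Dict[str, Any]:
    """Validate the documented ranges and return a JSON-serializable body.

    Hypothetical helper: range bounds are assumed to be inclusive.
    """
    if not reference_id or not text:
        raise ValueError("reference_id and text are required")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if not -20 <= volume <= 20:
        raise ValueError("volume must be between -20 and 20")
    if version not in ALLOWED_VERSIONS:
        raise ValueError(f"unsupported version: {version}")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format}")
    return {
        "reference_id": reference_id,
        "text": text,
        "speed": speed,
        "volume": volume,
        "version": version,
        "format": audio_format,  # sent under the documented "format" key
        "cache": cache,
    }
```

Rejecting out-of-range values locally avoids spending a request on a call that would likely return 400 Bad Request.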

Version Notes:

  • Legacy versions: v1, v2, s1 (basic text-to-speech functionality)
  • V3 versions: v3-turbo, v3-hd (advanced features, including emotion control and language enhancement)
  • The system automatically selects the matching version based on the model configuration, so specifying it manually is not required
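
Because emotion and language are documented as v3-only, a client may want to catch a mismatch before sending. The sketch below is one possible approach in Python; the helper name `v3_options` is hypothetical, and this page does not say how the server treats these fields on legacy versions, so rejecting them locally is a design choice rather than documented behavior.

```python
from typing import Dict, Optional

# Values copied from the parameter list and version notes above.
V3_VERSIONS = {"v3-turbo", "v3-hd"}
EMOTIONS = {"happy", "sad", "angry", "fearful",
            "disgusted", "surprised", "calm", "auto"}
LANGUAGES = {"auto", "zh", "en"}


def v3_options(
    version: str,
    emotion: Optional[str] = None,
    language: Optional[str] = None,
) -> Dict[str, str]:
    """Return the emotion/language fields to merge into a request body.

    Hypothetical helper: raises if v3-only fields are combined with a
    legacy version, or if a value is outside the documented set.
    """
    requested = {"emotion": emotion, "language": language}
    requested = {k: v for k, v in requested.items() if v is not None}
    if requested and version not in V3_VERSIONS:
        raise ValueError(
            f"{sorted(requested)} require a v3 version, got {version!r}"
        )
    if emotion is not None and emotion not in EMOTIONS:
        raise ValueError(f"unsupported emotion: {emotion}")
    if language is not None and language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    return requested
```

The returned dictionary is empty when neither field is set, so it can be merged into any request body unconditionally.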

Response Data

// Success Response (cache=false) - 200
Content-Type: audio/mpeg
<Binary audio data>

// Success Response (cache=true) - 200
Content-Type: application/json
{
  "success": boolean,        // Whether successful
  "audio_url": string,       // Audio file URL
  "format": string,          // Audio format
  "characters_used": number, // Characters used
  "quota_remaining": number  // Remaining API credits
}

// Error Response
{
  "error": string     // Error message
}
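
Since the response is either binary audio or JSON depending on `cache`, a client can branch on the `Content-Type` header. The following Python sketch illustrates that branching under stated assumptions: the function name `parse_tts_response` is hypothetical, error bodies are assumed to be JSON objects with an `error` field as shown above, and any non-JSON 200 response is treated as audio.

```python
import json
from typing import Union


def parse_tts_response(
    status: int, content_type: str, body: bytes
) -> Union[bytes, dict]:
    """Return raw audio bytes (cache=false) or the decoded JSON object (cache=true).

    Hypothetical helper: raises RuntimeError for error responses.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        data = json.loads(body.decode("utf-8"))
        if status != 200 or "error" in data:
            raise RuntimeError(data.get("error", f"HTTP {status}"))
        return data
    if status != 200:
        raise RuntimeError(f"HTTP {status}")
    # Assumed to be the binary audio stream.
    return body
```

With `cache` set to true, the caller would then read `audio_url` from the returned dictionary and download the file separately.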

CURL Example

# JSON Format - Legacy version (s1, recommended)
curl -X POST https://fishaudio.net/api/open/tts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "reference_id": "your_model_id",
    "text": "Text content to convert",
    "speed": 1.0,
    "volume": 0,
    "version": "s1",
    "format": "mp3",
    "cache": false
  }' \
  --output output.mp3

# JSON Format - V3 model (v3-hd, supports emotion control and language enhancement)
curl -X POST https://fishaudio.net/api/open/tts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "reference_id": "your_model_id",
    "text": "Text content to convert",
    "speed": 1.0,
    "volume": 0,
    "version": "v3-hd",
    "emotion": "calm",
    "language": "zh",
    "format": "mp3",
    "cache": false
  }' \
  --output output.mp3
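
For readers who prefer a script over curl, the same JSON request can be sketched with Python's standard library. This is an illustrative equivalent, not an official client: the function names `make_tts_request` and `synthesize_to_file` are hypothetical, and the URL is the one used in the curl examples above.

```python
import json
import urllib.request

API_URL = "https://fishaudio.net/api/open/tts"  # taken from the curl examples


def make_tts_request(api_token: str, payload: dict) -> urllib.request.Request:
    """Build the POST request with the JSON headers shown above."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )


def synthesize_to_file(api_token: str, payload: dict, path: str) -> None:
    """Send the request and save the audio stream (cache=false) to path.

    Requires network access; urlopen raises urllib.error.HTTPError
    for non-2xx status codes.
    """
    req = make_tts_request(api_token, payload)
    with urllib.request.urlopen(req) as resp, open(path, "wb") as out:
        out.write(resp.read())
```

A call such as `synthesize_to_file("YOUR_API_TOKEN", {"reference_id": "your_model_id", "text": "Text content to convert", "version": "s1"}, "output.mp3")` would mirror the first curl example.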

Status Code Description

200 OK                 - Request successful
400 Bad Request        - Invalid request parameters
401 Unauthorized       - Invalid API token
403 Forbidden          - Access forbidden
404 Not Found          - Resource not found
413 Payload Too Large  - Uploaded file too large
429 Too Many Requests  - Rate limit exceeded or insufficient credits
500 Server Error       - Internal server error

Error Response Format:
{
  "error": string,    // Error message
  "details": string,  // Detailed error message (optional)
  "code": string      // Error code (optional)
}
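
The status codes and error format above can be combined into a single exception type on the client. The Python sketch below shows one way to do this; `TtsApiError` and `error_from_response` are hypothetical names, and falling back to the status-code description when the body is missing or not JSON is an assumption about how a client might behave, not documented server behavior.

```python
import json
from typing import Optional

# Descriptions copied from the status code table above.
STATUS_MEANINGS = {
    400: "Invalid request parameters",
    401: "Invalid API token",
    403: "Access forbidden",
    404: "Resource not found",
    413: "Uploaded file too large",
    429: "Rate limit exceeded or insufficient credits",
    500: "Internal server error",
}


class TtsApiError(Exception):
    """Hypothetical exception carrying the documented error fields."""

    def __init__(self, status: int, message: str,
                 details: Optional[str] = None, code: Optional[str] = None):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.details = details  # optional per the error format
        self.code = code        # optional per the error format


def error_from_response(status: int, body: bytes) -> TtsApiError:
    """Build a TtsApiError from a status code and raw response body."""
    try:
        data = json.loads(body.decode("utf-8"))
    except ValueError:  # covers UnicodeDecodeError and JSONDecodeError
        data = {}
    if not isinstance(data, dict):
        data = {}
    message = data.get("error") or STATUS_MEANINGS.get(status, "Unknown error")
    return TtsApiError(status, message, data.get("details"), data.get("code"))
```

Because 429 covers both rate limiting and insufficient credits, a caller may want to inspect `message` or `code` before deciding whether a retry makes sense.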