Voices & Voice Cloning API

List available voices, clone your own from a short sample, or design a brand-new voice from a text description — then use it in any TTS call. Covers Minimax (clone / design / system voices) and ElevenLabs custom voices.

Endpoints at a glance

List system voices: GET /minimax/voices · /audio/voices
List my voices: GET /voices?source=account
Upload sample: POST /minimax/files
Clone voice: POST /minimax/voice/clone
Design voice: POST /minimax/voice/design
Custom voice (EL): POST /audio/voices/custom

⚠️ Billing: the first time a cloned / designed voice is used for TTS, a one-time activation fee (¥9.9) applies. System preset voices have no such fee.

1. List voices

Grab a voice_id and drop it into a TTS call. System voices are free and ready to use; source=account returns the voices you cloned / designed / created.

# Minimax system voices (optionally filter by language)
curl "https://apimodels.app/api/v1/minimax/voices?language=English" \
  -H "Authorization: Bearer $API_KEY"

# ElevenLabs / preset voices
curl "https://apimodels.app/api/v1/audio/voices?pageNum=1&pageSize=30" \
  -H "Authorization: Bearer $API_KEY"

# Voices on your own account (cloned / designed / custom)
curl "https://apimodels.app/api/v1/voices?source=account" \
  -H "Authorization: Bearer $API_KEY"

Response (Minimax example)

{
  "voices": [
    { "voice_id": "English_Graceful_Lady", "name": "Graceful Lady", "language": "English" },
    { "voice_id": "male-qn-qingse", "name": "Qingse", "language": "Chinese" }
  ],
  "total": 2
}
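
As a minimal Python sketch of how a client might consume this response, the helper below filters the voices array by language and returns the matching voice_id values. The function name voice_ids_for_language is hypothetical, and the payload is the sample shown above rather than live API output.

```python
import json

# Sample payload copied from the Minimax response example above.
SAMPLE_RESPONSE = """
{
  "voices": [
    { "voice_id": "English_Graceful_Lady", "name": "Graceful Lady", "language": "English" },
    { "voice_id": "male-qn-qingse", "name": "Qingse", "language": "Chinese" }
  ],
  "total": 2
}
"""


def voice_ids_for_language(payload: dict, language: str) -> list:
    """Return the voice_id of every voice whose language matches (hypothetical helper)."""
    return [
        voice["voice_id"]
        for voice in payload.get("voices", [])
        if voice.get("language") == language
    ]


payload = json.loads(SAMPLE_RESPONSE)
english_ids = voice_ids_for_language(payload, "English")
print(english_ids)
```

Any returned voice_id can then be placed in voice_setting.voice_id of a TTS request.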

2. Clone a voice (3 steps)

Prepare a clean voice sample (10 s to 5 min long, no background noise). The flow has three steps: upload the sample to get a file_id, clone it into a voice_id of your choosing, then use that voice_id in TTS.

Step 1: upload the sample

# Step 1 — upload your voice sample (10s-5min clean audio)
curl -X POST https://apimodels.app/api/v1/minimax/files \
  -H "Authorization: Bearer $API_KEY" \
  -F "purpose=voice_clone" \
  -F "file=@my-voice-sample.mp3"
# -> { "file": { "file_id": 123456789, ... } }

Step 2: clone

# Step 2 — clone it into a new voice_id (must start with a letter, >= 8 chars)
curl -X POST https://apimodels.app/api/v1/minimax/voice/clone \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": 123456789,
    "voice_id": "myBrandVoice01"
  }'

Field	Required	Type	Description
file_id	yes	number	The file ID returned by the step-1 upload
voice_id	yes	string	Your new voice ID — must start with a letter, >= 8 chars
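
A client-side check of the voice_id rule can catch a bad ID before the request is sent. The sketch below covers only the two rules stated in the table (starts with a letter, at least 8 characters). The helper name is_valid_voice_id is hypothetical, and treating "letter" as an ASCII letter is an assumption; the API may enforce further constraints not listed here.

```python
def is_valid_voice_id(voice_id: str) -> bool:
    """Check the two documented rules for a new voice_id (hypothetical helper).

    Assumption: "letter" means an ASCII letter (A-Z or a-z).
    """
    if len(voice_id) < 8:
        return False
    first = voice_id[0]
    return first.isascii() and first.isalpha()


for candidate in ["myBrandVoice01", "voice01", "1brandVoice"]:
    print(candidate, is_valid_voice_id(candidate))
```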

Step 3: synthesize with the cloned voice

# Step 3 — use the cloned voice in TTS
curl -X POST https://apimodels.app/api/v1/minimax/tts \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-speech-02-turbo",
    "text": "This sentence is spoken in my own cloned voice.",
    "voice_setting": { "voice_id": "myBrandVoice01", "speed": 1.0 }
  }'

Full 3-step flow (Python / Node)

Python

import os

import requests

API = "https://apimodels.app/api/v1"
H = {"Authorization": f"Bearer {os.environ['API_KEY']}"}

# 1) Upload the sample (the with-block closes the file handle afterwards)
with open("my-voice-sample.mp3", "rb") as sample:
    upload = requests.post(
        f"{API}/minimax/files",
        headers=H,
        data={"purpose": "voice_clone"},
        files={"file": sample},
    )
upload.raise_for_status()  # fail early on an HTTP error status
file_id = upload.json()["file"]["file_id"]

# 2) Clone -> your new voice_id (json= already sets the JSON Content-Type)
clone = requests.post(
    f"{API}/minimax/voice/clone",
    headers=H,
    json={"file_id": file_id, "voice_id": "myBrandVoice01"},
)
clone.raise_for_status()  # do not continue to TTS if cloning failed

# 3) Speak with it
audio = requests.post(
    f"{API}/minimax/tts",
    headers=H,
    json={
        "model": "minimax-speech-02-turbo",
        "text": "This sentence is spoken in my own cloned voice.",
        "voice_setting": {"voice_id": "myBrandVoice01"},
    },
)
audio.raise_for_status()
print(audio.json())

Node.js

import fs from "node:fs";

const API = "https://apimodels.app/api/v1";
const H = { Authorization: `Bearer ${process.env.API_KEY}` };

// 1) Upload the sample
const form = new FormData();
form.append("purpose", "voice_clone");
form.append("file", new Blob([fs.readFileSync("my-voice-sample.mp3")]), "sample.mp3");
const { file } = await fetch(`${API}/minimax/files`, { method: "POST", headers: H, body: form })
  .then((r) => r.json());

// 2) Clone -> your new voice_id
await fetch(`${API}/minimax/voice/clone`, {
  method: "POST",
  headers: { ...H, "Content-Type": "application/json" },
  body: JSON.stringify({ file_id: file.file_id, voice_id: "myBrandVoice01" }),
});

// 3) Speak with it
const audio = await fetch(`${API}/minimax/tts`, {
  method: "POST",
  headers: { ...H, "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "minimax-speech-02-turbo",
    text: "This sentence is spoken in my own cloned voice.",
    voice_setting: { voice_id: "myBrandVoice01" },
  }),
}).then((r) => r.json());
console.log(audio);

3. Design a voice (no sample)

Generate a brand-new voice from a text description alone — handy when you have no recording. You get a voice_id back to use in TTS.

# Create a brand-new voice from a text description (no sample needed)
curl -X POST https://apimodels.app/api/v1/minimax/voice/design \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A warm, calm middle-aged male narrator with a slight British accent",
    "preview_text": "Welcome to the evening news.",
    "voice_id": "designedVoice01"
  }'

Field	Required	Type	Description
prompt	yes	string	Text description of the target voice (gender, age, accent, tone…)
preview_text	no	string	Text used to render a preview
voice_id	yes	string	Your new voice ID
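
The same design call can be sketched in Python. To keep the example dependency-free it uses the standard library's urllib instead of requests, and it only builds the request object so the body and headers can be inspected before anything is sent. The function name build_design_request is hypothetical; the endpoint and field names are the ones listed above.

```python
import json
import urllib.request
from typing import Optional

API = "https://apimodels.app/api/v1"


def build_design_request(
    api_key: str,
    prompt: str,
    voice_id: str,
    preview_text: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /minimax/voice/design (hypothetical helper)."""
    body = {"prompt": prompt, "voice_id": voice_id}
    if preview_text is not None:
        # preview_text is optional, so it is only included when provided.
        body["preview_text"] = preview_text
    return urllib.request.Request(
        f"{API}/minimax/voice/design",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_design_request(
    "test-key",
    "A warm, calm middle-aged male narrator with a slight British accent",
    "designedVoice01",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually send it, pass the request to urllib.request.urlopen with a real API key. The response body is not parsed here because its shape is not shown in this section.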

4. Custom voice (ElevenLabs)

Create an ElevenLabs custom voice from a public audio URL (or upload first via the Files API). Provide either voice_url or video_id.

# ElevenLabs-style custom voice from a hosted audio URL (or video_id)
curl -X POST https://apimodels.app/api/v1/audio/voices/custom \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice_name": "My Narrator",
    "voice_url": "https://r2.apimodels.app/uploads/.../sample.mp3"
  }'

Field	Required	Type	Description
voice_name	yes	string	Display name for the voice
voice_url	one of	string	Public URL of the sample audio (provide this or video_id)
video_id	one of	string	ID of previously uploaded media (provide this or voice_url)
callback_url	no	string	Webhook called when ready
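
Because voice_url and video_id are alternatives, it can help to validate the payload before posting it. The Python sketch below reads "one of" as exactly one, which is an assumption (the API may accept both and pick one). The helper name build_custom_voice_payload is hypothetical.

```python
from typing import Optional


def build_custom_voice_payload(
    voice_name: str,
    voice_url: Optional[str] = None,
    video_id: Optional[str] = None,
    callback_url: Optional[str] = None,
) -> dict:
    """Build the JSON body for POST /audio/voices/custom (hypothetical helper)."""
    # Assumption: exactly one of voice_url / video_id must be supplied.
    if (voice_url is None) == (video_id is None):
        raise ValueError("provide exactly one of voice_url or video_id")
    payload = {"voice_name": voice_name}
    if voice_url is not None:
        payload["voice_url"] = voice_url
    else:
        payload["video_id"] = video_id
    if callback_url is not None:
        payload["callback_url"] = callback_url
    return payload


# Placeholder URL for illustration only.
example = build_custom_voice_payload(
    "My Narrator", voice_url="https://example.com/sample.mp3"
)
print(example)
```

The resulting dict can be sent as the JSON body of the request shown in the curl example.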

Next: Audio Models API · Files API · Create an API key