Kling Custom Voice

kling-custom-voice

Kling Custom Voice creates a reusable custom voice from an audio sample — upload 5–30 seconds of clean, single-speaker audio (.mp3/.wav/.mp4/.mov) or reference a historical video ID. The resulting voice can be used in Kling TTS and the Kling Lip-Sync models, so a digital human or narration can speak in your proprietary voice and then be lip-synced to video.

Custom Voice, Audio Upload, Video Reference, For TTS/Lip Sync

Per call: $0.006

Last updated: 2026-06-21

TL;DR: Kling Custom Voice is a Kling audio and speech model, callable through the API Models unified API (model name `kling-custom-voice`). Pricing: $0.006 per call. One API key covers all image, video, LLM, and audio models, at 60-95% below official pricing.

About Kling Custom Voice

Kling Custom Voice is an Audio & Speech API provided by Kling. It creates a reusable custom voice from an audio sample: upload 5–30 seconds of clean, single-speaker audio (.mp3/.wav/.mp4/.mov) or reference a historical video ID. The resulting voice can be used in Kling TTS and the Kling Lip-Sync models, so a digital human or narration can speak in your proprietary voice and then be lip-synced to video. Through the API Models platform, you can access this model via a unified API at prices significantly lower than official rates. Current pricing: $0.006 per call.

Key Features

Audio Upload -- Upload .mp3/.wav/.mp4/.mov samples
Video Reference -- Use a historical video ID as source
Voice Profile -- Create reusable voice profiles
Low Cost -- $0.006 per voice creation

Use Cases

Voiceover & Narration

Generate professional-grade voiceovers for videos, animations, and ads with diverse voice options.

Podcast Production

Quickly produce podcast audio content with support for multi-character dialogue.

Audiobook Creation

Convert text content into natural, fluid speech for audiobook production.

Multilingual Dubbing

AI-powered multilingual dubbing and translation to help content reach global audiences.

Why API Models

Unified API -- One API key to access all models, no need to register on multiple platforms
Cost Savings -- 60-95% cheaper than official pricing, ideal for indie developers and startups
Instant Access -- Start using immediately after signup, supports Stripe and Alipay payments
Full Documentation -- Detailed API docs with code examples in cURL, Python, and Node.js

Frequently Asked Questions

How much does Kling Custom Voice cost?

Kling Custom Voice costs $0.006 per call through API Models. This is up to 95% cheaper than official pricing.
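As a rough budgeting aid, the per-call price can be multiplied out for a batch of voice creations. The sketch below assumes each voice creation is billed as exactly one call at the $0.006 rate quoted on this page; the function name `estimate_cost` is illustrative and not part of any API.

```python
from decimal import Decimal

# Price quoted on this page; assumed to apply once per voice creation.
PRICE_PER_CALL_USD = Decimal("0.006")


def estimate_cost(voice_creations: int) -> Decimal:
    """Return the estimated USD cost for a number of voice creations."""
    if voice_creations < 0:
        raise ValueError("voice_creations must be non-negative")
    # Decimal arithmetic avoids binary floating-point rounding on currency.
    return PRICE_PER_CALL_USD * voice_creations


if __name__ == "__main__":
    print(f"250 voices: ${estimate_cost(250)}")
```

Under that assumption, 250 voice creations come to $1.50.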

How do I use the Kling Custom Voice API?

Sign up at API Models, get your API key, and call our unified API endpoint. We provide detailed API documentation with code examples in cURL, Python, and Node.js.
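A minimal sketch of how a request for this model might be assembled is shown here. Only the model name `kling-custom-voice` and the two source options (an uploaded sample or a historical video ID) come from this page. The field names `voice_name`, `audio_url`, and `video_id`, the bearer-token header, and the JSON body shape are all assumptions made for illustration; check the API Models docs for the real parameters and endpoint before sending anything.

```python
import json
from typing import Optional, Tuple


def build_create_voice_request(
    api_key: str,
    voice_name: str,
    *,
    audio_url: Optional[str] = None,
    video_id: Optional[str] = None,
) -> Tuple[dict, str]:
    """Build (headers, JSON body) for a hypothetical create-voice call."""
    if not voice_name.strip():
        raise ValueError("voice_name must not be empty")
    # The page describes two alternative sources; require exactly one.
    if (audio_url is None) == (video_id is None):
        raise ValueError("provide exactly one of audio_url or video_id")

    body = {
        "model": "kling-custom-voice",  # model name from this page
        "voice_name": voice_name,       # hypothetical field name
    }
    if audio_url is not None:
        body["audio_url"] = audio_url   # hypothetical field name
    else:
        body["video_id"] = video_id     # hypothetical field name

    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return headers, json.dumps(body)


if __name__ == "__main__":
    headers, payload = build_create_voice_request(
        "YOUR_API_KEY", "Narrator A", video_id="example-video-id"
    )
    print(payload)
```

The function only builds the request and performs no network call, so it can be paired with whichever HTTP client and endpoint the documentation specifies.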

What is the difference between API Models and the official Kling API?

API Models offers the same Kling Custom Voice model at 60-95% lower cost through our aggregation platform. We provide a unified API interface, so you do not need separate accounts for each provider: one API key gives access to all models.

What is Kling Custom Voice?

It creates a custom voice from an audio sample: upload 5–30 seconds of clean, single-speaker audio (.mp3/.wav/.mp4/.mov) or reference a historical video ID. The resulting voice can be used in Kling TTS and the Lip-Sync models.
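The sample limits described above can be checked locally before uploading. The sketch below only encodes the constraints stated on this page (file type and a 5–30 second duration). The helper name `check_voice_sample` is illustrative, and the duration is passed in by the caller because measuring it depends on the media tooling you use. It does not check for background noise or multiple speakers.

```python
from pathlib import Path
from typing import List

# Limits as stated on this page.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".mp4", ".mov"}
MIN_SECONDS = 5.0
MAX_SECONDS = 30.0


def check_voice_sample(filename: str, duration_seconds: float) -> List[str]:
    """Return a list of problems; an empty list means the sample fits the limits."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if not (MIN_SECONDS <= duration_seconds <= MAX_SECONDS):
        problems.append(
            f"duration {duration_seconds:.1f}s is outside the 5-30 second range"
        )
    return problems


if __name__ == "__main__":
    print(check_voice_sample("sample.WAV", 12.0))   # fits both limits
    print(check_voice_sample("sample.flac", 3.2))   # fails both limits
```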

How do I use the cloned voice?

Once the voice is cloned, select it in Kling TTS or Kling Lip-Sync TTS to synthesize speech, so that a digital human or narration speaks in your proprietary voice. Then pair the result with lip-sync video.

How does Kling Custom Voice compare to other Audio & Speech models?

On API Models, Kling Custom Voice runs alongside 60+ models on one API key and one balance, so choosing is about fit, not lock-in. It supports custom voice creation, audio upload, and video reference, and its voices can be used in TTS and lip sync. You can weigh it on price and capability against other Audio & Speech models, then switch by changing a single model-name string, with no new account or integration. Browse every Audio & Speech option with live pricing at apimodels.app/models.

What can Kling Custom Voice do?

Kling Custom Voice supports custom voice creation from an audio upload or a video reference, and the resulting voice can be used in TTS and lip sync. See the API Models docs for full parameters and call examples.

Can I access the Kling Custom Voice API from anywhere (incl. China)?

Yes. API Models exposes Kling Custom Voice through a single unified API and one key — no separate provider accounts, and no need to handle each provider's regional network access yourself.

What payment methods are supported?

We support Stripe (Visa, Mastercard, and other international cards) and Alipay. Credits are available instantly after payment.