
kling-identify-face
Kling Face Recognition detects faces in a video: pass a videoUrl or videoId and it returns a sessionId plus a list of faceIds. Those IDs feed Kling Lip-Sync Video. Face recognition is the first step of the lip-sync flow, letting you target exactly which face to sync in a multi-person clip before aligning audio to that face.
Auto-detect all faces in a video
Returns sessionId + faceId for lip-sync
Accepts videoUrl or videoId
$0.01 per call
Kling lip-sync is a 3-step flow — run them in order; intermediate values carry forward automatically.
Upload or paste a public video URL; recognition returns sessionId + faceId.
Trimmed audio must be ≥2s; the insert window must overlap the face window by ≥2s.
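The two timing rules above can be checked locally before submitting a lip-sync request. The following is a minimal sketch, not part of the API: the function and parameter names are hypothetical, all times are assumed to be in seconds, and the "insert window" is assumed to start at the insert position and last as long as the trimmed audio.

```python
MIN_SECONDS = 2.0  # minimum stated on this page for both rules


def overlap_seconds(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time windows, 0.0 if they are disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def check_lip_sync_timing(audio_start, audio_end, insert_start, face_start, face_end):
    """Return a list of human-readable problems; an empty list means both rules pass.

    audio_start/audio_end: trim points of the audio clip (hypothetical names).
    insert_start: where the trimmed audio is placed on the video timeline (assumption).
    face_start/face_end: window in which the chosen face is visible (assumption).
    """
    problems = []
    audio_len = audio_end - audio_start
    if audio_len < MIN_SECONDS:
        problems.append("trimmed audio is shorter than 2s")
    insert_end = insert_start + audio_len
    if overlap_seconds(insert_start, insert_end, face_start, face_end) < MIN_SECONDS:
        problems.append("insert window overlaps the face window by less than 2s")
    return problems
```

For example, a 3-second clip inserted at 1s against a face visible from 0s to 5s passes both checks, while the same clip inserted at 4s overlaps the face window by only 1s and is flagged.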
Kling Face Recognition is a Video Generation API provided by Kling. It is the first step of the Kling Lip-Sync Video flow: it detects the faces in a video and returns the sessionId and faceIds that the lip-sync step needs, so you can target a specific face in a multi-person clip. Through the API Models platform, you can access this model via a unified API at prices significantly lower than official rates. Current pricing: $0.01 per call.
Quickly generate brand promotion videos for ad campaigns and social media marketing.
Create compelling short-form video content for platforms like TikTok, Instagram, and YouTube.
Generate product feature demonstrations and tutorials to improve user conversion.
Produce course explanations, knowledge explainers, and training videos at low cost.
Kling Face Recognition is available through API Models at $0.01 per call. This is up to 95% cheaper than official pricing.
Sign up at API Models, get your API key, and call our unified API endpoint. We provide detailed API documentation with code examples in cURL, Python, and Node.js.
API Models offers the same Kling Face Recognition model at 60–95% lower cost through our aggregation platform. We provide a unified API interface, so you do not need separate accounts for each provider: one API key gives access to all models.
It detects faces in a video: pass a videoUrl or videoId and it returns a sessionId plus a list of faceIds. Those IDs feed Kling Lip-Sync Video — identify the face you want to lip-sync, then sync that specific face.
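As an illustration of the "videoUrl or videoId" input, here is a minimal sketch of how a client might assemble the request body. The parameter names videoUrl and videoId and the model slug kling-identify-face come from this page; the surrounding envelope (the "model" and "input" keys) and the rule that exactly one of the two must be supplied are assumptions, so check the API Models docs for the actual request schema.

```python
MODEL_SLUG = "kling-identify-face"  # slug shown on this page


def build_face_recognition_request(video_url=None, video_id=None):
    """Build a request body for face recognition.

    Assumption: exactly one of video_url / video_id is supplied.
    Assumption: the body is shaped as {"model": ..., "input": {...}};
    the real envelope may differ.
    """
    if (video_url is None) == (video_id is None):
        raise ValueError("pass exactly one of video_url or video_id")
    if video_url is not None:
        video_input = {"videoUrl": video_url}
    else:
        video_input = {"videoId": video_id}
    return {"model": MODEL_SLUG, "input": video_input}
```

Validating the input on the client side avoids paying for a call that would be rejected for a missing or ambiguous video reference.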
It's the first step of lip-sync: detect faces → get a faceId → specify that faceId plus audio in Kling Lip-Sync Video to produce the synced clip. In multi-person videos it lets you target exactly which face to lip-sync.
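The hand-off from detection to lip-sync can be sketched as picking one faceId out of the recognition result and carrying it forward with the sessionId. The response shape used here, a dict with a "sessionId" string and a "faceIds" list, is an assumption based on the description above, and the helper name is hypothetical; the audio parameters of Kling Lip-Sync Video are not documented on this page, so they are left out.

```python
def select_lip_sync_target(recognition_result, face_index=0):
    """Pick one detected face and return the IDs the lip-sync step needs.

    Assumption: recognition_result looks like
    {"sessionId": "...", "faceIds": ["...", "..."]}.
    """
    session_id = recognition_result["sessionId"]
    face_ids = recognition_result["faceIds"]
    if not face_ids:
        raise ValueError("no faces were detected in the video")
    if not 0 <= face_index < len(face_ids):
        raise IndexError(f"face_index must be between 0 and {len(face_ids) - 1}")
    return {"sessionId": session_id, "faceId": face_ids[face_index]}


# Hypothetical result for a two-person clip; the IDs are made up.
sample_result = {"sessionId": "sess-123", "faceIds": ["face-a", "face-b"]}
target = select_lip_sync_target(sample_result, face_index=1)
```

In a multi-person video, face_index is how the caller chooses which detected face to sync; the returned sessionId and faceId would then be passed, together with the audio, to Kling Lip-Sync Video.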
On API Models, Kling Face Recognition runs alongside 60+ models on one API key and one balance, so choosing is about fit, not lock-in. It supports Face Detection, Video Input, Session ID, Face ID List, and you can weigh it on price and capability against other Video Generation models, then switch by changing a single model-name string — no new account or integration. Browse every Video Generation option with live pricing at apimodels.app/models.
Kling Face Recognition supports: Face Detection, Video Input, Session ID, Face ID List. See the API Models docs for full parameters and call examples.
Yes. API Models exposes Kling Face Recognition through a single unified API and one key — no separate provider accounts, and no need to handle each provider's regional network access yourself.
We support Stripe (Visa, Mastercard, and other international cards) and Alipay. Credits are available instantly after payment.