OpenAI
Lowest-cost OpenAI gpt-image-2 channel, via yunwu.ai. Synchronous API for text-to-image and multi-image editing, with output capped at 1024-class resolution for predictable cost.
OpenAI
OpenAI GPT Image 2 (beta channel via kie.ai). Text-to-image and multi-image editing (up to 16 reference images), aspect-ratio controlled output. NSFW content allowed by default (filter off); pass nsfw_checker=true to enable SFW filtering.
OpenAI
OpenAI gpt-image-2 via RunningHub rhart-image-g-2. Text-to-image and multi-image editing (up to 10 reference images), controlled by aspect ratio.
Google VEO 3.1 Lite via OpenAI-style /v1/videos API. Reference image support, 4s/6s/8s durations, cost-effective video generation.
Google VEO 3.1 Lite 4K via OpenAI-style /v1/videos API. 4K resolution, reference image support, 4s/6s/8s durations.
Gemini 3 Pro Image via GeminiGen channel. Professional asset creation with advanced reasoning and high-fidelity text rendering.
Gemini 3.1 Flash Image via GeminiGen channel. High-performance image generation optimized for speed and high-volume use.
ByteDance
ByteDance Seedance 2 Omni Fast mode. 4-15s flexible duration, multiple aspect ratios, per-second pricing.
ByteDance
ByteDance Seedance 2 Omni Pro mode. 4-15s flexible duration, highest quality, cinematic output.
VEO 3.1 Fast HD (720p) video generation via GeminiGen. 8s fixed duration, 16:9 aspect ratio, reference image support.
VEO 3.1 Fast Full HD (1080p) video generation via GeminiGen. 8s fixed duration, 16:9 aspect ratio, reference image support.
xAI
Official Grok Video 3 via GeminiGen API. Fast generation with customizable resolution, duration (6/10/15s), and reference image support.
RunningHub
VEO 3.1 Fast 4K video generation via RunningHub. Requires a start-frame image. Supports start-end frame video generation.
SparkPix
Sub-1-second text-to-image model built for production use cases. State-of-the-art speed, quality, and text rendering.
SparkPix
Sub-1-second multi-image editing model. Fast, affordable AI image editing with precise prompt adherence and multi-image support.
Pruna AI
Fast video generation in ~10 seconds. Text/image/audio-to-video with a draft mode for 4x faster previews. Built-in audio generation, up to 1080p at 48 FPS.
xAI
Multimodal AI image generation from the X platform. Generates high-quality images from text descriptions.
xAI
Upgraded multimodal AI model from the X platform, with stronger prompt understanding and finer detail for higher-precision images.
Seedance
Hollywood-grade cinematic video generator. Dual-mode text-to-video (T2V) and image-to-video (I2V) with up to 4 reference images. Professional color grading, dramatic lighting, and smooth camera movement.
ByteDance
ByteDance DreamActor V2 motion transfer. Drives any character image with the motion from a reference video. Supports multi-person scenes, anime characters, and pets.
Kling
Kling AI lip-sync video generation. Frame-level lip synchronization with audio for real humans, 3D and 2D characters.
Kling
Kling text-to-speech synthesis with multi-language support, voice cloning, speed control and emotion styles.
Kling
Kling V3 video via Stable-QN channel. Supports text-to-video and image-to-video, 3-15s with optional audio.
Kling
Kling V3 Omni-Video via Stable-QN channel. Multi-modal input with image_list, video_list and keep-original-sound.
Budget-friendly Gemini 3.1 Flash image generation. Supports text-to-image and image editing at lower cost.
MiniMax
MiniMax M2.5 matches or sets a new state of the art in coding, tool calling, search, and office productivity tasks.
Vidu
Fast video generation by Vidu Q3 Turbo. Supports text-to-video, image-to-video, and start-end-frame-to-video, 1-16s, 540p-1080p.
OpenAI
Uses more compute to think deeper and deliver consistently better answers. Supports multi-turn model interactions and advanced API features.
OpenAI
Our frontier model for complex professional work.
Most cost-effective multimodal model with fastest performance for high-frequency lightweight tasks.
Latest Pro model with enhanced reasoning and multimodal capabilities.
Kling
Create custom voice profiles from audio samples. Upload an .mp3, .wav, .mp4, or .mov file (5-30s) or reference a video ID.
Kling
Sync one or multiple faces in a video with custom audio. Supports precise timing control.
Kling
Generate videos with character motion control. Provide a reference image and motion video to create animated content.
Kling
Identify faces in video for advanced lip-sync. Returns session ID and face IDs.
Anthropic
Latest Opus model with ultimate performance and reasoning capabilities.
Anthropic
Claude Opus 4.6 with extended thinking capability for the most complex reasoning tasks.
Anthropic
Latest Sonnet model with best performance and efficiency.
Anthropic
Claude Sonnet 4.6 with extended thinking capability for complex reasoning tasks.
Kling
AI image generation and editing by Kling. Supports 1K/2K resolution and multi-image input for creative editing.
Kling
Generate sound effects from text descriptions. 3-10 second audio with natural quality.
Kling
Kling OmniImage via Stable-QN channel. Text-to-image with reference image support, 1K/2K resolution.
Kling
Auto-generate sound effects and background music for videos. Supports ASMR mode for immersive content.
SeedVR
AI image upscaling and enhancement. Upscales images to 2K or 4K resolution with high-quality detail preservation.
Kling
Text-to-speech with multiple voice options. Adjustable speed and multi-language support.
xAI
Trillion-parameter model with 16-agent cluster collaboration, real-time data processing, and self-evolution.
Fast image generation powered by Gemini 3.1 Flash. Supports text-to-image and image editing with 1K/2K/4K quality.
MiniMax
High-definition async TTS by MiniMax (Hailuo). Rich expressiveness with natural prosody. Supports voice cloning and voice design.
Budget-friendly image editing powered by Gemini 3.1 Flash via RunningHub. Image-to-image only with 1K/2K/4K quality.
MiniMax
Fast and cost-effective async TTS by MiniMax (Hailuo). Supports voice cloning, voice design, and pronunciation dictionaries.
xAI
Image generation and editing powered by Grok 4.2. Supports text-to-image creation and image editing with mask inpainting.
Fast and efficient multimodal model. Great for quick responses and simple tasks.
Advanced multimodal reasoning model with superior capabilities.
Gemini 3 Pro with extended thinking capability for complex reasoning tasks.
Doubao
Doubao Seedream 5.0 Lite, the latest Seedream image generation model. Supports text-to-image, image editing, and multi-image fusion at 2K/3K resolution.
OpenAI
Latest GPT model with advanced reasoning and enhanced capabilities.
OpenAI
GPT-5.2 optimized for conversational interactions.
Doubao
High-quality Doubao Seedream 4.5 image generation. Supports text-to-image and image editing at 2K/4K resolution.
Google VEO 3.1 standard mode. Premium quality with audio generation support. 5s or 8s video output.
OpenAI
Powerful model with excellent performance and efficiency.
OpenAI
Snapshot version of GPT-5.1 for reproducible outputs.
Anthropic
Latest Opus model with enhanced capabilities and improved reasoning.
Anthropic
Claude Opus 4.5 with extended thinking capability for the most complex reasoning tasks.
OpenAI
GPT-5 with integrated web search for real-time information.
OpenAI
Snapshot version of GPT-5 Search API for stable deployments.
xAI
Latest Grok video model with synchronized audio and video generation, 10-second output.
xAI
High-quality 5-second video generation powered by Grok. Supports horizontal and vertical aspect ratios.
Anthropic
Fast and affordable model for lightweight tasks. Best for simple queries and quick responses.
Anthropic
Claude Haiku 4.5 with extended thinking capability for complex reasoning tasks.
OpenAI
Versatile model for general-purpose tasks.
Anthropic
Latest Sonnet model with improved performance and efficiency.
Anthropic
Claude Sonnet 4.5 with extended thinking capability for complex reasoning tasks.
Premium image generation powered by Gemini 3 Pro. 99% success rate. Best quality and reliability.
High-quality image generation powered by Gemini 3 Pro. 97% success rate. Supports text-to-image and image editing.
Powerful multimodal model for complex tasks with excellent performance.
Gemini 2.5 Pro with extended thinking capability for complex reasoning.
Fast and cost-effective multimodal model. Best balance of speed and quality.
Gemini 2.5 Flash with extended thinking capability for reasoning tasks.
Fast image generation powered by Gemini 2.5 Flash. Supports text-to-image and image editing with natural language.
OpenAI
Fast reasoning model with efficient output generation.
OpenAI
Advanced reasoning model for complex analytical tasks.
Lightweight and ultra-fast model. Best for simple tasks and high volume.
Anthropic
Most capable model with superior reasoning and analysis capabilities.
Anthropic
Claude Opus 4 with extended thinking capability for the most complex reasoning tasks.
Anthropic
Balanced model with excellent performance and cost efficiency. Great for most tasks.
Anthropic
Claude Sonnet 4 with extended thinking capability for complex reasoning tasks.
OpenAI
Small embedding model, efficient and cost-effective for most use cases.
OpenAI
Large embedding model for higher accuracy and flexible dimensions.