Lightricks
LTX-2.3 unified text-to-video and image-to-video. Send a prompt for T2V, or add one reference image for I2V — fast, with 480p / 720p / 1080p output. Billed per second by resolution.
MiniMax
Latest high-fidelity TTS by MiniMax (Hailuo). Predicts emotion and intonation from context for ultra-natural, expressive, personalized speech. Supports voice clone and voice design.
MiniMax
Latest fast, cost-effective async TTS by MiniMax (Hailuo). Great quality-to-price ratio for high-volume synthesis. Supports voice clone and voice design.
MiniMax
High-fidelity TTS by MiniMax (Hailuo). Predicts emotion and intonation from context to produce ultra-natural, expressive, personalized speech, built for social media, podcasts, audiobooks, news, education and digital humans. Supports voice clone and voice design.
xAI
Grok Imagine Video 1.5 (Beta) — an alternative RunningHub channel for xAI Grok image-to-video. Turn one reference image into a cinematic clip with an optional prompt. Simple flat per-clip pricing by duration (5 / 8 / 10 / 12 / 15s), 480p or 720p.
xAI
xAI Grok Imagine Video 1.5 Preview — image-to-video with native synchronized audio. #1 on the Image-to-Video Arena, with lifelike motion, strong prompt adherence and consistent characters. 480p / 720p, 1-15s.
Kling
Kling V3 image generation. Text-to-image and single-reference image-to-image, 1K/2K resolution. $0.05 per image.
Kling
Kling V3 Omni image generation. Multi-image reference & fusion, element consistency, single/series output, 1K/2K/4K — 1K/2K $0.05, 4K $0.10 per image.
Omni Flash (Stable): lower-cost, full-suite Gemini Omni video. Text / image (up to 7 refs) / video-to-video, plus reusable voices and consistent characters. 720p / 1080p / 4K, 4 / 6 / 8 / 10s, 16:9 or 9:16, optional seed.
OpenAI
GPT-5.4 with maximum reasoning effort — for the hardest multi-step problems where you want the model to spend most of its budget thinking before answering.
OpenAI
GPT-5.4 with high reasoning effort — for complex debugging, cross-file analysis, and design tradeoffs.
OpenAI
GPT-5.4 with medium reasoning effort — for multi-step refactors and design choices that benefit from a few extra thinking tokens.
OpenAI
GPT-5.4 with low reasoning effort — light reasoning for day-to-day coding tasks; faster and cheaper than higher tiers.
Anthropic
Anthropic's most capable model yet — built to autonomously carry long, complex work end to end. Ideal for big projects, building agents, and high-stakes scenarios demanding top quality and autonomy.
Anthropic
Latest Opus model with 1M context, 128K max output, and adaptive thinking — same tools and platform features as Opus 4.6.
Anthropic
Claude Opus 4.7 with extended thinking explicitly enabled for the most complex reasoning tasks.
GA release. Our most intelligent Flash model — consistent leadership on agentic execution, coding, and long-horizon tasks at scale.
xAI
Multimodal AI image generation from the X platform. Generates high-quality images from text descriptions.
xAI
Upgraded multimodal AI model from the X platform with stronger understanding and finer detail generation for higher-precision images.
ByteDance
ByteDance Seedance 2.0 cinematic video — direct official Volcengine API, stable and high-concurrency. Text, image and multimodal generation with friendly per-second pricing.
Gemini Omni Flash: unified video generator for both text-to-video and image-to-video (1 or 3 reference images). 720p / 1080p / 4K, 4 / 6 / 8 / 10s, optional 16:9 or 9:16 framing. One slug, two modes: drop in a prompt, optionally drop in images.
ByteDance
Film-grade edition of Seedance 2.0 with cinematic lighting, mood and camera motion. It accepts real-person / realistic human reference images (unlike Ark-direct Seedance 2.0, which rejects real faces). Up to 4 reference images for identity-locked image-to-video, ideal for film-grade portrait and character work. The quality tier sits above the Ark variants; generation takes longer (typically 60-180s). Use only with consented subjects.
ByteDance
Seedance 2.0 Fast — direct official Volcengine API, faster and lower-cost. Text, image and multimodal video generation, stable under high concurrency.
ByteDance
ByteDance DreamActor V2 motion transfer. Drive any character image with reference video motion, supporting multi-person, anime and pets.
Kling
Kling AI lip-sync video generation. Frame-level lip synchronization with audio for real humans, 3D and 2D characters.
Kling
Kling text-to-speech synthesis with multi-language support, voice cloning, speed control and emotion styles.
Doubao
Doubao Seedream 5.0 Lite via ByteDance Volcano Ark official API. Unified text-to-image and image-to-image (pass image for I2I, omit for T2I). 2K / 4K output, no watermark, PNG.
Smooth cinematic transitions between a required first frame and required last frame. Outputs 720p or 1080p with native audio. Official stable channel — pricier than V3.1-fast but reliable, ideal for production.
OpenAI
Cheapest OpenAI gpt-image-2 channel. Sync image-to-image edit API with multi-image fusion (up to 10), 1K/2K/4K output and quality control. Flat $0.03 per image.
OpenAI
OpenAI GPT Image 2 (beta channel). Text-to-image and multi-image editing (up to 16 reference images), aspect-ratio controlled output. Independent channel from the primary gpt-image-2 route for redundancy. Flat $0.03 per image.
OpenAI
OpenAI gpt-image-2. Text-to-image and multi-image editing (up to 10 reference images), aspect-ratio control, flat $0.03 per image at Medium/High quality.
Google VEO 3.1 Lite via OpenAI-style /v1/videos API. Reference image support, 4s/6s/8s durations, cost-effective video generation.
Google VEO 3.1 Lite 4K via OpenAI-style /v1/videos API. 4K resolution, reference image support, 4s/6s/8s durations.
Gemini 3 Pro Image via a budget channel. Professional asset creation with advanced reasoning and high-fidelity text rendering.
Gemini 3.1 Flash Image via a budget channel. High-performance image generation optimized for speed and high-volume use.
VEO 3.1 Fast HD (720p) video generation. 8s fixed duration, 16:9 aspect ratio, reference image support.
VEO 3.1 Fast Full HD (1080p) video generation. 8s fixed duration, 16:9 aspect ratio, reference image support.
xAI
Grok Video 3 (alias of grok-video-3, same upstream). Per-second pricing $0.01/s, 6-30s output, T2V + I2V supported.
VEO 3.1 Fast 4K video generation. Requires start frame image. Supports start-end frame video generation.
SparkPix
Sub-1-second text-to-image model built for production use cases. State-of-the-art speed, quality, and text rendering.
SparkPix
Sub-1-second multi-image editing model. Fast, affordable AI image editing with precise prompt adherence and multi-image support.
Pruna AI
Fast video generation in ~10 seconds. Text/image/audio-to-video with draft mode for 4x faster previews. Built-in audio generation, up to 1080p at 48 FPS.
Budget-friendly Gemini 3.1 Flash image generation. Text-to-image and image editing — 1K/2K $0.05, 4K $0.08 per image.
MiniMax
MiniMax M2.5 matches or sets a new state of the art in coding, tool calling, search, and office productivity tasks.
OpenAI
Uses more compute to think deeper and deliver consistently better answers. Supports multi-turn model interactions and advanced API features.
OpenAI
Our frontier model for complex professional work.
Most cost-effective multimodal model with fastest performance for high-frequency lightweight tasks.
Latest Pro model with enhanced reasoning and multimodal capabilities.
Kling
Create custom voice profiles from audio samples. Upload .mp3/.wav/.mp4/.mov (5-30s) or reference a video ID.
Kling
Generate videos with character motion control. Provide a reference image and motion video to create animated content.
Kling
Identify faces in a video and return a session ID and face IDs for Kling lip-sync video generation.
Anthropic
Latest Opus model with ultimate performance and reasoning capabilities.
Anthropic
Claude Opus 4.6 with extended thinking capability for the most complex reasoning tasks.
Anthropic
Latest Sonnet model with best performance and efficiency.
Anthropic
Claude Sonnet 4.6 with extended thinking capability for complex reasoning tasks.
Kling
AI image generation and editing by Kling (omni-image, model kling-image-o1). Supports 1K/2K resolution and multi-image input. $0.05 per image.
Kling
Generate sound effects from text descriptions. 3-10 second audio with natural quality.
Kling
Auto-generate sound effects and background music for videos. Supports ASMR mode for immersive content.
SeedVR
AI image upscaling and enhancement. Upscale images to 2K or 4K resolution with high-quality detail preservation.
Kling
Text-to-speech with multiple voice options. Adjustable speed and multi-language support.
Kling
Kling V3 Omni-Video with extended duration and keep-original-sound support for video editing. Flat $0.15/s billing.
xAI
Trillion-parameter model with 16-Agent cluster collaboration, real-time data processing and self-evolution.
Fast image generation powered by Gemini 3.1 Flash. Supports text-to-image and image editing — 1K/2K $0.05, 4K $0.08 per image.
MiniMax
High-definition async TTS by MiniMax (Hailuo). Rich expressiveness with natural prosody. Supports voice clone and voice design.
Budget-friendly image editing powered by Gemini 3.1 Flash. Image-to-image only — 1K/2K $0.04, 4K $0.07 per image.
MiniMax
Fast and cost-effective async TTS by MiniMax (Hailuo). Supports voice clone, voice design, and pronunciation dictionaries.
xAI
Image generation and editing powered by Grok 4.2. Supports text-to-image creation and image editing with mask inpainting.
Fast and efficient multimodal model. Great for quick responses and simple tasks.
Advanced multimodal reasoning model with superior capabilities.
Gemini 3 Pro with extended thinking capability for complex reasoning tasks.
Kling
Latest Kling V3 video generation. Supports 3-15s flexible duration, text-to-video and image-to-video with optional audio.
Doubao
High-quality Doubao Seedream 4.5 image generation. Supports text-to-image and image editing with 2K/4K resolution.
High-performance start-end frame video. Provide a first frame and an optional last frame; the model interpolates motion between them in seconds. Budget channel: cheaper than the official VEO, less stable.
OpenAI
Powerful model with excellent performance and efficiency.
OpenAI
Snapshot version of GPT-5.1 for reproducible outputs.
Anthropic
Latest Opus model with enhanced capabilities and improved reasoning.
Anthropic
Claude Opus 4.5 with extended thinking capability for the most complex reasoning tasks.
OpenAI
GPT-5 with integrated web search for real-time information.
OpenAI
Snapshot version of GPT-5 Search API for stable deployments.
xAI
Grok Video 3. 10-second video at $0.01/s. Supports text-to-video and image-to-video (up to 7 reference images).
xAI
Grok Video 3. Per-second pricing $0.01/s, 6-30 second output. Both T2V (omit images) and I2V (1-7 reference images) supported.
Anthropic
Fast and affordable model for lightweight tasks. Best for simple queries and quick responses.
Anthropic
Claude Haiku 4.5 with extended thinking capability for complex reasoning tasks.
OpenAI
Versatile model for general-purpose tasks.
Anthropic
Latest Sonnet model with improved performance and efficiency.
Anthropic
Claude Sonnet 4.5 with extended thinking capability for complex reasoning tasks.
Premium image generation powered by Gemini 3 Pro. 99% success rate. Best quality and reliability.
High-quality image generation powered by Gemini 3 Pro. 97% success rate. Supports text-to-image and image editing.
Powerful multimodal model for complex tasks with excellent performance.
Gemini 2.5 Pro with extended thinking capability for complex reasoning.
Fast and cost-effective multimodal model. Best balance of speed and quality.
Gemini 2.5 Flash with extended thinking capability for reasoning tasks.
Fast image generation powered by Gemini 2.5 Flash. Supports text-to-image and image editing with natural language.
OpenAI
Fast reasoning model with efficient output generation.
OpenAI
Advanced reasoning model for complex analytical tasks.
Lightweight and ultra-fast model. Best for simple tasks and high volume.
Anthropic
Most capable model with superior reasoning and analysis capabilities.
Anthropic
Claude Opus 4 with extended thinking capability for the most complex reasoning tasks.
Anthropic
Balanced model with excellent performance and cost efficiency. Great for most tasks.
Anthropic
Claude Sonnet 4 with extended thinking capability for complex reasoning tasks.
OpenAI
Small embedding model, efficient and cost-effective for most use cases.
OpenAI
Large embedding model for higher accuracy and flexible dimensions.