Explore


jaaari / kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
96.5M runs


fottoai / remove-bg-2
Remove image backgrounds with a custom model for better results.
3.9M runs


andreasjansson / clip-features
Return CLIP features for the clip-vit-large-patch14 model
161M runs


aisha-ai-official / animagine-xl-v4-opt
15.5M runs
Featured models

Foundation image model from Krea, tuned for expressive illustration, anime, and painterly styles. Fast and consistent across artistic directions.
8.4K runs
Alibaba's Happy Horse 1.0 generates videos from text prompts or animates a single image into video. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.
25.2K runs

openai / gpt-image-2
OpenAI's state-of-the-art image generation model. Create and edit images from text with strong instruction following, sharp text rendering, and detailed editing.
9.7M runs

Anthropic's most capable model with a step-change improvement in agentic coding, better vision, and stronger multi-step reasoning
116.8K runs

Google's fast, expressive text-to-speech model with 30 voices and 70+ language support
205.9K runs

Generate full-length songs or instrumentals from a text prompt, with optional auto-generated lyrics
14.7K runs

bytedance / seedance-2.0
ByteDance's multimodal video generation model with native audio, multimodal reference inputs, and intelligent duration control.
876.4K runs
prunaai / p-video-avatar
p-video-avatar is the fastest and cheapest avatar/lipsync video model on the market.
77.8K runs

bytedance / seedream-5-lite
Seedream 5.0 lite: image generation with built-in reasoning, example-based editing, and deep domain knowledge
2.7M runs
Generate videos using xAI's Grok Imagine Video model
1.2M runs

The highest fidelity image model from Black Forest Labs
3M runs

Google's fast image generation model with conversational editing, multi-image fusion, and character consistency
12.8M runs
Official models
Official models are always on, maintained, and have predictable pricing.

qwen / qwen3-7-plus
Qwen3.7-Plus is Alibaba's cost-effective multimodal model with vision-language understanding, a 1 million token context window, and strong agentic coding and tool use.
bytedance / seedance-2.0-mini
A lower-cost variant of Seedance 2.0 for high-volume video generation with multimodal inputs and native audio.
alibaba / happyhorse-1.1
Alibaba's Happy Horse 1.1 generates videos from text, animates a single image, or builds a video from multiple reference images. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.

sourceful / riverflow-v2.5-pro
Top-quality agentic image model with multi-step reasoning, candidate scoring, and adjustable thinking effort

sourceful / riverflow-v2.5-fast
Speed-optimized variant of Riverflow 2.5 for production and latency-sensitive workflows

luma / ray-3.2
Luma's reasoning video model. Generates cinematic 5s or 10s video from text or images, with native HDR and EXR export for professional production pipelines.

anthropic / claude-fable-5
Claude Fable 5 from Anthropic: the next generation of intelligence for the hardest knowledge work and coding problems.

ideogram-ai / ideogram-v4-quality
The highest quality Ideogram v4 model. v4 creates images with stunning realism, creative designs, and consistent styles

ideogram-ai / ideogram-v4-balanced
Balance speed, quality and cost. Ideogram v4 creates images with stunning realism, creative designs, and consistent styles
runwayml / aleph-2
Edit one frame to update an entire video. Aleph 2.0 is Runway's in-context video editor: longer clips (up to 30s), multi-shot edits, and image-level precision via keyframe references.
xai / grok-imagine-video-1.5
Image-to-video with synchronized audio using xAI's Grok Imagine Video 1.5 preview model

krea / krea-2-large
Krea's flagship foundation image model. Larger and more flexible than Krea 2 Medium, with particular strength in photorealism and expressive artistic styles.

krea / krea-2-medium
Foundation image model from Krea, tuned for expressive illustration, anime, and painterly styles. Fast and consistent across artistic directions.

anthropic / claude-sonnet-4.6
Claude Sonnet 4.6 from Anthropic: a full upgrade to coding, computer use, long-context reasoning, agent planning, knowledge work, and design, with a 1 million token context window in beta.
bytedance / video-upscaler
Upscale and enhance video up to 4K at 60fps, with scene-aware presets for AI-generated content, short dramas, UGC, and film restoration.

google / gemini-3.5-flash
Google's fast multimodal model with frontier reasoning across agents, coding, and long-context tasks

heygen / avatar-v
Create realistic talking avatar videos from text with HeyGen's Avatar V engine — the newest, highest-quality avatar engine with cross-reference-driven animation.

ibm-granite / granite-vision-4.1-4b
Granite Vision 4.1 4B is a vision-language model (VLM) that delivers frontier-level performance on structured document extraction tasks — chart extraction, table extraction, and semantic key-value pair extraction — in a compact 4B parameter footprint
prunaai / p-video-animate
p-video-animate animates a reference image with the motion and audio of a source video. Optimized for speed and cost — 5.24s per 1s of video.

recraft-ai / recraft-v4.1-utility-pro
A faster, lighter Recraft image generation model at ~2048px resolution, optimized for high-volume production. Design taste and prompt accuracy at high resolution with better throughput.
I want to…
Generate images
Use AI to generate images & photos with an API
Caption videos
Use AI to understand, describe, and caption videos with an API
Generate speech
Use AI for text-to-speech or to clone your voice via API
Generate images from a face
Use AI to generate images from a face with an API
Generate videos
Use AI to generate videos with an API
Upscale images with super resolution
Use AI to upscale and enhance images with an API
Generate music
Use AI to generate music with an API
Edit any image
Use AI to edit any image via API
Transcribe speech to text
Use AI to transcribe speech to text with an API
OCR to extract text from images
Use AI for optical character recognition (OCR) to extract text from images via API
Remove backgrounds
Use AI to remove backgrounds from images and videos with an API
FLUX family of models
FLUX AI models by Black Forest Labs: image generation & editing via API
Restore images
Use AI to restore images via API
Enhance videos
Use AI to upscale, restore, extend, and enhance videos with an API
Detect NSFW content
Detect NSFW content in images and text
Classify text
Classify text by sentiment, topic, intent, or safety
Speaker diarization
Identify speakers from audio and video inputs
Create realistic face swaps
Replace faces across images with natural-looking results.
Turn sketches into images
Transform rough sketches into polished visuals
Generate emojis
Generate custom emojis from text or images
Generate anime-style images and videos
Create anime-style characters, scenes, and animations
Generate videos from images
Use AI to generate videos from images with an API
Vision models
Chat with images — visual Q&A, analysis, and reasoning via API
Caption Images
Use AI to generate captions and descriptions from images with an API
Edit your videos
Use AI to edit, restyle, extend, and remix videos with an API
WAN family of models
WAN family of models: open-source video, image, and audio generation
Create 3D content
Generate 3D objects, meshes, and textures from text or images with an API
Official models
Official models are always on, predictably priced, and have a stable API.
Large Language Models (LLMs)
Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API
Try AI models for free
Try AI models for free: video generation, image generation, upscaling, and photo restoration
Lipsync videos
Use AI to generate lipsync videos with an API
Control image generation
Use AI to control image generation with an API
Embedding models
Embedding models for AI search and analysis
Object detection and segmentation
Use AI object detection and segmentation models to distinguish objects in images & videos
Flux fine-tunes
Flux fine-tunes: build and run custom AI image models via API
Kontext fine-tunes
Kontext fine-tunes: Build custom AI image models with an API
Create songs with voice cloning
Create songs with voice cloning models via API
Media utilities
AI media utilities: auto-caption, watermark, frame extraction & more via API
Qwen-Image fine-tunes
Browse the diverse range of Qwen-Image fine-tunes the community has custom-trained on Replicate.
Latest models

qwen / qwen3-7-plus
Qwen3.7-Plus is Alibaba's cost-effective multimodal model with vision-language understanding, a 1 million token context window, and strong agentic coding and tool use.
14 runs

ultralytics / yolo26-sem
Ultralytics YOLO26 semantic segmentation (Cityscapes), selectable size n/s/m/l/x.
2 runs

ultralytics / yolo26-pose
Ultralytics YOLO26 pose estimation (COCO-Pose), selectable size n/s/m/l/x.
3 runs

ultralytics / yolo26-obb
Ultralytics YOLO26 oriented bounding box detection (DOTAv1), selectable size n/s/m/l/x.
3 runs

ultralytics / yolo26-seg
Ultralytics YOLO26 instance segmentation (COCO-Seg), selectable size n/s/m/l/x.
7 runs

ultralytics / yolo26-cls
Ultralytics YOLO26 image classification (ImageNet), selectable size n/s/m/l/x.
2 runs

bytedance / seedance-2.0-mini
A lower-cost variant of Seedance 2.0 for high-volume video generation with multimodal inputs and native audio.
873 runs

ultralytics / yolo26
Ultralytics YOLO26 object detection (COCO), selectable size n/s/m/l/x.
8 runs


csviverdeia / locateanything-3b-h100
LocateAnything-3B on H100 (Hopper): visual grounding for images and video (boxes/points). Dense detection, pointing, tracker none/sort/reid. MAIN (fast).
58 runs
alibaba / happyhorse-1.1
Alibaba's Happy Horse 1.1 generates videos from text, animates a single image, or builds a video from multiple reference images. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.
754 runs


mertakan-trem / dafwadawdad
36 runs


jilin-dbgr / icetea
4 runs




