Explore – Replicate

Explore

Featured models

krea/krea-2-medium

Foundation image model from Krea, tuned for expressive illustration, anime, and painterly styles. Fast and consistent across artistic directions.

8.4K runs

Official

alibaba/happyhorse-1.0

Alibaba's Happy Horse 1.0 generates videos from text prompts or animates a single image into video. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.

25.2K runs

Official

openai/gpt-image-2

OpenAI's state-of-the-art image generation model. Create and edit images from text with strong instruction following, sharp text rendering, and detailed editing.

9.7M runs

Official

anthropic/claude-opus-4.7

Anthropic's most capable model with a step-change improvement in agentic coding, better vision, and stronger multi-step reasoning

116.8K runs

Official

google/gemini-3.1-flash-tts

Google's fast, expressive text-to-speech model with 30 voices and 70+ language support

205.9K runs

Official

minimax/music-2.6

Generate full-length songs or instrumentals from a text prompt, with optional auto-generated lyrics

14.7K runs

Official

bytedance/seedance-2.0

ByteDance's multimodal video generation model with native audio, multimodal reference inputs, and intelligent duration control.

876.4K runs

Official

prunaai/p-video-avatar

p-video-avatar is the fastest and cheapest avatar/lipsync video model on the market.

77.8K runs

Official

bytedance/seedream-5-lite

Seedream 5.0 lite: image generation with built-in reasoning, example-based editing, and deep domain knowledge

2.7M runs

Official

xai/grok-imagine-video

Generate videos using xAI's Grok Imagine Video model

1.2M runs

Official

black-forest-labs/flux-2-max

The highest fidelity image model from Black Forest Labs

3M runs

Official

google/nano-banana-2

Google's fast image generation model with conversational editing, multi-image fusion, and character consistency

12.8M runs

Official

Official models

Official models are always on, maintained, and have predictable pricing.

View all official models

qwen / qwen3-7-plus

Qwen3.7-Plus is Alibaba's cost-effective multimodal model with vision-language understanding, a 1 million token context window, and strong agentic coding and tool use.

14 runs

Official

bytedance / seedance-2.0-mini

A lower-cost variant of Seedance 2.0 for high-volume video generation with multimodal inputs and native audio.

873 runs

Official

alibaba / happyhorse-1.1

Alibaba's Happy Horse 1.1 generates videos from text, animates a single image, or builds a video from multiple reference images. Supports 720p and 1080p, 3-15 second durations, and five aspect ratios.

754 runs

Official

sourceful / riverflow-v2.5-pro

Top-quality agentic image model with multi-step reasoning, candidate scoring, and adjustable thinking effort

472 runs

Official

sourceful / riverflow-v2.5-fast

Speed-optimized variant of Riverflow 2.5 for production and latency-sensitive workflows

676 runs

Official

luma / ray-3.2

Luma's reasoning video model. Generates cinematic 5s or 10s video from text or images, with native HDR and EXR export for professional production pipelines.

1.7K runs

Official

anthropic / claude-fable-5

Claude Fable 5 from Anthropic: the next generation of intelligence for the hardest knowledge work and coding problems.

1.9K runs

Official

ideogram-ai / ideogram-v4-quality

The highest quality Ideogram v4 model. v4 creates images with stunning realism, creative designs, and consistent styles

11.1K runs

Official

ideogram-ai / ideogram-v4-balanced

Balance speed, quality and cost. Ideogram v4 creates images with stunning realism, creative designs, and consistent styles

4.5K runs

Official

runwayml / aleph-2

Edit one frame to update an entire video. Aleph 2.0 is Runway's in-context video editor: longer clips (up to 30s), multi-shot edits, and image-level precision via keyframe references.

919 runs

Official

xai / grok-imagine-video-1.5

Image-to-video with synchronized audio using xAI's Grok Imagine Video 1.5 preview model

92.8K runs

Official

krea / krea-2-large

Krea's flagship foundation image model. Larger and more flexible than Krea 2 Medium, with particular strength in photorealism and expressive artistic styles.

1.7K runs

Official

krea / krea-2-medium

Foundation image model from Krea, tuned for expressive illustration, anime, and painterly styles. Fast and consistent across artistic directions.

8.4K runs

Official

anthropic / claude-sonnet-4.6

Claude Sonnet 4.6 from Anthropic: a full upgrade to coding, computer use, long-context reasoning, agent planning, knowledge work, and design, with a 1 million token context window in beta.

11K runs

Official

bytedance / video-upscaler

Upscale and enhance video up to 4K at 60fps, with scene-aware presets for AI-generated content, short dramas, UGC, and film restoration.

5.3K runs

Official

google / gemini-3.5-flash

Google's fast multimodal model with frontier reasoning across agents, coding, and long-context tasks

105.6K runs

Official

heygen / avatar-v

Create realistic talking avatar videos from text with HeyGen's Avatar V engine — the newest, highest-quality avatar engine with cross-reference-driven animation.

233 runs

Official

ibm-granite / granite-vision-4.1-4b

Granite Vision 4.1 4B is a vision-language model (VLM) that delivers frontier-level performance on structured document extraction tasks — chart extraction, table extraction, and semantic key-value pair extraction — in a compact 4B parameter footprint

11.6K runs

Official

prunaai / p-video-animate

p-video-animate animates a reference image with the motion and audio of a source video. Optimized for speed and cost — 5.24s per 1s of video.

5.8K runs

Official

recraft-ai / recraft-v4.1-utility-pro

A faster, lighter Recraft image generation model at ~2048px resolution, optimized for high-volume production. Design taste and prompt accuracy at high resolution with better throughput.

1.4K runs

Official

I want to…

View all collections

Generate images

Use AI to generate images & photos with an API

Caption videos

Use AI to understand, describe, and caption videos with an API

Generate speech

Use AI for text-to-speech or to clone your voice via API

Generate images from a face

Use AI to generate images from a face with an API

Generate videos

Use AI to generate videos with an API

Upscale images with super resolution

Use AI to upscale and enhance images with an API

Generate music

Use AI to generate music with an API

Edit any image

Use AI to edit any image via API

Transcribe speech to text

Use AI to transcribe speech to text with an API

OCR to extract text from images

Use AI for optical character recognition (OCR) to extract text from images via API

Remove backgrounds

Use AI to remove backgrounds from images and videos with an API

FLUX family of models

FLUX AI models by Black Forest Labs: image generation & editing via API

Restore images

Use AI to restore images via API

Enhance videos

Use AI to upscale, restore, extend, and enhance videos with an API

Detect NSFW content

Detect NSFW content in images and text

Classify text

Classify text by sentiment, topic, intent, or safety

Speaker diarization

Identify speakers from audio and video inputs

Create realistic face swaps

Replace faces across images with natural-looking results.

Turn sketches into images

Transform rough sketches into polished visuals

Generate emojis

Generate custom emojis from text or images

Generate anime-style images and videos

Create anime-style characters, scenes, and animations

Generate videos from images

Use AI to generate videos from images with an API

Vision models

Chat with images — visual Q&A, analysis, and reasoning via API

Caption images

Use AI to generate captions and descriptions from images with an API

Edit your videos

Use AI to edit, restyle, extend, and remix videos with an API

WAN family of models

WAN family of models: open-source video, image, and audio generation

Create 3D content

Generate 3D objects, meshes, and textures from text or images with an API

Official models

Official models are always on, predictably priced, and have a stable API.

Large Language Models (LLMs)

Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API

Try AI models for free

Try AI models for free: video generation, image generation, upscaling, and photo restoration

Lipsync videos

Use AI to generate lipsync videos with an API

Control image generation

Use AI to control image generation with an API

Embedding models

Embedding models for AI search and analysis

Object detection and segmentation

Use AI object detection and segmentation models to distinguish objects in images & videos

Flux fine-tunes

Flux fine-tunes: build and run custom AI image models via API

Kontext fine-tunes

Kontext fine-tunes: build custom AI image models with an API

Create songs with voice cloning

Create songs with voice cloning models via API

Media utilities

AI media utilities: auto-caption, watermark, frame extraction & more via API

Qwen-Image fine-tunes

Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate.

Latest models

qwen / qwen3-7-plus

Qwen3.7-Plus is Alibaba's cost-effective multimodal model with vision-language understanding, a 1 million token context window, and strong agentic coding and tool use.

14 runs

Official

ultralytics / yolo26-sem

Ultralytics YOLO26 semantic segmentation (Cityscapes), selectable size n/s/m/l/x.

2 runs

ultralytics / yolo26-pose

Ultralytics YOLO26 pose estimation (COCO-Pose), selectable size n/s/m/l/x.

3 runs

ultralytics / yolo26-obb

Ultralytics YOLO26 oriented bounding box detection (DOTAv1), selectable size n/s/m/l/x.

3 runs

ultralytics / yolo26-seg

Ultralytics YOLO26 instance segmentation (COCO-Seg), selectable size n/s/m/l/x.

7 runs

ultralytics / yolo26-cls

Ultralytics YOLO26 image classification (ImageNet), selectable size n/s/m/l/x.

2 runs

bytedance / seedance-2.0-mini

A lower-cost variant of Seedance 2.0 for high-volume video generation with multimodal inputs and native audio.

873 runs

Official

ultralytics / yolo26

Ultralytics YOLO26 object detection (COCO), selectable size n/s/m/l/x.

8 runs

csviverdeia / locateanything-3b-h100

LocateAnything-3B on H100 (Hopper): visual grounding for image and video (boxes/points). Dense detection, pointing, tracker none/sort/reid. MAIN (fast).

58 runs

alibaba / happyhorse-1.1

754 runs

Official

mertakan-trem / dafwadawdad

36 runs

jilin-dbgr / icetea

4 runs

FLUX.2 [pro]

Black Forest Labs' most advanced image generation model yet.

How to prompt Grok Imagine Video 1.5

How to make remarkable videos with Seedance 2.0

How to prompt Seedream 5.0
