llm.info

Stable Diffusion XL

Open Weights

Stability AI

Stability AI's flagship open-weights text-to-image generation model. It pairs a 3.5B-parameter base model with a refiner model, for a 6.6B-parameter ensemble pipeline in total. Native resolution is 1024x1024, twice the side length and four times the pixel count of SD 1.5's 512x512, with improved rendering of limbs, text, and faces and better overall image quality. It uses two text encoders (OpenCLIP ViT-bigG/14 and CLIP ViT-L/14) rather than SD 1.5's single CLIP encoder, for better semantic understanding of prompts. Reported prompt adherence is 89%, versus 71% for SD 1.5 (benchmark not specified). Supports image-to-image, inpainting, and outpainting workflows. Runs on consumer hardware (RTX 3060 or better, with 8GB of VRAM).
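The resolution comparison with SD 1.5 can be read per side or per pixel. A minimal sketch of that arithmetic, assuming SD 1.5's native resolution is 512x512 (the helper name `pixel_count` is illustrative only):

```python
def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in a width x height image."""
    return width * height


SDXL_SIDE = 1024  # native SDXL side length, from the description above
SD15_SIDE = 512   # assumption: SD 1.5 native side length

# Ratio of side lengths versus ratio of total pixels.
side_ratio = SDXL_SIDE / SD15_SIDE
pixel_ratio = pixel_count(SDXL_SIDE, SDXL_SIDE) / pixel_count(SD15_SIDE, SD15_SIDE)

print(f"side ratio: {side_ratio:.0f}x, pixel ratio: {pixel_ratio:.0f}x")
# prints "side ratio: 2x, pixel ratio: 4x"
```

The fourfold pixel count, rather than the twofold side length, is the figure that matters when estimating memory and compute per image.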

Capabilities

Vision
Audio
Video
Tool Use
