Stable Diffusion XL
Open Weights
Stability AI
Stability AI's flagship open-weights text-to-image model. Pairs a 3.5B-parameter base model with a refiner; the full base-plus-refiner ensemble pipeline totals 6.6B parameters. Native resolution is 1024x1024, twice the side length of SD 1.5's 512x512 (four times the pixels), with improved rendering of limbs, text, and faces and better overall image quality. Uses two text encoders (OpenCLIP ViT-bigG/14 and CLIP ViT-L/14) instead of SD 1.5's single CLIP encoder, for better semantic understanding of prompts. Reported prompt adherence is 89% versus 71% for SD 1.5 (benchmark not specified). Supports image-to-image, inpainting, and outpainting workflows. Runs on consumer GPUs with 8GB of VRAM (RTX 3060 class or better).
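The base-plus-refiner ensemble described above can be sketched with the Hugging Face diffusers library. This is a minimal illustration, not Stability AI's reference code: it assumes the diffusers, torch, and accelerate packages are installed and that the weights are pulled from the stabilityai/stable-diffusion-xl-base-1.0 and stabilityai/stable-diffusion-xl-refiner-1.0 repositories. The helper names (pixel_ratio, latent_shape, split_steps, generate) and the defaults of 40 steps with an 0.8 hand-off fraction are illustrative choices, and split_steps only approximates how the pipeline divides the schedule.

```python
"""Sketch of the SDXL base + refiner ensemble (helper names are hypothetical)."""

SD15_SIDE = 512      # native side length of SD 1.5, in pixels
SDXL_SIDE = 1024     # native side length of SDXL, in pixels
VAE_DOWNSCALE = 8    # the VAE maps each 8x8 pixel block to one latent position
LATENT_CHANNELS = 4  # channels in the latent image


def pixel_ratio(side_a: int, side_b: int) -> float:
    """Ratio of pixel counts between two square images."""
    return (side_a * side_a) / (side_b * side_b)


def latent_shape(height: int, width: int) -> tuple:
    """Shape (channels, h, w) of the latent the denoiser works on."""
    return (LATENT_CHANNELS, height // VAE_DOWNSCALE, width // VAE_DOWNSCALE)


def split_steps(num_steps: int, high_noise_frac: float) -> tuple:
    """Approximate (base_steps, refiner_steps) for a given hand-off fraction."""
    base_steps = round(num_steps * high_noise_frac)
    return base_steps, num_steps - base_steps


def generate(prompt: str, num_steps: int = 40, high_noise_frac: float = 0.8):
    """Run the base model on the high-noise steps, then the refiner on the rest."""
    # Heavy imports are kept local so the helpers above work without a GPU stack.
    import torch
    from diffusers import DiffusionPipeline

    base = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
    )
    # The refiner reuses the base model's second text encoder and VAE.
    refiner = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-refiner-1.0",
        text_encoder_2=base.text_encoder_2, vae=base.vae,
        torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
    )
    # Offloading idle submodules to CPU lowers peak VRAM (requires accelerate).
    base.enable_model_cpu_offload()
    refiner.enable_model_cpu_offload()

    # The base model stops early and hands over latents instead of a decoded image.
    latents = base(
        prompt=prompt, num_inference_steps=num_steps,
        denoising_end=high_noise_frac, output_type="latent",
    ).images
    # The refiner resumes from the same point in the noise schedule.
    return refiner(
        prompt=prompt, num_inference_steps=num_steps,
        denoising_start=high_noise_frac, image=latents,
    ).images[0]
```

Calling generate("an astronaut riding a horse") would download both checkpoints and return a PIL image; the three small helpers run without any model weights and make the resolution arithmetic explicit (1024x1024 has four times the pixels of 512x512, and the denoiser operates on a 4x128x128 latent).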
Strengths
Caveats
Capabilities
Vision
Audio
Video
Tool Use