Overview
Stable Diffusion web UI is a browser interface for running Stable Diffusion models on your own hardware, built by AUTOMATIC1111 using the Gradio library. Instead of writing code, you type prompts and adjust settings in a web page that runs locally, then generate images on your own GPU.
It is aimed at people who want hands-on control over image generation: artists, hobbyists, and developers who prefer to run models locally rather than through a hosted service. The interface covers text-to-image and image-to-image generation, inpainting, outpainting, and upscaling, and supports add-ons like LoRA, textual inversion embeddings, and hypernetworks.
As an image-generation tool in the multimodal space, it acts as a front end for Stable Diffusion checkpoints you supply. It also exposes an API so you can drive generation from other programs, and supports community extensions and custom scripts for extra features.
What it does
- Text-to-image (txt2img) and image-to-image (img2img) generation, plus inpainting and outpainting
- Prompt controls including negative prompts, attention weighting, styles, and prompt editing mid-generation
- Built-in upscalers (RealESRGAN, ESRGAN, SwinIR, LDSR) and face restoration with GFPGAN and CodeFormer
- Support for LoRAs, textual inversion embeddings, hypernetworks, and loading checkpoints in safetensors format
- Highres Fix, X/Y/Z plots, checkpoint merging, and a training tab for embeddings and hypernetworks
- An API for programmatic access, plus support for community extensions and custom scripts
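As a rough illustration of the API and prompt controls listed above, the sketch below builds a text-to-image request and decodes the images in the response. It assumes the web UI was launched with its `--api` option and is reachable at `http://127.0.0.1:7860`, and that the `/sdapi/v1/txt2img` endpoint returns plain base64 strings under `images`. The helper names (`weighted`, `build_txt2img_payload`, `decode_images`, `txt2img`, `main`) are made up for this example and are not part of the project.

```python
import base64
import json
from urllib import request

# Assumption: the web UI runs locally with the API enabled (--api) on the
# default address. Adjust BASE_URL for your setup.
BASE_URL = "http://127.0.0.1:7860"


def weighted(term: str, weight: float) -> str:
    """Format a term with attention-weight syntax, e.g. (term:1.3)."""
    return f"({term}:{weight})"


def build_txt2img_payload(prompt: str, negative_prompt: str = "",
                          steps: int = 20, width: int = 512,
                          height: int = 512) -> dict:
    """Build a JSON-serialisable request body for a txt2img call."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,
        "height": height,
    }


def decode_images(response_body: dict) -> list[bytes]:
    """Decode the base64-encoded images of a response into raw bytes."""
    return [base64.b64decode(img) for img in response_body.get("images", [])]


def txt2img(payload: dict, base_url: str = BASE_URL) -> list[bytes]:
    """POST the payload to /sdapi/v1/txt2img and return the decoded images."""
    req = request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return decode_images(json.load(resp))


def main() -> None:
    """Hypothetical usage: generate images and write them to PNG files."""
    prompt = f"a lighthouse at dusk, {weighted('dramatic sky', 1.3)}"
    payload = build_txt2img_payload(prompt, negative_prompt="blurry")
    for i, data in enumerate(txt2img(payload)):
        with open(f"txt2img_{i}.png", "wb") as f:
            f.write(data)
```

Calling `main()` needs a running server. The payload builder and decoder are kept separate from the network call so they can be checked without one.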
Getting started
You need Python 3.10.6 and git installed first; newer Python versions are not supported. The repo ships launch scripts that set up a virtual environment and download dependencies on first run.
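To check the interpreter before launching, a small script along these lines may help. It is only a sketch: it assumes that any 3.10.x release is acceptable, which is looser than the 3.10.6 named above, and `python_is_supported` is a name invented for this example.

```python
import sys

# Assumption: matching on major.minor (3.10) is enough. The text above names
# 3.10.6 specifically, so tighten this if you need an exact match.
REQUIRED = (3, 10)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter's major.minor equals REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    found = ".".join(map(str, sys.version_info[:3]))
    if python_is_supported():
        print(f"Python {found} looks compatible.")
    else:
        print(f"Python {found} found; the project asks for 3.10.6.")
```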
Install prerequisites
Install Python 3.10.6 (on Windows, check "Add Python to PATH") and git. On Debian-based Linux, install the required system packages with apt:
sudo apt install wget git python3 python3-venv libgl1 libglib2.0-0
Clone the repository
Download the project source with git.
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
Run the web UI
On Windows, run webui-user.bat as a normal (non-administrator) user. On Linux, run webui.sh. The first run creates a virtual environment and downloads dependencies, then prints a local URL to open in your browser.
./webui.sh
Commands and code are distilled from the project's own documentation; always check the official repo for the latest.
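Because the first run can take a while, a script that depends on the UI may want to wait until the local URL responds. The following is a minimal sketch, assuming the server ends up on Gradio's usual default address `http://127.0.0.1:7860`; use the URL that the launch script actually prints if it differs. `is_up` and `wait_until_up` are hypothetical helper names.

```python
import time
from urllib import error, request

# Assumption: default local address. Replace it with the URL printed at launch.
DEFAULT_URL = "http://127.0.0.1:7860"


def is_up(url: str = DEFAULT_URL, timeout: float = 2.0,
          opener=request.urlopen) -> bool:
    """Return True if a GET to the URL succeeds, False on connection errors."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (error.URLError, OSError):
        return False


def wait_until_up(url: str = DEFAULT_URL, attempts: int = 60,
                  delay: float = 5.0, check=is_up) -> bool:
    """Poll the URL until it answers or the attempts run out."""
    for _ in range(attempts):
        if check(url):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until_up()` returns `True` once the page loads and `False` if the server never answers within the allotted attempts.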
When to use it
- Generate images from text prompts locally on your own GPU without sending data to a hosted service
- Edit and extend existing images with inpainting, outpainting, and image-to-image
- Upscale and restore faces in generated or existing images using the built-in upscalers
- Experiment with custom checkpoints, LoRAs, and embeddings, or drive generation through the API
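For the image-editing use case, an image-to-image request through the API might look like the sketch below. It assumes the API is enabled with `--api` on the default local address and that `/sdapi/v1/img2img` accepts base64-encoded input images under `init_images`. The function names are illustrative only.

```python
import base64
import json
from urllib import request

# Assumption: local server with the API enabled on the default address.
BASE_URL = "http://127.0.0.1:7860"


def encode_image(data: bytes) -> str:
    """Base64-encode raw image bytes for use in a JSON request body."""
    return base64.b64encode(data).decode("ascii")


def build_img2img_payload(image_bytes: bytes, prompt: str,
                          denoising_strength: float = 0.5,
                          steps: int = 20) -> dict:
    """Build a request body for an img2img call on one source image."""
    return {
        "init_images": [encode_image(image_bytes)],
        "prompt": prompt,
        # Lower values stay closer to the source image.
        "denoising_strength": denoising_strength,
        "steps": steps,
    }


def img2img(payload: dict, base_url: str = BASE_URL) -> list[bytes]:
    """POST the payload to /sdapi/v1/img2img and return decoded images."""
    req = request.Request(
        f"{base_url}/sdapi/v1/img2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return [base64.b64decode(img) for img in body.get("images", [])]
```

A typical call would read a PNG from disk, pass its bytes to `build_img2img_payload`, and write the bytes returned by `img2img` to a new file.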
How Stable Diffusion web UI (AUTOMATIC1111) compares
The table below lists Stable Diffusion web UI (AUTOMATIC1111) alongside the other open-source image generation tools that AI/TLDR tracks, ranked by GitHub stars.
| Tool | Stars | What it does |
|---|---|---|
| Stable Diffusion web UI (AUTOMATIC1111) | ★ 164k | A browser interface for running Stable Diffusion image generation on your own machine. |
| ComfyUI | ★ 118k | A node-based visual editor for building and running image and video generation pipelines like Stable Diffusion and FLUX locally. |
| Fooocus | ★ 50.4k | A simplified image generation app built on Stable Diffusion that hides technical settings for easy prompting. |
| InvokeAI | ★ 27.5k | A self-hosted creative tool and canvas for generating and editing images with open diffusion models. |
| Stability-AI generative-models | ★ 27.2k | Stability AI's official code for its Stable Diffusion family of image and video generation models. |
| FLUX | ★ 25.6k | Black Forest Labs' open-weight diffusion models and inference code for generating and editing images from text prompts. |
| Z-Image | ★ 11.6k | Alibaba Tongyi's 6B-parameter open image model that produces photorealistic images quickly on a single GPU. |
| DALLE2-pytorch | ★ 11.3k | An open implementation of DALL-E 2 in PyTorch, with the CLIP encoder, diffusion prior, and cascading decoder you train to generate images from text. |