What Is Stable Diffusion — Open-Source Image AI: How It Works, Running Locally, and Commercial Licensing

Contents

1. August 22, 2022 — The Day Image AI Became Something You Could Download
2. What Is Stable Diffusion — In Three Lines
3. Version Lineage — SD1.5 / SDXL / SD3.5 and the FLUX Split
4. The Reality of Running It Locally — By VRAM Tier
5. The License Trap — Lessons from the SD3 Backlash
6. Civitai / LoRA / ComfyUI — An Ecosystem Bigger Than the Model
7. Midjourney vs Stable Diffusion — Which to Pick
8. Three Pitfalls — Copyright, NSFW, Compatibility
Summary
FAQ

On August 22, 2022, the London startup Stability AI released the weight file for an image generation model called Stable Diffusion v1.4. A single 4GB `.ckpt` file. The moment it hit GitHub and Hugging Face, "image generation AI" went from something behind the cloud to software you could download to your own PC. Neither Midjourney nor DALL·E 2 would do that at the time.

Almost four years on, Stable Diffusion has reached SD 3.5 Large (8.1 billion parameters), and Civitai hosts over 100,000 custom models and LoRAs. Meanwhile, the licensing blowback around SD3's release drove much of the community away, and FLUX, built by Black Forest Labs (the new company founded by members of the original SD team), has overtaken the parent in quality. The picture is no longer simple.

My stance up front. If "Midjourney is fine" works for you, don't force yourself into Stable Diffusion. But if any of these apply — "I want to keep the same character consistent across 100 images," "I want to mix in my own confidential data locally," "I want my monthly cost to be $0," "I need an open model I can disclose for commercial work" — then SD is unavoidable. This article covers how SD works, its version history, hardware requirements, licensing, ecosystem, and how to choose, all as of May 2026.

Stable Diffusion · Open-Source Image AI

Four Things That Make It Different: What Midjourney, DALL·E, and Firefly will never give you

① OPEN WEIGHTS: Weight files are distributed. Download .safetensors directly from Hugging Face. Midjourney doesn't even expose an API.
② LOCAL FIRST: Runs on your own GPU. Practical from RTX 3060 (12GB) up. Generated data stays on your machine.
③ FINE-TUNE: Modify freely with LoRA. 100,000+ LoRAs and custom models on Civitai: anime, photoreal, specific characters, anything.
④ ZERO COST: Free beyond electricity. After the upfront GPU, every image is $0. Commercial use is also OK with conditions.

In other words, this is the image AI for people who want freedom from cloud dependence, black boxes, and monthly subscriptions.
The price you pay in return: a GPU, setup time, and prompt trial-and-error.

1. August 22, 2022 — The Day Image AI Became Something You Could Download

At the time, the image generation AI scene was a two-horse race: OpenAI's DALL·E 2 (invite-only beta) and Midjourney V3 (Discord-only). Both were cloud-only, and both kept their weights completely hidden. What their AI learned, how it ran, what it could and couldn't generate — all of it was at the vendor's discretion.

Then Stability AI made a choice nobody expected: release the weight file itself. A diffusion model trained on subsets of LAION-5B (5.8 billion image-text pairs), inference code under MIT, weights under CreativeML Open RAIL-M (commercial use OK, almost completely free). Within days, engineers worldwide had it running in Google Colab and a local WebUI (AUTOMATIC1111) was born. Civitai launched later that year, and the personalization of AI art took off.

The remarkable thing wasn't the technical leap so much as the precedent: "image generation AI is something individuals can own and modify." If you want an LLM analogy, the shock was close to Llama 2 and Llama 3 dropping with "commercial use OK." Ever since, the image AI industry has run two parallel tracks: "closed and high quality" (MJ/DALL·E) and "open and freely customizable" (the SD family).

2. What Is Stable Diffusion — In Three Lines

Stable Diffusion is an open-weight, diffusion-model-based image generation AI released by Stability AI. Three-line breakdown:

① HOW IT WORKS: Starts from random noise (in the VAE's compressed latent space, not full-size pixels), then gradually denoises it to match your text prompt. Takes 20–50 steps.
② ARCHITECTURE: A three-part stack: a Text Encoder (CLIP/T5) that interprets the prompt, a U-Net/DiT that does the denoising, and a VAE that compresses/decompresses the image.
③ DISTRIBUTION: Weight files (.safetensors, 2GB–16GB) are freely downloadable from Hugging Face. Run them on a local GPU or via cloud inference services.

The thing I think actually matters is what "diffusion model" means in plain terms. In the GAN era (StyleGAN and friends), a generator and a discriminator fought each other to produce images. Diffusion models took a different path: "start from a noisy image and gradually subtract noise." A simpler idea — but it turned out to produce far more stable, high-resolution output than GANs. That insight is the core of SD's success, and almost every image AI since (Imagen, DALL·E 3, FLUX) is also a diffusion model.
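To make "start from noise and gradually subtract it" concrete, here is a toy sketch in plain Python. It is not how Stable Diffusion is implemented: the trained U-Net/DiT is replaced by a hypothetical `oracle_noise_estimate` that knows the answer (the "dataset" is a single target vector), the cosine schedule in `alpha_bar` is an assumption, and the update is a DDIM-style deterministic step. Only the shape of the loop carries over to the real thing.

```python
import math
import random

def alpha_bar(t: int, steps: int) -> float:
    """Fraction of signal remaining at step t (1.0 at t=0, near 0 at t=steps)."""
    return math.cos(0.5 * math.pi * t / (steps + 1)) ** 2

def oracle_noise_estimate(x_t, target, ab):
    """Stand-in for the trained denoiser: the exact noise when the data is one known point."""
    return [(x - math.sqrt(ab) * c) / math.sqrt(1.0 - ab) for x, c in zip(x_t, target)]

def denoise(target, steps=30, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure Gaussian noise
    for t in range(steps, 0, -1):
        ab_t, ab_prev = alpha_bar(t, steps), alpha_bar(t - 1, steps)
        eps = oracle_noise_estimate(x, target, ab_t)
        # Estimate the clean sample implied by the current noise estimate ...
        x0_hat = [(xi - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t) for xi, e in zip(x, eps)]
        # ... then re-noise it to the slightly less noisy level of step t-1.
        x = [math.sqrt(ab_prev) * x0 + math.sqrt(1.0 - ab_prev) * e for x0, e in zip(x0_hat, eps)]
    return x

result = denoise([0.2, -0.7, 1.0], steps=30, seed=1)
```

In a real model the oracle is a neural network conditioned on the text prompt, and `x` is a latent tensor that the VAE decodes into pixels at the end.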

3. Version Lineage — SD1.5 / SDXL / SD3.5 and the FLUX Split

The most confusing thing about SD's history is "which version should I actually use?" Each generation differs in performance, license, recommended GPU, and LoRA ecosystem. Let's lay it out.

Version	Released	Parameters	Recommended VRAM	Characteristics
SD 1.5	Oct 2022	0.9B	4–8GB	Lightest, most LoRAs, strongest on anime. Still mainstream on Civitai
SD 2.x	Nov 2022	0.9B	6–8GB	Effectively skip. Reduced training data, poor reception, never caught on
SDXL 1.0	Jul 2023	3.5B	8–12GB	1024×1024 standard. The go-to for photoreal and commercial design. Second-largest LoRA pool
SD 3 Medium	Jun 2024	2B	8–12GB	License blowback caused developer exodus. Widely seen as a failure
SD 3.5 Medium	Oct 2024	2.5B	9.9GB	Redemption for SD3. MMDiT-X architecture, designed for consumer PCs
SD 3.5 Large	Oct 2024	8.1B	18GB (11GB in FP8)	The flagship quality. Aimed at RTX 4090 class
FLUX.1 dev	Aug 2024	12B	12–24GB	From Black Forest Labs, founded by ex-SD developers. Widely rated above SD itself

Bottom line: if you're starting today, it's a two-way pick between SDXL and FLUX.1 dev. SD 1.5 is light and has the most LoRAs, but it's a generation behind on quality. SD 3.5 Large is heavy yet pushed around by FLUX. The practical sorting is: SDXL for commercial design, FLUX for top quality, SD 3.5 Medium for the lightest viable local setup.
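The VRAM column in the table roughly tracks parameter count times bytes per weight. A back-of-the-envelope sketch, assuming 2 bytes per weight for FP16 and 1 byte for FP8; it counts the main model's weights only, so text encoders, the VAE, and activations come on top. The function and dictionary names are illustrative.

```python
BYTES_PER_WEIGHT = {"fp32": 4, "fp16": 2, "fp8": 1}

# Parameter counts in billions, taken from the version table above.
PARAMS_BILLION = {
    "SD 1.5": 0.9,
    "SDXL 1.0": 3.5,
    "SD 3.5 Medium": 2.5,
    "SD 3.5 Large": 8.1,
    "FLUX.1 dev": 12.0,
}

def weight_gb(params_billion: float, precision: str = "fp16") -> float:
    """GB needed to hold the weights alone (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_WEIGHT[precision]

# Weight footprint of every model in the table at FP16.
fp16_footprint = {name: weight_gb(p, "fp16") for name, p in PARAMS_BILLION.items()}
```

For SD 3.5 Large this gives about 16.2 GB at FP16 and 8.1 GB at FP8, which lines up with the table's 18GB and 11GB once overhead is added.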

FLUX's arrival has an ironic backstory. Much of the original SD research team left Stability AI in 2024, set up Black Forest Labs in Germany, and launched FLUX.1 that August, weeks after the SD3 licensing fiasco (more below). "A higher-quality SD successor," coming from the people who built SD in the first place. Plenty of people in the community now see FLUX, not SD 3.5, as the rightful heir.

4. The Reality of Running It Locally — By VRAM Tier

"Runs locally" is one thing; what your specific PC can actually do is another. Here's what I've seen in practice.

4–6GB (GTX 1660 / RTX 3050): Barely-works tier. SD 1.5 only. 20–60 sec per image. SDXL and above are rough.
8GB (RTX 3060 Ti / 4060): Minimum practical line. SDXL runs with memory optimization. 15–30 sec per 1024px image.
12GB (RTX 3060 12GB / 4070): Comfortable tier. SDXL/SD 3.5 Medium with headroom. Stack LoRAs freely. 5–15 sec per image.
16–24GB (RTX 4080 / 4090): Serious production setup. FLUX/SD 3.5 Large with headroom. You can train your own LoRAs. 2–8 sec per image.

Note: 16GB+ system RAM and 100GB+ of free SSD space are also needed. Mac runs via Apple Silicon's MPS but is 3–5× slower than NVIDIA.

No sugarcoating: if you want to seriously touch SD today, the realistic entry points are an RTX 3060 12GB (around $200 used) or an RTX 4070 (around $600 new). 8GB GPUs work, but you're walking into a swamp of optimization flags and quantization — not what I'd recommend to a beginner. If you don't want to buy a GPU, the right move is cloud inference services (Runpod / Replicate / Civitai's own hosting) at roughly $0.001–$0.01 per image.
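If you would rather script it than click through a WebUI, a minimal sketch with Hugging Face's diffusers library looks roughly like this. It assumes `torch`, `diffusers`, and `accelerate` are installed and an NVIDIA GPU is present. The 12GB cutoff in `needs_cpu_offload` is my own rule of thumb drawn from the tiers above, not a library setting, and `generate` is a hypothetical helper name.

```python
def needs_cpu_offload(vram_gb: float) -> bool:
    """Below roughly 12GB, keep idle sub-models in system RAM (illustrative threshold)."""
    return vram_gb < 12

def generate(prompt: str, vram_gb: float, out_path: str = "out.png", steps: int = 30) -> str:
    # Heavy imports live inside the function so this file loads without a GPU stack.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,   # half precision roughly halves weight memory
        variant="fp16",
        use_safetensors=True,
    )
    if needs_cpu_offload(vram_gb):
        pipe.enable_model_cpu_offload()  # slower, but fits smaller cards; needs accelerate
    else:
        pipe.to("cuda")

    image = pipe(prompt=prompt, num_inference_steps=steps).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate("a lighthouse at dusk, 35mm photo", vram_gb=8)` downloads the SDXL base weights on first run (several GB) and writes `out.png`. Timings will vary with your card, so take the per-image figures above as ballparks.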

5. The License Trap — Lessons from the SD3 Backlash

"It's open source, so commercial use is fine" is not the simple statement people want it to be with SD. The license depends on the version.

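SD 1.5 / SDXL: CreativeML Open RAIL-M (SD 1.5) / CreativeML Open RAIL++-M (SDXL)
No revenue cap. Commercial use is almost entirely free. The only restrictions concern illegal or harmful use.

SD 3 / SD 3.5: Stability AI Community License (with $1M revenue cap)
Individuals and organizations under $1M in annual revenue can use it commercially. Above that, an Enterprise license is required.

FLUX.1 dev: FLUX.1 [dev] Non-Commercial License
The weights are released for non-commercial use; using the model itself commercially requires a license from Black Forest Labs. FLUX.1 schnell, the faster sibling, is Apache 2.0.

Under the Stability AI terms, individual bloggers, freelancers, and early-stage startups are all clear. A commercial agreement is only needed when a larger company embeds the model in a product. Selling the generated images themselves is unlimited: no matter how many you generate or sell, you owe Stability AI nothing.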

When SD 3 Medium dropped in June 2024, its license was restrictive and ambiguous enough (a paid Creator License with a monthly image cap, unclear terms for derivative models) that Civitai temporarily stopped hosting SD3 and its derivatives. The community declared "SD is dead," and many developers moved to FLUX when Black Forest Labs shipped it that August. Stability AI loosened the terms in July 2024 with the Community License (the current $1M revenue version) and released SD 3.5 under it in October, but as of May 2026, community trust has not fully recovered.

Practical advice: "Just use SDXL" is the option that bites least. Its CreativeML Open RAIL++-M license has no revenue cap, the LoRA pool is huge, and the ecosystem is mature. Move to SD 3.5 or FLUX only when SDXL stops being enough.
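The revenue-cap rule for the Stability AI models is simple enough to write down as a lookup. This is only a sketch of the summary in this section, not legal advice. The table and function names are hypothetical, and FLUX.1 dev is left out because it ships under Black Forest Labs' own license rather than Stability AI's.

```python
# Annual-revenue caps in USD as summarized in this section; None means no cap.
REVENUE_CAP_USD = {
    "SD 1.5": None,
    "SDXL 1.0": None,
    "SD 3.5 Medium": 1_000_000,
    "SD 3.5 Large": 1_000_000,
}

def needs_enterprise_license(model: str, annual_revenue_usd: float) -> bool:
    """True when the model has a cap and the organization's revenue reaches it."""
    cap = REVENUE_CAP_USD[model]
    return cap is not None and annual_revenue_usd >= cap

# A freelancer earning $80k a year on SD 3.5 Large stays under the cap.
freelancer_ok = not needs_enterprise_license("SD 3.5 Large", 80_000)
```

Always read the current license text before shipping, since terms have changed between releases.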

6. Civitai / LoRA / ComfyUI — An Ecosystem Bigger Than the Model

Talking about Stable Diffusion as "just the model" misses the point. SD's strength is the surrounding ecosystem.

Civitai (model distribution hub): 100,000+ checkpoints, LoRAs, embeddings. Anime, photoreal, specific characters, specific poses: anything.
LoRA (add-on training file): Small 50–300MB files that add a style or character to a base model. Stack them to combine effects.
ComfyUI (node-based UI): The pro's choice. Build complex workflows visually (ControlNet → upscale → Inpaint chains, etc.).
A1111 (beginner-friendly WebUI): AUTOMATIC1111's project. Form-based and intuitive. How most SD users first got in.
ControlNet (composition control): Specify composition with a pose image, line drawing, or depth map. Midjourney has no equivalent at this precision.
IP-Adapter (image reference): Copy a reference image's style, face, or outfit onto a new image. Essential for character consistency.

One caveat. SD 1.5 LoRAs don't load on SDXL; SDXL LoRAs don't load on FLUX. Each base model is its own ecosystem. If the LoRAs you love on Civitai are all SD 1.5, switching to SDXL means abandoning them. When searching on Civitai, always check the "Base Model" filter. To understand how these add-ons actually work, see what LoRA is.
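You can often check a downloaded LoRA's base model without loading it. A .safetensors file starts with an 8-byte little-endian header length followed by a JSON header, which may carry an optional `__metadata__` dictionary. The sketch below reads that header with the standard library. The metadata key `ss_base_model_version` is an assumption modeled on what some training tools write; many files carry no such key, in which case the check returns False and you fall back to the Civitai page.

```python
import json
import struct

def read_safetensors_metadata(path: str) -> dict:
    """Return the optional __metadata__ dict from the header, without loading any tensors."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # first 8 bytes: header size
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})

def lora_matches_base(lora_path: str, base_family: str,
                      key: str = "ss_base_model_version") -> bool:
    """True only if the file declares a base model and it starts with base_family."""
    declared = read_safetensors_metadata(lora_path).get(key)
    return declared is not None and declared.startswith(base_family)
```

For example, `lora_matches_base("my_style.safetensors", "sdxl")` would flag an SD 1.5 LoRA before you try to load it into an SDXL workflow. The file name is a placeholder.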

7. Midjourney vs Stable Diffusion — Which to Pick

People often ask "which is better, SD or Midjourney/DALL·E?" — but that's the wrong axis. Go with Midjourney for quality, go with SD for freedom and ownership. Different roles entirely.

Aspect	Midjourney V8	Stable Diffusion (SDXL/FLUX)
Ease of use	◎ Just write the prompt	△ Setup required
Default quality	◎ Best artistic look in the industry	○ Depends on model (FLUX is on par)
Composition control	△ Prompt only	◎ Full control via ControlNet
Character consistency	○ Character Reference	◎ Train a LoRA, replicate perfectly
Monthly cost	$10–$120	$0 (local) or pay-per-use
Commercial use	OK on paid plans	SDXL unlimited; SD3.5 has $1M cap; FLUX.1 dev needs a commercial license
Data privacy	× Cloud-bound	◎ Can stay local end-to-end
Learning curve	Hours	Days to weeks

The clean read: for "make a single pretty image," Midjourney. $10/month and no setup hell. For "I want 100 images of the same character," "I want to mix in proprietary data," "I want a commercial flat-rate at any volume," or "I want to reproduce a specific anime style," Stable Diffusion. Neither is "better." Plenty of pros use both (an illustrator I know roughs out composition in MJ and finishes in SD).

8. Three Pitfalls — Copyright, NSFW, Compatibility

Three things you'll hit using SD that are worth knowing up front.

Pitfall ①: Training-data copyright risk

SD's base models are trained on LAION-5B (5.8 billion images scraped from the internet). Inevitably, copyrighted works are in there in large numbers. Getty Images is currently suing Stability AI (filed 2023, ongoing in both US and UK), and "specific artist style" LoRAs on Civitai have gotten visibly greyer since 2025. For commercial work, minimum hygiene: don't prompt by specific artist names, and even on Civitai LoRAs, avoid public figures or works modeled on identifiable copyright holders. If "commercial safety" is non-negotiable, Adobe Firefly is the alternative.

Pitfall ②: NSFW generation is trivially easy

Because SD has open weights, disabling the SafetyChecker means sexual or violent images are easy to generate. Civitai openly hosts many NSFW models. The technology itself is neutral, but creation or distribution of generated content involving minors is illegal in many countries (Japan currently has legislation under discussion). Never do this on a work PC during work hours — logs and network traffic make it trivial to spot. Even on a home PC, certain categories are illegal to create or even store. Self-awareness is mandatory.

Pitfall ③: Generational compatibility splits

As covered above, SD1.5 / SDXL / SD3.5 / FLUX are each their own ecosystem. LoRAs, embeddings, and ControlNet models don't cross-load. "Let me upgrade to SDXL" can mean discovering 50 SD1.5 LoRAs you can't use anymore. If you're starting out, pick one (SDXL or FLUX) and stay within that ecosystem — it's actually more efficient in the long run.

Summary

Essence: The revolution that turned image AI into "software individuals can own and modify." Provides freedoms MJ/DALL·E don't.
Entry point: RTX 3060 12GB + SDXL + A1111 is the realistic start. No GPU? Use Runpod from $0.001/image.
Use which: Most people: Midjourney. Choose SD only if you need "100 of the same character," "private data," or "electricity-only costs."
Caution: Copyright, NSFW, and compatibility splits are the three things to know early. Start commercial work on SDXL (no revenue cap).

Stable Diffusion changed the world in 2022. But in 2026, "just use SD" is no longer the default answer — Midjourney V8 wins on raw quality, Adobe Firefly wins on commercial safety. The reason SD hasn't died — and in fact has gained momentum with FLUX — is that it remains the only option for "use image AI on your own PC, with your own data, exactly the way you want, without depending on any cloud company." Midjourney can lock you out of Discord; OpenAI can change its terms of service; the SD weight file on your SSD is yours. For people who feel safer that way, SD will keep being a special tool.

FAQ

Is Stable Diffusion free?

The model itself (weight files) is free to download and use. You do need a GPU to run it — at minimum an RTX 3060 12GB (around $200) — or a cloud inference service (Runpod runs about $0.4/hour). You owe Stability AI no monthly fee.

Can I use it commercially?

Depends on the version. SD 1.5 and SDXL are fully open (CreativeML Open RAIL-M and RAIL++-M, no revenue cap). SD 3 and SD 3.5 are free for commercial use under $1M in annual revenue; above that you need an Enterprise license from Stability AI. FLUX.1 dev ships under a non-commercial license, so commercial use of the model needs a license from Black Forest Labs. Selling images generated with the Stability AI models is unlimited.

Which is better, Midjourney or SD?

Depends on use. If you just want one pretty image from a prompt, Midjourney is far simpler and the quality is excellent. If you need to mass-produce the same character, mix in proprietary data, drive cost down to electricity, or replicate a specific anime style, only Stable Diffusion works. Plenty of pros use both.

Which version should I start with?

SDXL 1.0 is the safest start today. Runs in 8–12GB VRAM, has a huge LoRA library on Civitai, has no commercial revenue cap, and the ecosystem is mature. For top quality go to FLUX.1 dev (recommended 16GB+ VRAM). SD 1.5 is light but a generation behind on quality — likely to leave new users wanting more.

Is FLUX a different thing from Stable Diffusion?

Technically related but from a different company. FLUX is from Black Forest Labs, founded by ex-Stability-AI engineers who built SD. It's positioned less as a successor and more as "a higher-quality open image AI." The ecosystems are separate (FLUX LoRAs don't work in SD). But in the "open-weight, locally runnable image AI" category they're the same camp, and both are first-class citizens on Civitai and ComfyUI.

Should I buy a GPU or rent cloud?

Cloud (Runpod / Replicate / Civitai's on-demand) is cheaper if you generate fewer than 50 images a month. Around $0.001–$0.01 per image. If you generate hundreds per month, train your own LoRAs, or refuse to send data off your machine, buying a GPU pays for itself. The cost-effective sweet spot for serious users is a used RTX 3090 (24GB, around $500).
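The buy-versus-rent question comes down to simple arithmetic. A rough sketch, assuming a flat per-image cloud price and ignoring electricity, resale value, and LoRA training time. The function names are illustrative.

```python
def breakeven_images(gpu_price_usd: float, cloud_price_per_image_usd: float) -> float:
    """Number of images at which the GPU's purchase price equals the cloud spend."""
    return gpu_price_usd / cloud_price_per_image_usd

def months_to_breakeven(gpu_price_usd: float, cloud_price_per_image_usd: float,
                        images_per_month: float) -> float:
    """Months of steady generation before the purchase pays for itself."""
    return breakeven_images(gpu_price_usd, cloud_price_per_image_usd) / images_per_month

# The used RTX 3090 figure from above, against the top of the cloud price range.
images_needed = breakeven_images(500, 0.01)
```

With a $500 card and $0.01 per image, the break-even is around 50,000 images, so plug in your own volume before deciding. Privacy and LoRA training are not priced in here and may matter more than the per-image cost.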

