Diffusion Model
The type of AI model behind most modern image and video generators — it learns to create content by reversing a noising process.
In plain English
A diffusion model is a generative model that learns to produce images, video, or audio by reversing a process of gradually adding noise. During training, the model sees real images progressively corrupted with random noise and learns to undo that corruption one small step at a time. To generate something new, it starts from pure random noise and denoises step by step until a coherent image emerges.
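As a rough illustration of the noising idea, here is a minimal NumPy sketch. It assumes a linear noise schedule of the kind used in DDPM-style models; the function names (`make_alpha_bar`, `add_noise`, `recover_clean`), the schedule values, and the 8x8 random array standing in for an image are all hypothetical choices made for this example. No neural network is trained here: the true noise is passed in where a real model would supply its own prediction.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left at each step (assumed linear schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Corrupt a clean sample x0 to noise level t in one jump."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def recover_clean(xt, eps_pred, t, alpha_bar):
    """Estimate the clean sample from a noisy one and a noise prediction."""
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * eps_pred) / np.sqrt(alpha_bar[t])

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))  # stand-in for a tiny image

# Forward process: the further along the schedule, the less signal remains.
xt, eps = add_noise(x0, 500, alpha_bar, rng)

# Reverse direction: a trained network would predict eps from xt and t.
# Here the true noise is used in its place, so the recovery is exact.
x0_hat = recover_clean(xt, eps, 500, alpha_bar)
```

In a real model the network's noise prediction is imperfect, which is one reason generation proceeds in many small denoising steps rather than a single jump.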
Where you'll see them:
- Image generation — DALL-E, Midjourney, Stable Diffusion, Flux
- Video generation — Sora, Runway, Pika
- Audio generation — music and sound effect generators
- 3D and design — emerging applications
Why diffusion replaced GANs: Earlier image generators mostly used GANs (Generative Adversarial Networks), which are notoriously unstable to train and prone to producing only a narrow range of outputs (mode collapse). Diffusion models train more reliably and tend to produce higher-quality, more controllable outputs, which is why most major image generators now use them.