Modalities

Text-to-Image

AI that generates new images from a written description — the technology behind tools like Midjourney, DALL-E, and Stable Diffusion.

01 ——

In plain English

Text-to-image is a class of generative AI that takes a written prompt and produces an image matching the description. It's one of the most popular consumer AI categories — millions of people use it for art, marketing, design, and entertainment.

Leading text-to-image tools:

Midjourney — Discord-based, known for artistic quality
DALL-E — built into ChatGPT
Stable Diffusion / Flux — open-weight models, run locally or via API
Adobe Firefly — integrated into Photoshop and Illustrator
Ideogram — strong at rendering text inside images

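How it works: Almost all modern text-to-image tools use diffusion models, which start from random noise and remove it step by step, guided by the prompt, until an image emerges. Recent additions give users control beyond the prompt itself: composition (ControlNet), style (LoRAs), and consistency across images (reference images).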
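To make the diffusion idea concrete, here is a minimal toy sketch of the sampling loop in Python. It is not how any of the tools above are implemented. The "image" is just a short list of numbers, and every name in it (toy_target, signal_level, toy_denoiser, generate) is hypothetical. The linear noise schedule is an assumption chosen for simplicity, and the trained neural network is replaced by a stand-in that already knows the answer, so only the loop structure is realistic: start from noise, estimate the clean image, step to a lower noise level, repeat.

```python
import random

def toy_target(prompt, size=8):
    # Hypothetical stand-in for what a trained model has learned: the
    # "clean image" (here just a few numbers) that this prompt maps to.
    rng = random.Random(prompt)
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]

def signal_level(t, steps):
    # Assumed linear schedule: 1.0 at t=0 (clean image),
    # about 0.001 at t=steps (almost pure noise).
    return 1.0 - 0.999 * t / steps

def toy_denoiser(noisy, t, prompt):
    # A real denoiser is a neural network that estimates the clean image
    # from the noisy one. This stand-in ignores `noisy` and `t` and simply
    # returns the known answer for the prompt.
    return toy_target(prompt, len(noisy))

def generate(prompt, steps=50, seed=0, size=8):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]  # start from random noise
    for t in range(steps, 0, -1):
        a_now = signal_level(t, steps)
        a_next = signal_level(t - 1, steps)
        clean = toy_denoiser(x, t, prompt)
        # Noise implied by the current sample and the clean estimate.
        noise = [(xi - a_now ** 0.5 * ci) / (1.0 - a_now) ** 0.5
                 for xi, ci in zip(x, clean)]
        # Re-mix the clean estimate and that noise at the next, lower noise level.
        x = [a_next ** 0.5 * ci + (1.0 - a_next) ** 0.5 * ni
             for ci, ni in zip(clean, noise)]
    return x

image = generate("a red fox in snow")
print([round(v, 3) for v in image])
```

Because the stand-in denoiser is perfect, every seed ends at the same result for a given prompt. In a real model the denoiser is only an estimate, which is one reason different seeds give different images.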

Browse this directory's image-generation category for dozens more options.

02 ——