Modalities

Text-to-Image

AI that generates new images from a written description — the technology behind tools like Midjourney, DALL-E, and Stable Diffusion.

01 ——

In plain English

Text-to-image is a class of generative AI that takes a written prompt and produces an image matching the description. It's one of the most popular consumer AI categories — millions of people use it for art, marketing, design, and entertainment.

Leading text-to-image tools:

  • Midjourney — Discord-based, known for artistic quality
  • DALL-E — built into ChatGPT
  • Stable Diffusion / Flux — open-weight models, run locally or via API
  • Adobe Firefly — integrated into Photoshop and Illustrator
  • Ideogram — strong at rendering text inside images

How it works: Most modern text-to-image tools use diffusion models, which start from random noise and remove it step by step until an image matching the prompt emerges. Recent advances give users control beyond the prompt itself: ControlNet for composition, LoRAs for style, and reference images for consistency.
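
To make the denoising idea concrete, here is a minimal sketch of a diffusion sampling loop on a four-pixel "image". It is an illustration under heavy assumptions, not how any of the tools above are implemented: the PROMPTS table, the linear noise schedule, and the oracle_noise function are all hypothetical stand-ins. A real model replaces oracle_noise with a large neural network trained on image-caption pairs.

```python
import math
import random

# Hypothetical stand-in for text conditioning: each prompt maps to four pixel values.
PROMPTS = {
    "white square": [1.0, 1.0, 1.0, 1.0],
    "black square": [-1.0, -1.0, -1.0, -1.0],
}

def make_schedule(steps=50, beta_start=1e-4, beta_end=0.2):
    """Linear noise schedule: per-step betas and the running product of (1 - beta)."""
    betas = [beta_start + (beta_end - beta_start) * t / (steps - 1) for t in range(steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars

def oracle_noise(x_t, t, target, alpha_bars):
    """Assumed 'perfect' noise predictor for a dataset containing only `target`.
    In a real system a trained network estimates this from x_t, t, and the prompt."""
    a_bar = alpha_bars[t]
    return [(x - math.sqrt(a_bar) * m) / math.sqrt(1.0 - a_bar) for x, m in zip(x_t, target)]

def sample(target, steps=50, seed=0):
    """Ancestral sampling: start from pure noise and denoise one step at a time."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for t in reversed(range(steps)):
        eps = oracle_noise(x, t, target, alpha_bars)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(1.0 - betas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            # Re-inject a little fresh noise on every step except the last.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x

print(sample(PROMPTS["white square"]))
```

Because the noise predictor here is an oracle that already knows the answer, the final step recovers the target pixels almost exactly. Real models only approximate the noise, which is why their outputs vary from run to run and depend on the random seed.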

Browse this directory's image-generation category for dozens more options.

02 ——

Related terms

Last reviewed May 2026
Vol. 4 · Issue 19 · Last reviewed 2026-05-30
