
Transformer

The neural network architecture, introduced in 2017, that underlies nearly every modern LLM and many image generators.

01 ——

In plain English

The Transformer is a neural network architecture introduced by Google researchers in the 2017 paper "Attention Is All You Need." It became the foundation for most major AI systems released since: GPT, BERT, Claude, Gemini, and many others are built on it, and image generators such as Stable Diffusion use transformer components.

Why transformers won:

Attention mechanism — the model decides which parts of the input matter most for each output token
Parallelisation — earlier architectures (RNNs) processed text one token at a time; transformers process every position of the input in parallel, which makes training much faster
Scalability — quality keeps improving as you scale up data, parameters, and compute
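
The attention step in the list above can be sketched in a few lines. This is a minimal, illustrative version of scaled dot-product attention in plain Python, assuming a single head, no learned projection matrices, and hand-picked toy vectors. The function names (`softmax`, `attention`) are made up for this example and do not come from any particular library.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    For each query: score every key, normalise the scores with softmax,
    and return the weighted average of the value vectors.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy input (an assumption for illustration): three tokens, each a 2-d vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights is computed independently of the others, which is why a real implementation can handle all tokens at once as a single matrix multiplication rather than looping as this sketch does.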

Variants:

Decoder-only — GPT, Claude, Llama (used for chat and generation)
Encoder-only — BERT (used for classification and search)
Encoder-decoder — T5, BART (used for translation and summarisation)
Vision Transformers (ViT) — used in computer vision
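
One concrete difference between the decoder-only and encoder-only variants above is which positions each token is allowed to attend to. The sketch below builds the two visibility masks in plain Python. `full_mask`, `causal_mask`, and `show` are hypothetical helper names used only for illustration; real models apply a mask like this inside the attention computation.

```python
def full_mask(n):
    """Encoder-style mask: every token may attend to every token."""
    return [[True] * n for _ in range(n)]

def causal_mask(n):
    """Decoder-style mask: token i may attend only to positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]

def show(mask):
    """Render a mask as rows of 1 (visible) and 0 (hidden)."""
    return "\n".join(
        " ".join("1" if cell else "0" for cell in row) for row in mask
    )

print("encoder (bidirectional):")
print(show(full_mask(4)))
print("decoder (causal):")
print(show(causal_mask(4)))
```

An encoder-decoder model uses both: the encoder side sees the full input, while the decoder side generates with a causal mask so that each output token depends only on the tokens before it.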

The transformer is widely regarded as the most influential AI architecture of the last decade.

02 ——

Related terms

Large Language Model — the type of AI behind tools like ChatGPT and Claude, trained to understand and generate text.

Neural Network

A computing system inspired by the brain, made up of layers of connected "neurons" that learn patterns from data — the building block of modern AI.

Deep Learning

A type of machine learning that uses layered neural networks to learn complex patterns — the foundation of modern AI.

Generative Pre-trained Transformer — the architecture behind OpenAI's models, and now used as shorthand for any LLM-powered chatbot.

Foundation Model

A large, general-purpose AI model trained on broad data that can be adapted (via prompting or fine-tuning) to many downstream tasks.

Last reviewed June 2026

Vol. 4 · Issue 21 · Last reviewed 2026-06-27
