World Model
A model that learns to simulate how the world (or a specific environment) evolves — predicting what happens next given an action.
In plain English
A world model is an AI model that learns the dynamics of an environment well enough to simulate it: given a state and an action, predict the next state. Originally a reinforcement-learning concept, world models have re-emerged as a frontier bet for video generation, game engines, and embodied AI.
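The "state plus action in, next state out" idea can be made concrete with a toy example. The sketch below is illustrative only: the point-mass environment, the linear model, and the names `collect_transitions`, `fit_world_model`, and `predict_next` are all assumptions made for this example. Real world models typically use large neural networks over images or latent states, not least squares on two numbers.

```python
import numpy as np

def collect_transitions(n, rng):
    """Hypothetical toy environment: a 1-D point mass, state = [position, velocity]."""
    A = np.array([[1.0, 0.1], [0.0, 1.0]])   # position += 0.1 * velocity
    B = np.array([[0.0], [0.1]])             # velocity += 0.1 * action
    states = rng.normal(size=(n, 2))
    actions = rng.uniform(-1.0, 1.0, size=(n, 1))
    next_states = states @ A.T + actions @ B.T
    return states, actions, next_states

def fit_world_model(states, actions, next_states):
    """Fit next_state ~ [state, action] @ W by ordinary least squares."""
    X = np.hstack([states, actions])          # shape (n, 3)
    W, *_ = np.linalg.lstsq(X, next_states, rcond=None)
    return W                                  # shape (3, 2)

def predict_next(W, state, action):
    """Use the learned model to predict the next state."""
    return np.concatenate([state, action]) @ W

rng = np.random.default_rng(0)
s, a, s_next = collect_transitions(500, rng)
W = fit_world_model(s, a, s_next)

# Start at position 0 with velocity 1 and push with action 0.5.
pred = predict_next(W, np.array([0.0, 1.0]), np.array([0.5]))
print(pred)  # close to [0.1, 1.05] for this noiseless toy environment
```

The model never sees the matrices `A` and `B`; it recovers the dynamics purely from observed transitions, which is the defining property of a world model.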
Why they matter:
- Embodied AI — a robot with a world model can plan ahead instead of just reacting
- Video generation — coherent video over time is essentially a world model in pixel space
- Game engines — generative games where the world is computed on the fly, not pre-built
- Data generation — simulate scenarios for training agents at scale
- Reasoning — let an agent "think through" consequences before acting
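The planning and reasoning points above can be sketched as "imagine many action sequences inside the model, then act on the best one". The example below uses random-shooting planning, one simple technique among many and not necessarily what any system named in this entry uses. The function `model_step` is a hand-written stand-in for a learned world model, and the names `imagine` and `plan` are hypothetical.

```python
import numpy as np

def model_step(state, action):
    """Stand-in for a learned world model (assumption): 1-D point mass dynamics."""
    pos, vel = state
    vel = vel + 0.1 * action
    pos = pos + 0.1 * vel
    return np.array([pos, vel])

def imagine(state, actions):
    """Roll the model forward through a sequence of actions without touching the real world."""
    for a in actions:
        state = model_step(state, a)
    return state

def plan(state, target, horizon=10, n_candidates=256, rng=None):
    """Random shooting: sample action sequences, score each in imagination, keep the best."""
    if rng is None:
        rng = np.random.default_rng(0)
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    # Cost = imagined distance between final position and the target.
    costs = [abs(imagine(state, seq)[0] - target) for seq in candidates]
    best = int(np.argmin(costs))
    # Return only the first action of the best sequence, plus its imagined cost.
    return float(candidates[best, 0]), float(costs[best])

action, cost = plan(np.array([0.0, 0.0]), target=1.0)
print(action, cost)
```

In a control loop the agent would execute only the returned first action, observe the real next state, and plan again. Replanning at every step limits how far errors in the model can compound.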
Notable world-model efforts:
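- Genie 3 (Google DeepMind): playable interactive worlds generated in real time from a text prompt
- Sora 2 (OpenAI): video generation with synchronized audio, aimed at temporally coherent, physically plausible clips
- V-JEPA (Meta): non-generative video model that predicts in a learned representation space instead of pixels
- Nvidia Cosmos: world foundation models for physical AI such as robotics and autonomous driving
- DeepMind SIMA: instruction-following agent that acts across many 3D games; it is an agent that operates inside worlds, not a world model itself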
Open question: Will scaling world models become a separate route to general intelligence, running in parallel with LLMs, or a complementary capability that augments them?