Computer Use
The ability of an AI model to control a computer the way a human does: moving the mouse, clicking, typing, and reading the screen.
In plain English
Computer Use began as Anthropic's name for a Claude capability launched in late 2024. The term now also covers the broader pattern of an AI model controlling a desktop environment through screenshots, keyboard, and mouse. The model sees a pixel-level screenshot, decides what to click or type, and sends synthetic input events; the cycle then repeats with a fresh screenshot until the task is done.
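The screenshot, decide, act cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's API: `FakeDesktop`, `Action`, `run_agent`, and `scripted_policy` are hypothetical names, the desktop only records events instead of sending real input, and a fixed script stands in for the model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    kind: str                              # "click", "type", or "done"
    xy: Optional[Tuple[int, int]] = None   # target pixel for a click
    text: Optional[str] = None             # characters to type


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a desktop: records inputs instead of sending them."""
    events: List[str] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real environment would return encoded image bytes of the screen.
        return f"frame-{len(self.events)}".encode()

    def click(self, x: int, y: int) -> None:
        self.events.append(f"click({x},{y})")

    def type_text(self, text: str) -> None:
        self.events.append(f"type({text!r})")


def run_agent(env: FakeDesktop,
              decide: Callable[[bytes], Action],
              max_steps: int = 10) -> int:
    """Observe, decide, act until the policy reports "done" or steps run out."""
    for step in range(max_steps):
        frame = env.screenshot()           # 1. observe the screen
        action = decide(frame)             # 2. choose the next action
        if action.kind == "done":
            return step
        if action.kind == "click" and action.xy is not None:
            env.click(*action.xy)          # 3. synthetic mouse input
        elif action.kind == "type" and action.text is not None:
            env.type_text(action.text)     # 3. synthetic keyboard input
        else:
            raise ValueError(f"unsupported action: {action}")
    return max_steps


def scripted_policy() -> Callable[[bytes], Action]:
    """Assumption: replays a fixed plan where a real agent would call a model."""
    plan = iter([
        Action("click", xy=(120, 48)),
        Action("type", text="quarterly report"),
        Action("done"),
    ])
    return lambda _frame: next(plan)


desktop = FakeDesktop()
steps_taken = run_agent(desktop, scripted_policy())
```

The `max_steps` cap is a deliberate design choice: because every step costs a screenshot plus a model round-trip, an unbounded loop is both slow and hard to stop when the agent gets stuck.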
Why it's significant: APIs are everywhere, but a great deal of enterprise software is reachable only through a GUI. Computer Use lets an agent operate such applications (legacy ERPs, design tools, spreadsheets, video editors) without a purpose-built integration. Because it targets the same interface people use, it is about as general as an interface can get.
Trade-offs:
- Universal — works on any software with a UI
- Slow — every step is a screenshot + reasoning round-trip
- Fragile — pixel coordinates break with theme/resolution changes
- Risky — a misclick in a real environment has real consequences (delete files, send emails)
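One common source of the fragility noted above is a mismatch between the size of the screenshot the model saw and the size of the real screen. A minimal sketch of mapping a click from one to the other follows; `scale_point` is a hypothetical helper, and it addresses only resolution differences, not theme or layout changes.

```python
from typing import Tuple

Size = Tuple[int, int]    # (width, height) in pixels
Point = Tuple[int, int]   # (x, y) in pixels


def scale_point(point: Point, source: Size, target: Size) -> Point:
    """Map a point chosen on a `source`-sized screenshot onto a `target`-sized screen."""
    (x, y), (sw, sh), (tw, th) = point, source, target
    if sw <= 0 or sh <= 0 or tw <= 0 or th <= 0:
        raise ValueError("sizes must be positive")
    if not (0 <= x < sw and 0 <= y < sh):
        raise ValueError(f"point {point} lies outside a {sw}x{sh} screenshot")
    # Integer arithmetic keeps the result deterministic; clamp to the last valid pixel.
    return (min(tw - 1, x * tw // sw), min(th - 1, y * th // sh))


# The model picked the centre of a 1280x720 screenshot; the real screen is 1920x1080.
click_on_screen = scale_point((640, 360), (1280, 720), (1920, 1080))
```

Rejecting out-of-bounds points instead of silently clamping them surfaces a bad model output early, before it turns into a misclick.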
Implementations: Anthropic Computer Use, OpenAI Operator (browser-only variant), Google Project Mariner, and the open-source projects Browser Use and Skyvern. Production deployments typically run the agent inside a sandboxed VM so that mistakes stay contained.