‌
‌

Safety

Adversarial Attack

Deliberately crafted inputs that trick an AI model into producing wrong or harmful outputs — a key category of AI security threat.

01 ——

In plain English

Adversarial attacks are inputs engineered to break an AI model's normal behaviour. They range from subtle pixel changes that make an image classifier mislabel a stop sign, to prompt injections that hijack an LLM, to "jailbreak" prompts that bypass safety training.

Common types:

Evasion attacks — slightly modify an input so the model classifies it wrong (image perturbations, typos that fool spam filters)
Prompt injection — embed instructions inside data the model reads (a webpage, a PDF) that override the system prompt
Data poisoning — corrupt the training data so the model behaves badly later
Model extraction — query a model enough times to clone its capabilities

Why it matters: As AI gets deployed in higher-stakes settings (medical triage, hiring, autonomous vehicles), adversarial attacks become a real-world security problem. Frontier labs run red-team exercises to find weaknesses before attackers do.

02 ——

Related terms

Prompt Injection

A security attack where malicious instructions hidden in user input or external content trick an AI model into ignoring its real instructions.

A prompt or technique that tricks an AI model into ignoring its safety rules and producing content it would normally refuse.

An attack that corrupts a model’s training data to make it behave incorrectly — either degrading performance or installing hidden backdoors.

Deliberately trying to make an AI model misbehave — find jailbreaks, exploits, and failure modes — before adversaries do.

Rules and filters that constrain what an AI model can output — used to block harmful, off-topic, or non-compliant responses.

The challenge of making AI systems behave in ways that match human values and intentions — not just their literal instructions.

Back to glossaryLast reviewed June 2026

Vol. 4 · Issue 21 · Last reviewed 2026-06-27

Sign up for our newsletter

Receive weekly updates so you can stay up-to-date with the world of AI

Sign up for our newsletter

Receive weekly updates so you can stay up-to-date with the world of AI

AI Tools Directory

The AI tools directory for discovering, exploring, and comparing the most innovative AI tools in the industry

Explore

All AI tools

Top 100 AI tools

Best AI tools

Curated collections

AI tool alternatives

AI categories

Pricing

AI glossary

Compare AI tools

Blog

Methodology

Editorial team

AI graveyard

Research

MCP server

Latest collections

Policy

Terms & conditions

Privacy policy

FAQ

Refund policy

Affiliate disclosure