Safety

Watermarking

Embedding a hidden, machine-detectable signal in AI-generated content so it can later be identified as AI-made.

01 ——

In plain English

Watermarking is the technique of marking AI outputs — text, images, audio, video — with a hidden signal that lets you (or a detector) verify later that the content was AI-generated. It's one of the main proposed defences against misuse of generative AI.

How it works:

  • Text watermarking — biases the model's token sampling toward a keyed statistical pattern that readers cannot notice but a verifier holding the key can detect (Google's SynthID-Text, OpenAI's research)
  • Image watermarking — embeds imperceptible perturbations in the pixels (SynthID-Image, Stable Signature)
  • Audio / video — applies similar perturbations to spectrograms or video frames
  • Metadata — the C2PA standard attaches cryptographically signed provenance data to the file (Adobe Content Credentials); strictly speaking this is a provenance record rather than a hidden watermark
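
To make the text case concrete, here is a minimal sketch of a "green list" scheme of the kind described in public research on text watermarking. It is an illustration under stated assumptions, not the SynthID-Text algorithm or any vendor's method: the vocabulary, the key, the `generate` stand-in for a language model, and all function names are hypothetical. The generator prefers tokens that a keyed hash marks as "green", and the detector counts green tokens and reports how far that count sits above chance.

```python
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(1000)]  # hypothetical toy vocabulary
GAMMA = 0.5  # expected fraction of tokens that land on the green list


def is_green(prev_token: str, token: str, key: str = "secret") -> bool:
    """Keyed pseudorandom split of the vocabulary, re-seeded by the previous token."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] / 256 < GAMMA


def generate(n: int, watermark: bool, seed: int = 0) -> list[str]:
    """Stand-in for a language model: picks among 8 random candidates per step.

    With watermark=True it restricts the choice to green candidates when any exist.
    """
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(n):
        candidates = rng.sample(VOCAB, 8)
        if watermark:
            green = [t for t in candidates if is_green(tokens[-1], t)]
            candidates = green or candidates  # fall back if no candidate is green
        tokens.append(rng.choice(candidates))
    return tokens[1:]  # drop the start marker


def z_score(tokens: list[str]) -> float:
    """Standard deviations by which the green-token count exceeds chance."""
    prev, hits = "<s>", 0
    for tok in tokens:
        hits += is_green(prev, tok)
        prev = tok
    n = len(tokens)
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


if __name__ == "__main__":
    print("watermarked z:", round(z_score(generate(200, watermark=True)), 2))
    print("plain z:      ", round(z_score(generate(200, watermark=False)), 2))
```

In this toy setup the watermarked sequence scores far above the unwatermarked one, which should hover near zero. A real system would bias the model's logits rather than filter a random candidate set, and would choose a detection threshold to control false positives.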

Where it's deployed:

  • Google SynthID — applied to Imagen, Veo, MusicLM, Gemini text outputs
  • Adobe Content Credentials (C2PA) — Photoshop, Firefly
  • Meta — applied to AI-generated imagery on its platforms
  • OpenAI — DALL-E images include C2PA metadata

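Limits: Watermarks are fragile. Screenshotting or re-encoding an image discards its metadata and can degrade a pixel-level signal, and paraphrasing text can erase a statistical text watermark. Watermarking gives honest actors a way to establish provenance, but it does not stop motivated bad actors from removing the mark.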
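
The fragility can be shown with a deliberately naive sketch. It assumes a toy scheme that hides one bit in the least significant bit (LSB) of each pixel, and it models lossy re-encoding as coarse quantization. Neither is how SynthID or Stable Signature work; production watermarks are designed to survive more than this. All function names here are hypothetical.

```python
import random


def embed_lsb(pixels: list[int], bits: list[int]) -> list[int]:
    """Write one watermark bit into the least significant bit of each pixel."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]


def extract_lsb(pixels: list[int]) -> list[int]:
    """Read the least significant bit back out of each pixel."""
    return [p & 1 for p in pixels]


def lossy_reencode(pixels: list[int], step: int = 4) -> list[int]:
    """Crude stand-in for lossy compression: round each pixel down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]


def match_rate(recovered: list[int], original: list[int]) -> float:
    """Fraction of watermark bits recovered correctly."""
    return sum(r == o for r, o in zip(recovered, original)) / len(original)


if __name__ == "__main__":
    rng = random.Random(0)
    pixels = [rng.randrange(256) for _ in range(1024)]  # toy grayscale image
    bits = [rng.randrange(2) for _ in range(1024)]      # toy watermark payload
    marked = embed_lsb(pixels, bits)
    print("before re-encoding:", match_rate(extract_lsb(marked), bits))
    print("after re-encoding: ", match_rate(extract_lsb(lossy_reencode(marked)), bits))
```

Before re-encoding every bit is recovered. After quantization every least significant bit is zero, so the match rate falls to roughly one half, which is no better than guessing.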

Vol. 4 · Issue 19 · Last reviewed 2026-05-30