Agents & tools

Voice Agent

A real-time conversational AI you talk to — over the phone, in an app, or through a wearable — that listens, reasons, and replies in voice.

01 ——

In plain English

A voice agent is an AI agent whose primary interface is spoken conversation. It listens to the user (speech-to-text), reasons about a response (LLM), and speaks back (text-to-speech), usually with low enough latency to feel like a real conversation.
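The listen, reason, speak loop described above can be sketched as a single turn handler. This is a minimal illustration, not any vendor's API: `VoiceAgent`, `handle_turn`, and the three `fake_*` stages are hypothetical names, and the stages are stand-ins for real speech-to-text, LLM, and text-to-speech services.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stage signatures; a real agent would wrap vendor SDKs here.
Transcriber = Callable[[bytes], str]                 # speech-to-text
Reasoner = Callable[[List[Tuple[str, str]]], str]    # LLM over the dialogue so far
Synthesizer = Callable[[str], bytes]                 # text-to-speech


@dataclass
class VoiceAgent:
    transcribe: Transcriber
    reason: Reasoner
    synthesize: Synthesizer
    history: List[Tuple[str, str]] = field(default_factory=list)

    def handle_turn(self, audio_in: bytes) -> bytes:
        """Run one conversational turn: listen, reason, speak."""
        text = self.transcribe(audio_in)
        self.history.append(("user", text))
        reply = self.reason(self.history)
        self.history.append(("agent", reply))
        return self.synthesize(reply)


# Stand-in stages (assumptions for illustration only): "audio" is UTF-8 text.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(history: List[Tuple[str, str]]) -> str:
    last_user_text = history[-1][1]
    return f"You said: {last_user_text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


agent = VoiceAgent(fake_stt, fake_llm, fake_tts)
audio_out = agent.handle_turn(b"where is my order")
```

Production systems typically stream audio between the stages rather than waiting for each one to finish, which is one way they keep response time low; the sequential version here only shows the order of the steps.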

What makes a good voice agent:

  • Low latency — a gap under 800 ms between turns feels natural; over 1.5 s feels broken
  • Interruption handling — barge-in support, so the agent stops talking when the user cuts in instead of talking over them
  • Persona consistency — the voice, tone, and personality stay coherent
  • Tool use — book appointments, look up orders, transfer to a human
  • Memory — recognises returning callers and remembers context
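
The latency thresholds in the list above can be turned into a simple budget check. This is a rough sketch: the two thresholds come from the list, but the "noticeable" label for the range in between, the function name, and the per-stage timings are assumptions made up for illustration, not measurements.

```python
NATURAL_MS = 800    # below this, turn-taking feels natural
BROKEN_MS = 1500    # above this, the conversation feels broken


def classify_turn_latency(total_ms: float) -> str:
    """Label a turn by the delay between the user finishing and the agent replying."""
    if total_ms < NATURAL_MS:
        return "natural"
    if total_ms > BROKEN_MS:
        return "broken"
    return "noticeable"  # assumed label for the in-between range


# Hypothetical per-stage timings in milliseconds for one turn.
stage_ms = {"speech_to_text": 200, "llm": 350, "text_to_speech": 150}
total = sum(stage_ms.values())
verdict = classify_turn_latency(total)
```

Summing the stages makes the trade-off visible: any time spent on a tool call or a slower model has to fit inside the same budget.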

Where they're deployed:

  • Customer service — Sierra, Decagon, Parloa, Cresta
  • Outbound sales — Air, Bland
  • Healthcare — Hippocratic AI, Suki
  • Real estate / scheduling — Lindy, Goodcall, Synthflow
  • Developer infrastructure — Vapi, Retell, ElevenLabs Conversational AI

Voice agents are eating phone trees. The bar for what counts as "the human alternative" has risen dramatically since 2024.


Vol. 4 · Issue 19 · Last reviewed 2026-05-30
