The Index · AI Categories · Voice AI

Voice AI

Voice AI — speech-to-text, text-to-speech, voice agents, and real-time transcription. Tools that listen, speak, and hold conversations.

Tools indexed
45
Reviewed by our editors
Edition
Vol. 4 · Iss. 19
Last reviewed 2026-05-30
Status
Live
Reviewed each edition
Narrow by sub-topic
Featured · this edition
3 featured
Editor's Picks

Where to start

Best for · Realistic text-to-speech and voice cloning
ElevenLabs ai audio creation tool logo

ElevenLabs

AI Audio Creation
Free Trial
4.92
494
Best for · AI phone agents that make and take calls
Bland AI voice ai tool logo

Bland AI

Voice AI
Paid - Inquire
4.92
440
Best for · Developer platform for voice agents
Vapi developer tools tool logo

Vapi

Developer Tools
Freemium
4.91
400
Best for · Fast, accurate speech-to-text API
Deepgram ai audio creation tool logo

Deepgram

AI Audio Creation
Paid - Inquire
4.81
320
Best for · Low-latency voice models for real-time apps
Cartesia AI Voice Logo

Cartesia

AI Audio Creation
Freemium
4.92
420
Best for · Conversational voice agents for customer service
PolyAI voice ai tool logo

PolyAI

Voice AI
Paid - Inquire
4.92
397
Every listing
Sortable
Sorted by
ElevenLabs voice ai tool logo

ElevenLabs

Explore advanced text-to-speech and voice cloning software for lifelike voiceovers and content generation.

Free Trial
4.92
494
Bland AI voice ai tool logo

Bland AI

Conversational AI that runs sales, support, and scheduling phone calls at scale.

Paid - Inquire
4.92
440
Cartesia AI Voice Logo

Cartesia

Real-time voice AI platform with low-latency speech, cloning, and TTS APIs.

Freemium
4.92
420
Vapi voice ai tool logo

Vapi

Developer platform to build, test, and deploy advanced voice agents in minutes.

Freemium
4.91
400
PolyAI voice ai tool logo

PolyAI

World's most lifelike voice AI agents for enterprise — PolyAI deflects up to 80% of transactional calls without escalation. Banks, airlines, hospitality use it.

Paid - Inquire
4.92
397
Synthflow voice ai tool logo

Synthflow

End-to-end voice AI platform for enterprise call automation — $20M Series A (Accel), 1000+ customers, 45M calls handled, 99.9% uptime.

Freemium
4.82
335
Cerence voice ai tool logo

Cerence

Automotive AI assistant platform — in-car voice AI deployed in 500M+ vehicles globally. Public ($CRNC). Spun out from Nuance.

Paid - Inquire
4.82
320
Phonely voice ai tool logo

Phonely

AI phone agents that answer every call, book appointments, and handle customer support 24/7 — trusted by 10,000+ businesses, YC-backed, fast setup.

Freemium
4.82
320
LMNT voice ai tool logo

LMNT

Fast, lifelike, affordable AI speech — studio-quality voice clones with 150ms latency. 24 languages. The TTS pick for cost-sensitive voice agents.

Freemium
4.82
320
Pipecat voice ai tool logo

Pipecat

Open-source Python framework for real-time voice and multimodal conversational agents — by Daily, the WebRTC infrastructure leader. Most-used voice agent OSS.

Free
4.82
320
Deepgram voice ai tool logo

Deepgram

Advanced AI Speech-to-Text and Voice Recognition Solutions

Paid - Inquire
4.81
320
DeepScribe voice ai tool logo

DeepScribe

AI medical scribe automating clinical documentation across 50+ specialties. Used by 800+ healthcare orgs; saves 2-3 hours per clinician per day.

Freemium
4.83
314
Nabla voice ai tool logo

Nabla

AI copilot for clinicians — ambient scribe + clinical assistant. European leader; deployed across The Permanente Medical Group's 24,000 clinicians.

Paid - Inquire
4.84
290
Retell AI voice ai tool logo

Retell AI

Build advanced conversational voice AI with rapid response times.

Paid - Inquire
4.79
285
Rime AI voice ai tool logo

Rime AI

Realistic conversational TTS designed specifically for voice agents and contact centers.

Paid - Inquire
4.75
261
Recall.ai voice ai tool logo

Recall.ai

Meeting recording infrastructure API for AI products. Powers Otter, Granola, Read.ai, and Fathom under the hood. Sequoia-backed.

Paid - Paid
4.8
253
Voicemod voice ai tool logo

Voicemod

Voicemod is the real-time AI voice changer used by streamers, gamers, and creators. AI voice cloning, soundboard, text-to-speech. 30M+ users.

Freemium
4.77
245
Goodcall voice ai tool logo

Goodcall

AI Phone Agent and Virtual Receptionist for service businesses — 3rd-gen platform with 300ms latency, 100% accuracy. Real estate, home services, contact centers.

Freemium
4.76
240
Rev AI voice ai tool logo

Rev AI

Enterprise-grade speech-to-text and voice AI APIs from Rev — best-in-class English accuracy.

Paid - Inquire
4.8
236
Spara voice ai tool logo

Spara

GTM AI agents for chat, email, voice & SMS — automated demos, bi-directional Salesforce sync.

Paid - Inquire
4.86
230
Vocode voice ai tool logo

Vocode

Build, deploy, and scale hyperrealistic voice AI agents.

Paid - Inquire
4.75
221
Marr Labs logo

Marr Labs

AI voice agent for sales, support, and customer engagement.

Paid - Inquire
4.75
215
Bosh.ai voice ai tool logo

Bosh.ai

Customizable AI Sales Rep co-designed with top sales leaders — handles outreach, conversations, meeting booking. Built on Relevance AI, behavioral-data-driven personalization.

Paid - Inquire
4.64
200
Newo.ai voice ai tool logo

Newo.ai

Low-code platform for ultra-realistic Voice AI Employees — receptionist agents in 3 minutes from a website URL. AI call center with 100+ concurrent calls.

Freemium
4.64
200
Salient voice ai tool logo

Salient

AI voice agents purpose-built for compliant consumer lending — Taylor handles inbound and outbound across welcome, verification, payments, hardship, and collections.

Paid - Inquire
4.64
200
Camb.ai voice ai tool logo

Camb.ai

Camb.ai is a multilingual voice cloning and translation platform supporting 140+ languages. Used by content creators, dubbing studios, and global brands.

Freemium
4.83
190
Simple Phones voice ai tool logo

Simple Phones

AI-powered phone agent for missed call management.

Free Trial
4.6
185
Notevibes voice ai tool logo

Notevibes

Online text to speech converter with natural voices.

Paid - Inquire
4.6
185
WellSaid Labs voice ai tool logo

WellSaid Labs

Enterprise voice AI platform with studio-quality narrated avatars. Used by Coursera, BambooHR, McKinsey for training, marketing, and product.

Paid - Paid
4.84
181
Audioread voice ai tool logo

Audioread

AI-driven platform to convert text into podcast-style audio.

Paid - $9.99 /mo
4.62
180
Aircover voice ai tool logo

Aircover

Real-time AI sales coach inside the call — surfaces battlecards, objection handlers, and next-best-asks live during Zoom/Teams meetings.

Paid - Inquire
4.64
175
Guava voice ai tool logo

Guava

AI-powered platform for conversational intelligence and voice automation.

Paid - Inquire
4.59
169
Regal voice ai tool logo

Regal

Regal is the AI voice agent platform for sales and customer engagement. $40M+ raised, Emergence Capital-led. Used by high-velocity B2C revenue teams.

Paid - Paid
4.47
165
KrispCall voice ai tool logo

KrispCall

AI-powered cloud phone with voice clarity, transcription, and call routing. Virtual numbers in 100+ countries; Krisp noise cancellation built in.

Paid - Paid
4.83
164
Level AI voice ai tool logo

Level AI

Level AI is the generative AI platform for contact centers — automated QA, real-time agent assist, and call summarization. ~$65M Series C; Battery Ventures

Paid - Paid
4.47
155
Slang.ai voice ai tool logo

Slang.ai

24/7 voice AI service for restaurants to handle calls.

Paid - $49 /mo
4.62
155
Toyo voice ai tool logo

Toyo

Real-time voice AI infrastructure. Top 3 Product of the Day. Sub-second latency for production voice agents at scale.

Freemium
4.6
145
Boson AI voice ai tool logo

Boson AI

Boson AI is an audio foundation model company that builds the Higgs Audio models for text-to-speech, speech-to-text and audio understanding.

Freemium
4.67
140
Suki AI voice ai tool logo

Suki AI

AI-powered voice assistant for clinicians to reduce administrative burden.

Paid - $299 /mo
4.29
136
aiOla voice ai tool logo

aiOla

aiOla is an enterprise voice AI platform whose Jargonic ASR model turns noisy, jargon-heavy speech into structured data for frontline teams.

Paid - Paid
4.63
135
Fish Audio voice ai tool logo

Fish Audio

Open-source voice cloning and TTS — competitive with ElevenLabs at a fraction of the cost.

Freemium
4.62
134
Ellipsis Health voice ai tool logo

Ellipsis Health

Ellipsis Health is a voice AI care-management platform that calls complex patients for triage, coordination, and enrollment using vocal-biomarker technolog

Paid - Paid
4.64
132
Soniox official company logo for the AI tool

Soniox

Soniox is a speech AI platform offering real-time multilingual speech-to-text, translation, and text-to-speech across 60+ languages through one API.

Freemium
4.47
120
SuperDial voice ai tool logo

SuperDial

Voice AI that automates healthcare revenue cycle calls — eligibility, prior auth, claim status, credentialing. $15M Series A in early 2026.

Paid - Inquire
4.62
113
Aktify voice ai tool logo

Aktify

Aktify is the autonomous AI SMS sales rep that texts inbound and outbound leads at scale. Handles two-way conversations, qualifies leads, and books meeting

Paid - Paid
4.59
91
Related categories
Questions

Voice AI AI, answered

What are the best AI voice tools?

It depends on the task. For text-to-speech and voice cloning, ElevenLabs and Cartesia lead; for AI phone calls, Bland and Vapi build agents that talk to customers; for transcription, Deepgram converts speech to text in real time. Choose by whether you need to generate a voice, run a conversation, or transcribe audio.

What is the best AI text-to-speech tool?

ElevenLabs is the most widely used for natural, expressive synthetic voices and cloning, with Cartesia and LMNT competing on low latency for real-time apps. ElevenLabs covers many languages and a large voice library, while Cartesia targets fast, streaming generation. Compare on the voices and latency your use case needs.

What is an AI voice agent?

An AI voice agent answers and places phone calls autonomously, understanding speech, responding in a natural voice, and taking actions like booking or routing. Bland, Vapi, and PolyAI power use cases from receptionists to support lines. The agent combines speech-to-text, a language model, and text-to-speech into one real-time loop.

How do I build a voice agent?

Most teams use a platform that bundles the pieces rather than wiring them by hand. Vapi and Synthflow handle telephony, turn-taking, and the speech models so you focus on the conversation logic, while ElevenLabs or Cartesia supply the voice and Deepgram the transcription. You define the script, tools, and handoff rules.

Can AI clone a voice?

Yes. Tools like ElevenLabs create a synthetic copy of a voice from a short sample, used for narration, localization, and accessibility. Because cloning can be misused, reputable tools require consent and add safeguards, and several regions now regulate synthetic voice. Use cloning only with permission from the voice owner.

What is the difference between speech-to-text and text-to-speech?

Speech-to-text transcribes spoken audio into written words, which powers captions, transcription, and the listening side of voice agents, as Deepgram does. Text-to-speech does the reverse, turning written text into spoken audio, as ElevenLabs does. A full voice agent uses both, plus a language model in between.

Vol. 4 · Issue 19 · Last reviewed 2026-05-30

Sign up for our newsletter

Receive weekly updates so you can stay up-to-date with the world of AI