
Ultravox
Real-time, speech-native multimodal LLM. Ultravox understands audio directly, without a separate ASR stage, with a reported time-to-first-token (TTFT) of about 150ms. Open weights, by Fixie AI.

Overview
Ultravox: Speech-Native Multimodal LLM
Ultravox is a multimodal LLM from Fixie AI that takes human speech as input directly, with no separate Automatic Speech Recognition (ASR) stage. Feeding audio straight into the LLM removes a pipeline step that traditional voice agents require, giving a time-to-first-token (TTFT) of about 150ms, low enough for real-time conversation.
The model is open-weight and available on Hugging Face, and a managed Realtime platform at ultravox.ai supports building voice-to-voice agents. It is aimed at developers who want a speech-native architecture rather than an ASR + LLM + TTS chain.
Key Features
- Direct audio-to-text understanding (no ASR pipeline step)
- ~150ms time-to-first-token
- Open weights on Hugging Face for self-hosting
- Realtime managed platform for voice-to-voice agents
- Multiple model sizes (1B/3B/8B parameters)
Ideal Use Case
Voice agent developers who care about latency above all and want to skip the ASR step; researchers exploring speech-native LLM architectures; teams building voice agents on partner inference platforms (Baseten, fal.ai).
Why Use Ultravox
Traditional voice agents run a three-stage pipeline, STT → LLM → TTS, and each stage adds latency and failure modes. Ultravox collapses STT and LLM into a single model that understands audio directly. The result is a cleaner architecture with lower latency, and the open-weight release gives teams full control over hosting and deployment.
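As a rough illustration of why removing a stage matters, the sketch below adds up per-stage latencies for a cascaded pipeline and for a speech-native one. It assumes the stages run strictly one after another, which real streaming systems often do not. Only the 150ms figure for the speech-native model comes from this page; every other number, and the names `Stage` and `time_to_first_audio`, are hypothetical placeholders rather than measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step in a voice pipeline and its assumed latency in milliseconds."""
    name: str
    latency_ms: float


def time_to_first_audio(stages):
    """Sum the stage latencies, assuming the stages run strictly in sequence."""
    return sum(stage.latency_ms for stage in stages)


# Placeholder latencies for a cascaded STT -> LLM -> TTS pipeline (not measured).
cascaded = [
    Stage("stt", 300.0),
    Stage("llm_first_token", 200.0),
    Stage("tts_first_audio", 150.0),
]

# Speech-native: STT and LLM merged into one model. 150 ms is the TTFT quoted
# on this page; the TTS figure is the same placeholder as above.
speech_native = [
    Stage("speech_llm_first_token", 150.0),
    Stage("tts_first_audio", 150.0),
]

if __name__ == "__main__":
    print("cascaded:", time_to_first_audio(cascaded), "ms")
    print("speech-native:", time_to_first_audio(speech_native), "ms")
```

Swapping in latencies measured on your own stack gives a quick estimate of what merging STT and LLM would save in a given deployment.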
FAQ
Q: Does Ultravox replace TTS too? A: Not yet. It understands audio directly but emits text, so a TTS stage is still needed for the spoken response. End-to-end voice-to-voice is planned for future versions.
Q: Is Ultravox open source? A: Yes. The model weights are on Hugging Face under a permissive license.
Q: Who is Fixie AI? A: The team behind Ultravox; founded by ex-Google folks focused on agentic AI infrastructure.
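The first answer above (audio in, text out, TTS still required) implies a two-step turn for a voice agent. The following is a minimal sketch of that shape with stand-in functions. `run_turn`, `fake_speech_llm`, and `fake_tts` are hypothetical names invented for illustration and are not part of any Ultravox API; a real agent would call a hosted or self-hosted model and a TTS engine in their place.

```python
from typing import Callable

# Type aliases for the two stages; both are assumptions made for this sketch.
SpeechLLM = Callable[[bytes], str]   # raw audio in, reply text out
TTS = Callable[[str], bytes]         # reply text in, synthesized audio out


def run_turn(user_audio: bytes, speech_llm: SpeechLLM, tts: TTS) -> bytes:
    """Handle one conversational turn without an intermediate transcript."""
    reply_text = speech_llm(user_audio)  # the model consumes audio directly
    return tts(reply_text)               # text still has to be voiced by TTS


def fake_speech_llm(audio: bytes) -> str:
    """Stand-in for a speech-native model: reports how much audio it received."""
    return f"heard {len(audio)} bytes"


def fake_tts(text: str) -> bytes:
    """Stand-in for a TTS engine: encodes the text instead of synthesizing it."""
    return text.encode("utf-8")


if __name__ == "__main__":
    print(run_turn(b"\x00" * 4, fake_speech_llm, fake_tts))
```

Keeping both stages behind plain callables means the TTS step could later be dropped or replaced if an end-to-end voice-to-voice model becomes available.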
tl;dr
Speech-native multimodal LLM. Audio → text directly, ~150ms TTFT, open weights. The architecturally clean voice agent option.
Related
Looking for more options? Browse the AI Infrastructure directory or read our best AI infrastructure tools listicle. Ultravox is also tracked on Crunchbase.




