LLM Gateways & Serving

This category covers LLM API routers, gateways, serving infrastructure, and model hosting: tools that sit between your app and one or more language models.

Tools indexed: 13 (reviewed by our editors)
Edition: Vol. 4 · Iss. 19 (last reviewed 2026-05-30)
Status: Live (reviewed each edition)
Featured this edition: 1
Every listing

Nebius

Nebius is an AI-native GPU cloud platform that rents NVIDIA H100 through GB200 clusters with managed Slurm, Kubernetes and an inference API.

Paid
4.93
470

Ollama

Ollama is a local LLM runtime that downloads, runs, and serves open models on your own hardware via a CLI and an OpenAI-compatible API.

Free
4.93
470

TOGETHER

Cloud service for developers to build with open-source AI, offering APIs, distributed training systems, and leading open-source models.

Paid - Inquire
4.93
441

Groq

Enterprise-scale AI solutions for ultra-fast language processing and inference.

Paid - Inquire
4.87
430

Fireworks AI

High-speed, cost-efficient generative AI for product innovation with advanced fine-tuning capabilities.

Paid - Inquire
4.92
420

Replicate

Cloud platform for running, deploying, and scaling machine learning models with ease.

Paid - Inquire
4.92
420

RunPod

Globally distributed GPU cloud for AI tasks.

Paid - Inquire
4.91
410

Modal

Modal offers an easy way for developers to run code in the cloud with serverless compute and containerized environments.

Paid - $30 /mo
4.86
400

OpenRouter

Unified API and marketplace for the best LLMs at the best prices for any prompt.

Freemium
4.84
360

Anyscale

Unified compute platform for scalable AI and Python applications using Ray.

Paid - Inquire
4.83
360

LiteLLM

Universal LLM proxy: call 100+ LLMs (OpenAI, Anthropic, Bedrock, Vertex) with one API.

Freemium
4.75
325

Voltage Park

Voltage Park is a GPU cloud platform that rents NVIDIA H100 and Blackwell clusters on-demand or on dedicated reserve for AI training and inference.

Paid
4.75
240

BentoML

Platform for software engineers to build AI applications.

Paid - Inquire
4.63
113