AI Gateways
Middleware to handle API keys, load balancing, and fallback routing between different model providers.
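Before the rankings, here is what the fallback-routing half of that job looks like in miniature. This is a hedged sketch, not any vendor's implementation: the provider callables (`flaky`, `stable`) are toy stand-ins for real SDK clients, and real gateways match on specific error types (rate limits, timeouts) rather than any exception.

```python
# Minimal sketch of gateway-style fallback routing: try each provider
# in order and fall through to the next on failure.
from typing import Callable

def with_fallback(providers: list[tuple[str, Callable[[str], str]]], prompt: str) -> str:
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # real gateways match rate-limit/timeout errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Toy providers: the primary always fails, the backup succeeds.
def flaky(prompt: str) -> str:
    raise TimeoutError("rate limited")

def stable(prompt: str) -> str:
    return f"answer to: {prompt}"

print(with_fallback([("primary", flaky), ("backup", stable)], "hello"))
# answer to: hello
```

Every gateway below layers extras (caching, guardrails, analytics) on top of this basic try-next-provider loop.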
| Rank | Model | Price | Summary |
|---|---|---|---|
| 1 | LiteLLM Proxy | Open Source | The Universal Standard. It has become the 'Docker' of LLM connectivity. It normalizes inputs/outputs for 200+ providers (OpenAI, Vertex, Bedrock) into a single format, allowing teams to swap models with zero code changes. |
| 2 | Portkey | Freemium | The Control Plane. It goes beyond routing to offer 'Virtual Keys' and integrated guardrails. Its 'Semantic Cache' feature can reportedly cut API bills by around 30% by serving repeated questions from memory without hitting the model. |
| 3 | Cloudflare AI Gateway | Free / Usage | The Infrastructure Edge. Runs on Cloudflare's massive global network. It offers the fastest caching and rate-limiting because it sits closer to the user than any other gateway. Essential for high-traffic public apps. |
| 4 | Kong AI Gateway | Enterprise | The Enterprise Fortress. Built on the world's most popular API gateway. It introduces 'Semantic Routing', allowing you to route queries to different models based on the *meaning* of the prompt (e.g., coding prompts -> Claude, creative prompts -> GPT-5). |
| 5 | Not Diamond | Usage Based | The Intelligence Router. Unlike a passive proxy, it is an active 'Model Recommender'. It analyzes every prompt in real-time to route it to the cheapest model that can successfully answer it, often beating GPT-5 quality at 1/10th the cost. |
| 6 | OpenRouter | Usage Based | The Marketplace. The easiest way to access new models. It aggregates hundreds of providers, allowing you to use a single credit card and API key to access everything from Llama 4 405B to Claude Opus without managing separate accounts. |
| 7 | Helicone | Open Source | The Observability Proxy. While it handles routing, its real strength is visibility: it provides the granular 'Cost per User' and 'Latency per Prompt' metrics that engineering managers need to optimize production apps. |
| 8 | Bifrost (Maxim AI) | Open Source | The Speed Demon. Written in Go, it is the fastest open-source gateway with sub-millisecond overhead. It is designed for high-frequency trading or real-time voice agents where every microsecond of latency counts. |
| 9 | Javelin | Enterprise | The Security Sentry. An enterprise gateway focused strictly on compliance. It sits between your users and the models to scrub PII, block prompt injections, and enforce data residency rules before the request leaves your VPC. |
| 10 | Vercel AI Gateway | Included in Vercel | The Frontend Native. Integrated directly into the Next.js ecosystem. It allows developers to stream UI components from the edge, handling complex model tool-calling logic completely server-side. |
Just the Highlights
LiteLLM Proxy
The Universal Standard. It has become the 'Docker' of LLM connectivity. It normalizes inputs/outputs for 200+ providers (OpenAI, Vertex, Bedrock) into a single format, allowing teams to swap models with zero code changes.
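The normalization idea is easy to see in miniature. This sketch is not LiteLLM's actual code; the provider payloads below are simplified stand-ins for the real OpenAI/Anthropic wire formats, shaped only closely enough to show why mapping them into one format lets callers ignore the provider.

```python
# Sketch of what a unified gateway does: map provider responses with
# different shapes into one common format.
def normalize(provider: str, raw: dict) -> dict:
    if provider == "openai-like":
        text = raw["choices"][0]["message"]["content"]
    elif provider == "anthropic-like":
        text = raw["content"][0]["text"]
    else:
        raise ValueError(f"unknown provider: {provider}")
    return {"provider": provider, "text": text}

# Two differently shaped responses come out identical to the caller.
a = normalize("openai-like", {"choices": [{"message": {"content": "hi"}}]})
b = normalize("anthropic-like", {"content": [{"text": "hi"}]})
assert a["text"] == b["text"]
```

With the request side normalized the same way, swapping models really does become a one-line config change rather than a code change.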
Portkey
The Control Plane. It goes beyond routing to offer 'Virtual Keys' and integrated guardrails. Its 'Semantic Cache' feature can reportedly cut API bills by around 30% by serving repeated questions from memory without hitting the model.
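The caching idea can be sketched in a few lines. To be clear about the simplification: a real semantic cache matches prompts by embedding similarity, so "What is RAG?" and "Explain retrieval-augmented generation" can share an entry; this toy version only normalizes whitespace and case and matches exactly.

```python
# Toy prompt cache: repeated questions are served from memory instead
# of hitting the model. Exact-match stand-in for semantic matching.
class PromptCache:
    def __init__(self):
        self._store = {}
        self.hits = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize whitespace and case; a semantic cache would embed instead.
        return " ".join(prompt.lower().split())

    def get_or_call(self, prompt, call_model):
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        answer = call_model(prompt)
        self._store[key] = answer
        return answer

cache = PromptCache()
model = lambda p: f"reply({p})"
cache.get_or_call("What is RAG?", model)
cache.get_or_call("  what is RAG?  ", model)  # served from cache
print(cache.hits)  # 1
```

The savings claim follows directly: every cache hit is a model call (and bill) that never happens.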
Cloudflare AI Gateway
The Infrastructure Edge. Runs on Cloudflare's massive global network. It offers the fastest caching and rate-limiting because it sits closer to the user than any other gateway. Essential for high-traffic public apps.
Kong AI Gateway
The Enterprise Fortress. Built on the world's most popular API gateway. It introduces 'Semantic Routing', allowing you to route queries to different models based on the *meaning* of the prompt (e.g., coding prompts -> Claude, creative prompts -> GPT-5).
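Routing on meaning can be illustrated with a deliberately simplified rule table. Kong's actual semantic routing classifies prompts with embeddings; this sketch substitutes keyword rules, and the model names are illustrative, not a recommended mapping.

```python
# Sketch of intent-based routing: send coding prompts to one model,
# creative prompts to another. Keyword rules stand in for embedding
# classification.
ROUTES = [
    (("def ", "function", "bug", "stack trace"), "code-model"),
    (("poem", "story", "slogan"), "creative-model"),
]
DEFAULT = "general-model"

def route(prompt: str) -> str:
    p = prompt.lower()
    for keywords, model in ROUTES:
        if any(k in p for k in keywords):
            return model
    return DEFAULT

print(route("Fix this bug in my parser"))  # code-model
print(route("Write a short poem"))         # creative-model
```

The point of the embedding version is that it catches intent the keyword version misses ("my tests keep failing" is a coding prompt with none of the trigger words).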
Not Diamond
The Intelligence Router. Unlike a passive proxy, it is an active 'Model Recommender'. It analyzes every prompt in real-time to route it to the cheapest model that can successfully answer it, often beating GPT-5 quality at 1/10th the cost.
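The cost-aware half of that routing decision reduces to a simple rule: among the models predicted to be capable enough for this prompt, pick the cheapest. This sketch hard-codes invented capability scores and prices; Not Diamond's real recommender learns these predictions per prompt.

```python
# Sketch of cost-aware model selection: cheapest model whose estimated
# capability covers the prompt's difficulty. All numbers are invented.
MODELS = [
    {"name": "mini", "price": 0.15, "capability": 3},
    {"name": "mid", "price": 1.00, "capability": 6},
    {"name": "frontier", "price": 10.0, "capability": 9},
]

def pick_model(difficulty: int) -> str:
    capable = [m for m in MODELS if m["capability"] >= difficulty]
    if not capable:
        return MODELS[-1]["name"]  # fall back to the strongest model
    return min(capable, key=lambda m: m["price"])["name"]

print(pick_model(2))  # mini
print(pick_model(7))  # frontier
```

Most production traffic is easy, which is why routing the bulk of it to the cheap tier can cut costs by an order of magnitude without hurting quality on hard prompts.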
OpenRouter
The Marketplace. The easiest way to access new models. It aggregates hundreds of providers, allowing you to use a single credit card and API key to access everything from Llama 4 405B to Claude Opus without managing separate accounts.
Helicone
The Observability Proxy. While it handles routing, its real strength is visibility: it provides the granular 'Cost per User' and 'Latency per Prompt' metrics that engineering managers need to optimize production apps.
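Because the proxy sees every request, a cost-per-user metric is just a roll-up over its log. This sketch shows the shape of that computation; the request records and per-token prices here are made-up numbers, not any provider's actual pricing.

```python
# Sketch of the per-user cost roll-up an observability proxy produces
# from its request log. Prices and records are illustrative.
from collections import defaultdict

PRICE_PER_1K_TOKENS = {"small-model": 0.0005, "big-model": 0.01}

requests = [
    {"user": "alice", "model": "big-model", "tokens": 2000},
    {"user": "alice", "model": "small-model", "tokens": 4000},
    {"user": "bob", "model": "small-model", "tokens": 1000},
]

cost_per_user = defaultdict(float)
for r in requests:
    cost_per_user[r["user"]] += r["tokens"] / 1000 * PRICE_PER_1K_TOKENS[r["model"]]

for user, cost in sorted(cost_per_user.items()):
    print(f"{user}: ${cost:.4f}")
```

The same log supports latency-per-prompt the same way: group by prompt template instead of user and aggregate response times instead of token costs.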
Bifrost (Maxim AI)
The Speed Demon. Written in Go, it is the fastest open-source gateway with sub-millisecond overhead. It is designed for high-frequency trading or real-time voice agents where every microsecond of latency counts.
Javelin
The Security Sentry. An enterprise gateway focused strictly on compliance. It sits between your users and the models to scrub PII, block prompt injections, and enforce data residency rules before the request leaves your VPC.
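The PII-scrubbing step can be sketched as a pre-flight rewrite of the prompt. Real gateways combine NER models and policy engines; this toy version only redacts email addresses and US-style SSNs with two regular expressions, but it shows where the scrub sits: before the text leaves your network.

```python
# Sketch of pre-flight PII scrubbing: redact sensitive patterns from
# the prompt before it is sent to an external model.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(prompt: str) -> str:
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(scrub("Contact jane.doe@example.com, SSN 123-45-6789"))
# Contact [EMAIL], SSN [SSN]
```

Prompt-injection blocking and data-residency enforcement slot into the same interception point, inspecting or rerouting the request rather than rewriting it.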
Vercel AI Gateway
The Frontend Native. Integrated directly into the Next.js ecosystem. It allows developers to stream UI components from the edge, handling complex model tool-calling logic completely server-side.