OpenRouter alternatives in 2026: pricing, routing, self-host
A working comparison of LLM gateways that compete with OpenRouter. What each one is good at, the pricing catches nobody mentions, and how to actually pick.
- comparison
- gateway
- openrouter
- tooling
OpenRouter solved a real problem in 2023: stop signing up for ten provider accounts. Hand them a credit card, get one API key, hit 400+ models. For a prototype or a side project, the breadth still wins. For a production deployment, two things have changed since then: the 5.5% credit-purchase fee compounds at scale, and the governance layer hasn't kept up with what teams need before they trust a gateway with their billing relationship.
An LLM gateway is a proxy that sits between your application and one or more model providers, handling authentication, routing, rate limits, budgets, and observability. OpenRouter is the largest managed gateway by catalog breadth, with 400+ models from 30+ providers behind a single OpenAI-compatible endpoint.
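Concretely, "one endpoint" means you point an existing OpenAI client at the gateway and swap models by changing a string. A minimal sketch using the official openai SDK against OpenRouter's documented base URL; the model ID follows its provider/model naming, reusing the model names from later in this post:

```typescript
import OpenAI from "openai";

// One key, one base URL, any model in the catalog.
const client = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: "https://openrouter.ai/api/v1",
});

const res = await client.chat.completions.create({
  // Switching providers is a one-string change: "openai/gpt-5.4-mini",
  // "anthropic/claude-sonnet-4-6", and so on.
  model: "anthropic/claude-sonnet-4-6",
  messages: [{ role: "user", content: "One sentence: what is an LLM gateway?" }],
});

console.log(res.choices[0].message.content);
```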
This post covers the gateways teams actually evaluate when they're moving past prototype. We've run real workloads through each of these. The grades are observational, not aspirational.
Why teams move off OpenRouter
The decision usually starts with one of three triggers:
The first is the bill math. At $200/month in API spend the 5.5% credit fee is $11; you don't notice. At $20,000/month it's $1,100, which is $13,200 a year, enough to fund real engineering time. The fee is a flat surcharge: no markup variance, no cache discount, no enterprise carve-out by default. If you're spending more than $5K/month on inference, the fee alone justifies a comparison.
The second is governance gaps. OpenRouter's per-key controls landed in 2024 and improved through 2025, but credits are still pooled at the account level. You can't say "this key has $200/month, no more, hard stop, return 402." You can set rate limits and per-model spend limits, but an atomic budget cap that returns a 402 when crossed isn't part of the native surface. Teams running multi-tenant SaaS care about this; teams running internal tools don't.
The third is self-host requirements. Regulated industries, EU residency, the rare team building on Cloudflare or AWS that wants the gateway in their VPC. OpenRouter is fully managed — there's no self-host option. If you need the proxy in your infrastructure, you're looking at LiteLLM, Bifrost, or rolling your own.
The seven gateways worth evaluating
LiteLLM
LiteLLM is the open-source proxy server that most self-hosted gateway deployments end up running. Python, Apache 2.0, supports 100+ providers through an OpenAI-compatible interface, and the feature surface — virtual keys with budgets, Redis caching, fallbacks, semantic similarity caching, webhook callbacks — is the deepest among the open-source options.
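Virtual keys are the feature multi-tenant teams reach for first, so here's a sketch of minting one with a hard budget against a self-hosted proxy. The endpoint and field names follow LiteLLM's key-management API; the URL, master key, and values are placeholders, and the exact surface varies by version, so check the docs for your deployment:

```typescript
// Mint a scoped virtual key on a LiteLLM proxy running at localhost:4000.
const res = await fetch("http://localhost:4000/key/generate", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.LITELLM_MASTER_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: ["gpt-5.4-mini"], // restrict which models this key can call
    max_budget: 200,          // USD cap; calls fail once it's exhausted
    budget_duration: "30d",   // budget reset window
  }),
});

const { key } = await res.json();
console.log("virtual key:", key); // hand this to the tenant, not your provider keys
```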
What it costs. The software is free. Hosting isn't. A typical production deployment runs $200-500/month in cloud infrastructure (small VM, Redis, monitoring). Engineering time to set it up is 2-5 days; ongoing ops is "occasional, when something breaks." Like OpenRouter, LiteLLM passes through provider rates with no markup of its own.
Pick LiteLLM if you need the gateway inside your VPC, you have a platform team that already runs services, and you want deep customization (custom routing logic, provider-specific tuning, cache strategies). The comparison from TrueFoundry covers the operational side well.
Skip LiteLLM if you don't have anyone who wants to be on call for it.
Portkey
Portkey is a managed gateway that goes deeper on production reliability than catalog breadth. Their config-as-data fallbacks — "if gpt-5.4 returns 5xx or P95 > 800ms, fall back to claude-sonnet-4-6" — are the most expressive in the category. Virtual keys with real per-project budgets work the way you'd expect. SOC 2 Type 2, HIPAA, EU data residency.
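To make "config-as-data" concrete, here's roughly the shape such a fallback takes, written as a TypeScript object you'd serialize into the request's config header. The field names follow Portkey's config schema, but the virtual keys are placeholders and the timeout stands in for the latency condition above, so treat this as a sketch and verify against current docs:

```typescript
// Fallback config: try OpenAI first, fail over to Anthropic on provider
// errors or slow responses.
const fallbackConfig = {
  strategy: {
    mode: "fallback",
    on_status_codes: [429, 500, 502, 503], // which failures trigger the fallback
  },
  targets: [
    {
      virtual_key: "openai-prod", // primary: gpt-5.4
      request_timeout: 800,       // ms; approximates the "P95 > 800ms" rule
    },
    {
      virtual_key: "anthropic-prod", // fallback target
      override_params: { model: "claude-sonnet-4-6" },
    },
  ],
};
```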
What it costs. Free tier with caps. Paid tiers from low double-digits per month, scaling with volume. Enterprise pricing is custom but not exotic.
Pick Portkey if failover is the load-bearing requirement. The config language is worth learning if your reliability story can't tolerate single-provider outages.
Skip Portkey if you mostly need observability — Helicone goes deeper there — or breadth, where OpenRouter still wins.
Helicone
Helicone is primarily an observability tool that happens to be a gateway. Logs every request. Their session-tracing UI is genuinely good for debugging multi-turn agent flows. Prompt versioning is built in, which solves a real problem most teams kludge with git diffs.
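Session tracing works by tagging requests with shared headers, so multi-turn flows group into one trace in the UI. A sketch using Helicone's OpenAI passthrough; the header names follow Helicone's documented conventions, while the session path is an invented example:

```typescript
import OpenAI from "openai";
import { randomUUID } from "node:crypto";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY, // your provider key still does the billing
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    "Helicone-Session-Id": randomUUID(),            // one ID per conversation
    "Helicone-Session-Path": "/support-bot/triage", // position within the flow
  },
});

// Every call made through this client now lands in the same session trace.
```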
What it costs. Free tier with significant log volume. Paid tiers based on log volume + retention. Open-source self-host option.
Pick Helicone if your bottleneck is "we don't know why the bot said that." The tracing and prompt-version comparison are differentiators.
Skip Helicone if budgets and routing are the priority — those exist but aren't the focus.
Cloudflare AI Gateway
Cloudflare AI Gateway is what to use if you're already on Workers. Built into the Cloudflare control plane. Logs, caching, rate limiting, and bring-your-own-key for any provider that has an OpenAI-compatible endpoint.
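The integration really is just a URL swap: keep your provider key, change the base URL. The URL shape below follows Cloudflare's docs; the account ID and gateway name are placeholders:

```typescript
import OpenAI from "openai";

const ACCOUNT_ID = "your-account-id"; // from the Cloudflare dashboard
const GATEWAY = "your-gateway-name";  // the AI Gateway you created

const client = new OpenAI({
  // Bring-your-own-key: billing stays with OpenAI; Cloudflare proxies,
  // logs, and caches.
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: `https://gateway.ai.cloudflare.com/v1/${ACCOUNT_ID}/${GATEWAY}/openai`,
});
```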
What it costs. Free at modest volumes (10K requests/day on the free tier as of early 2026). Paid usage stacks onto the standard Cloudflare plan, which most teams using Workers already have.
Pick Cloudflare AI Gateway if your stack is already on Cloudflare. Zero new vendor relationship.
Skip it if you're not on Cloudflare. The integration is the whole pitch.
Eden AI
Eden AI is the breadth answer for non-LLM AI services: 500+ models spanning not just text generation but OCR, speech-to-text, and object detection. If your product touches multiple AI modalities, having one bill for "all the AI work" makes sense.
What it costs. Per-call pricing per provider, with Eden's markup added. The markup varies by model and is documented in their pricing dashboard. Worth the math vs going direct for each one.
Pick Eden AI if you're using multiple modalities and value the unified billing more than the marginal cost.
Skip it if you're LLM-only — the per-LLM price isn't competitive with OpenRouter.
Bifrost (open-source, by Maxim AI)
Bifrost is a newer open-source gateway that benchmarks at very low overhead: Maxim publishes 11 microseconds of added latency at 5,000 RPS, orders of magnitude below the millisecond-range overhead typical of the category. Self-host, governance, observability, and routing come in one package. It's newer than the established options, so the ecosystem is thinner.
Pick Bifrost if raw latency is the gating concern and you're comfortable adopting a younger project.
Skip Bifrost if you need integrations the ecosystem doesn't cover yet, or if "production-tested by lots of other companies" is part of your evaluation criteria.
Synapse Garden (this is us)
We're a managed gateway with hard per-project budget caps that return HTTP 402 atomically when crossed, OpenAI- and Anthropic-compatible APIs, and a flat 10% markup over passthrough provider cost. We're built on Vercel AI Gateway as our routing layer, which gives us roughly 100 models: narrower than OpenRouter's catalog, broader than wiring up each provider directly.
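Here's what the hard cap looks like from the client side, as a hedged sketch: the base URL is a placeholder, and the degraded path is whatever makes sense for your product. The point is that the 402 arrives the moment the cap is crossed, so you can fail a single tenant closed instead of eating the overage:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SYNAPSE_PROJECT_KEY,   // per-project key with a hard cap
  baseURL: "https://gateway.example.com/v1", // placeholder, not our real endpoint
});

try {
  const res = await client.chat.completions.create({
    model: "gpt-5.4-mini",
    messages: [{ role: "user", content: "Summarize this ticket." }],
  });
  console.log(res.choices[0].message.content);
} catch (err) {
  if (err instanceof OpenAI.APIError && err.status === 402) {
    // Budget exhausted: one tenant stops, everyone else keeps running.
    console.warn("project budget cap hit; queueing request for next cycle");
  } else {
    throw err;
  }
}
```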
What it costs. Free tier with 1M tokens/month, no card. Paid plans from $10/month. The 10% markup is consistent across models and stages — no markup-on-cache-hits, no surprise fees. The math lives publicly on /legal/pricing-disclosure.
Pick us if per-project governance and predictable pricing are the primary concerns, and you don't need a 400+ model catalog.
Skip us if you do need the long tail (specific open-source fine-tunes, rare regional deployments) — OpenRouter still wins on breadth.
The honest comparison matrix
We tested each gateway in the matrix below against the same gpt-5.4-mini and claude-sonnet-4-6 workload at 100 RPS over a one-week window. Grades reflect observed behavior on that workload, not feature lists.
| Concern | OpenRouter | LiteLLM | Portkey | Helicone | Cloudflare | Synapse Garden |
|---|---|---|---|---|---|---|
| Catalog breadth | A+ | A | B+ | B+ | B | B+ |
| Per-project hard budgets | C | A | A | B+ | B | A |
| Failover / routing rules | B | A | A+ | B | B | B |
| Observability depth | C | B | B | A+ | B | B |
| Pricing transparency | B (5.5% fee) | A | B | B+ | A | A (flat 10%) |
| Self-host option | no | yes | yes (paid) | yes | no | no |
| Open-source license | none | Apache 2.0 | partial | MIT | none | none |
| Gateway overhead (P50) | ~10ms | depends on host | ~12ms | ~14ms | ~5ms (edge) | ~15ms |
| Time to first request | 2 min | 2 days | 5 min | 5 min | 10 min | 3 min |
These grades are observational. A team running highly regulated workloads will weight self-host and EU residency higher; a startup will weight time-to-first-request. Decide your weights before you read the matrix.
How to choose without a spreadsheet
The fastest decision tree we've seen work:
You're prototyping, no real users yet. Stay on OpenRouter. The breadth and the free models are unmatched for trying things. Move when the bill becomes a line item someone asks about.
You're past prototype, single-tenant. LiteLLM if you have ops; Portkey or us if you don't. The choice between Portkey and us comes down to whether failover routing rules or atomic per-project budgets is the bigger gap in your current setup.
You're past prototype, multi-tenant SaaS. Per-project keys with hard budgets become non-negotiable — without them, one customer's runaway loop is everyone's problem. Portkey, us, or self-hosted LiteLLM with the virtual-keys feature.
You're regulated (HIPAA, finance, EU residency). Self-host LiteLLM or Bifrost; or Portkey enterprise tier. Synapse Garden is in us-east-1 only as of May 2026 — we won't be the right answer for EU residency yet.
You're already on Cloudflare. Try Cloudflare AI Gateway first. It's free at small scale and the integration cost is zero.
You're spending $20K+/month and want margin back. Self-host LiteLLM. The 5.5% OpenRouter fee or a 10% gateway markup scales with every dollar of spend; LiteLLM passes through at provider rates and your only cost is hosting, a fixed $200-500/month at most volumes. The break-even arithmetic is sketched below.
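The break-even is one line of arithmetic. A sketch with the numbers from this post; the hosting figure is the high end of the $200-500/month range quoted in the LiteLLM section, and it deliberately ignores setup and ops time:

```typescript
const monthlySpend = 20_000; // USD of inference at provider rates

const openRouterFee = monthlySpend * 0.055; // $1,100/month surcharge
const gatewayMarkup = monthlySpend * 0.10;  // $2,000/month on a flat 10% markup
const selfHostCost = 500;                   // fixed LiteLLM infra, high end

// Self-hosting pays back once the percentage clears the fixed cost:
//   0.055 * spend > 500  =>  spend > ~$9,100/month vs the 5.5% fee
//   0.10  * spend > 500  =>  spend > $5,000/month vs a 10% markup
console.log({ openRouterFee, gatewayMarkup, selfHostCost });
```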
What you can layer
Most production stacks end up with two of these, not one:
- OpenRouter + Helicone. Breadth + observability. Governance is on you.
- Synapse Garden + Helicone. Governance + observability. This is what most of our customers run.
- LiteLLM + Helicone. Self-hosted everything.
- Portkey + LangSmith. Reliability + evals. Common at teams doing serious prompt engineering.
Don't layer three. Each layer adds a place where headers can be dropped, latency compounds, and rate limits multiply. Two is the right number.
What we'd actually pick today
If we were starting from zero with a small team in May 2026:
- First week: OpenRouter. Get something working. The free models hide the bill question while you figure out the product.
- First paying customers: Synapse Garden (us) or Portkey for governance, layer Helicone on top if you find yourself debugging prompts often.
- >$10K/month spend: time to do the LiteLLM math. Self-hosting starts to pay back.
- Compliance constraints: Portkey enterprise, or self-hosted LiteLLM/Bifrost. Don't pick a managed gateway that can't sign your DPA.
We're not the right answer for every team. The right gateway is the one that solves the bottleneck you have now, not the one with the longest feature list.
If you want to dig deeper into the architecture side, the per-project API keys post covers what governance actually buys you, and the 50ms proxy overhead benchmark covers the latency math regardless of which gateway you pick.