Provider routing
Try a preferred provider first. Set fallback chains. Sort by cost, latency, or throughput. All per-request.
Some models are hosted by multiple providers: Claude lives on Anthropic, AWS Bedrock, and Google Vertex; Llama lives on Meta direct, Together, Fireworks, and a dozen others. Provider routing lets you control which provider serves your request, in what order, and what happens if it fails.
In the AI SDK, every routing option is a key on providerOptions.gateway for streamText and friends. With the OpenAI Chat Completions / Anthropic Messages APIs, the same options go in the request body.
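For example, a minimal sketch of passing a routing option through an OpenAI-compatible Chat Completions request. The base URL and the exact placement of the routing fields in the body are assumptions for illustration, not taken from this page:

```ts
// Hypothetical sketch: routing options in an OpenAI-compatible request body.
// The base URL and the top-level placement of `order` are assumptions.
const res = await fetch("https://gateway.example.com/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.GATEWAY_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "anthropic/claude-opus-4.6",
    messages: [{ role: "user", content: "Hello" }],
    order: ["bedrock", "anthropic"], // same meaning as providerOptions.gateway.order
  }),
})
```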
Try a preferred provider first
Use order to specify the sequence to attempt:
```ts
import { streamText } from "ai"

const result = streamText({
  model: "anthropic/claude-opus-4.6",
  prompt,
  providerOptions: {
    gateway: { order: ["bedrock", "anthropic"] },
  },
})
```

This tries Bedrock first, falls back to Anthropic direct if Bedrock errors or is slow. Other providers (e.g. Vertex) are still available, but only after the listed ones.
Restrict to a subset
Use only to restrict the request to specific providers:
```ts
providerOptions: {
  gateway: { only: ["bedrock", "anthropic"] }, // never use Vertex
}
```

If none of the listed providers are available for the model, the request fails immediately with MODEL_NOT_AVAILABLE_FROM_LISTED_PROVIDERS.
If your project has a model allowlist (Dashboard → Project → Models), the request's only is intersected with the allowlist. Anything not in both is dropped.
Sort by metric
Rank providers by performance or cost:
```ts
providerOptions: {
  gateway: {
    sort: "cost", // 'cost' | 'ttft' (latency) | 'tps' (throughput)
  },
}
```

| Value | Meaning | Direction |
|---|---|---|
| "cost" | Estimated cost per request | Cheapest first |
| "ttft" | Time to first token (median, ms) | Fastest first |
| "tps" | Tokens per second (median) | Highest first |
Combine order, only, and sort
```ts
providerOptions: {
  gateway: {
    only: ["anthropic", "bedrock", "vertex"], // restrict to these three
    sort: "tps", // among them, fastest first
    order: ["anthropic"], // anthropic always tried first
  },
}
```

Resolution order: providers not in only are dropped, the providers listed in order are tried first in the listed order, and the remaining providers follow sorted by sort.
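As a conceptual illustration of that resolution (the function and the candidate shape below are assumptions made for the sketch, not the gateway's actual implementation):

```ts
type SortKey = "cost" | "ttft" | "tps"

interface Candidate {
  provider: string
  cost: number // estimated cost per request
  ttft: number // median time to first token, ms
  tps: number // median tokens per second
}

// Illustrative only: how only / order / sort combine into a provider try-order.
function resolveTryOrder(
  candidates: Candidate[],
  opts: { only?: string[]; order?: string[]; sort?: SortKey },
): string[] {
  // 1. Drop anything not listed in `only` (when given).
  const pool = opts.only
    ? candidates.filter((c) => opts.only!.includes(c.provider))
    : [...candidates]

  // 2. Pull the providers named in `order` to the front, in that order.
  const pinned = (opts.order ?? []).filter((p) =>
    pool.some((c) => c.provider === p),
  )
  const rest = pool.filter((c) => !pinned.includes(c.provider))

  // 3. Sort the remainder by the requested metric.
  if (opts.sort === "cost") rest.sort((a, b) => a.cost - b.cost) // cheapest first
  if (opts.sort === "ttft") rest.sort((a, b) => a.ttft - b.ttft) // fastest first
  if (opts.sort === "tps") rest.sort((a, b) => b.tps - a.tps) // highest first

  return [...pinned, ...rest.map((c) => c.provider)]
}
```

With the options above, anthropic would always come first, followed by bedrock and vertex in descending tokens-per-second order.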
Model fallbacks
If your primary model is unavailable, fall back to a backup model — not just a backup provider:
```ts
streamText({
  model: "openai/gpt-5.4", // primary
  providerOptions: {
    gateway: {
      models: [
        "anthropic/claude-opus-4.6", // try this if primary fails
        "google/gemini-3.1-pro-preview", // and this if both above fail
      ],
    },
  },
})
```

The response always tells you which model actually answered (in providerMetadata.gateway.routing.modelAttempts).
Each model in the chain is a fresh round-trip. By default we cap at 3 model attempts per request so a long fallback chain doesn't blow up your hot-path latency. The cap is configurable per project under Settings → Routing.
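To check which model in the chain actually answered, read the attempt history off the response metadata. A small sketch, assuming the metadata shape shown in the inspection section below:

```ts
// Sketch: find the model that actually answered, assuming the
// providerMetadata.gateway.routing.modelAttempts shape shown below.
const metadata = await result.providerMetadata
const routing = (metadata?.gateway as any)?.routing
const answered = routing?.modelAttempts?.find((a: any) => a.success)
console.log("answered by:", answered?.modelId)
```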
Provider timeouts (BYOK only)
When you bring your own provider keys (BYOK, coming in Phase 6), set per-provider timeouts to trigger fast failover:
```ts
providerOptions: {
  gateway: {
    providerTimeouts: {
      byok: { openai: 10000 }, // 10s — fail over after that
    },
  },
}
```

Range: 1,000–789,000 ms (~13 min). Once the first token arrives, the timeout is cleared.
Inspecting what actually ran
The response includes a providerMetadata.gateway object with the full attempt history:
```ts
const result = streamText({ ... })
console.log(JSON.stringify(await result.providerMetadata, null, 2))
```

```json
{
  "gateway": {
    "routing": {
      "originalModelId": "anthropic/claude-opus-4.6",
      "resolvedProvider": "bedrock",
      "fallbacksAvailable": ["anthropic", "vertex"],
      "modelAttempts": [
        {
          "modelId": "anthropic:claude-opus-4-6",
          "success": true,
          "providerAttempts": [
            { "provider": "bedrock", "success": true, "responseTimeMs": 4521 }
          ]
        }
      ]
    },
    "cost": "0.0045405",
    "generationId": "gen_01A2B3C4D5"
  }
}
```

Useful for debugging "why did this take 4 seconds?", since every attempted provider is logged with its response time.
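For a quick latency breakdown, walk the attempt history. A small sketch, assuming the metadata shape above:

```ts
// Sketch: print every provider attempt with its response time, assuming the
// providerMetadata.gateway.routing shape shown above.
const meta = await result.providerMetadata
const routing = (meta?.gateway as any)?.routing
for (const modelAttempt of routing?.modelAttempts ?? []) {
  for (const attempt of modelAttempt.providerAttempts ?? []) {
    console.log(
      `${modelAttempt.modelId} via ${attempt.provider}: ${attempt.responseTimeMs} ms ` +
        `(${attempt.success ? "ok" : "failed"})`,
    )
  }
}
```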
Default behavior
If you don't specify any routing options, Synapse Garden picks providers based on recent uptime and latency, with a small bias toward cost. This is usually what you want for production — set explicit options only when you have a reason to.