No.: #2d3a0f0f
Topic: INFRA & SAAS
Source: vercel_blog
Published: 2026-05-15
Importance: ★ 8/10 (radar 80)
`AI Gateway` adds request-time provider ranking controls

Routing can now optimize for price, first-token latency, or throughput at request time instead of using Vercel's blended default ranking. This is useful when one model is served by many providers and choosing the cheapest or fastest route materially changes margin or UX.
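As a minimal sketch of what choosing a ranking per request might look like, the helper below builds the options object that carries the `sort` setting. The `sort` key under `providerOptions.gateway` and its three values come from the changelog summary; the names `GatewaySort` and `gatewayOptions` are hypothetical and exist only for illustration.

```typescript
// Sketch only: `GatewaySort` and `gatewayOptions` are hypothetical helper names.
// The `sort` key and its three values are taken from the changelog summary.
type GatewaySort = 'cost' | 'ttft' | 'tps';

function gatewayOptions(sort: GatewaySort): { gateway: { sort: GatewaySort } } {
  return { gateway: { sort } };
}

// Intended to be passed as `providerOptions` on an AI SDK call, for example
// generateText({ model, prompt, providerOptions: gatewayOptions('cost') }).
const cheapest = gatewayOptions('cost');
console.log(JSON.stringify(cheapest));
```

Keeping the choice in one small helper makes it easy to switch strategy per call site, for instance `'ttft'` for chat and `'cost'` for batch jobs.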

[ KEY POINTS ]
  1. Set sort on providerOptions.gateway to 'cost', 'ttft' (time to first token), or 'tps' (tokens per second) depending on whether margin, snappiness, or long-output speed matters most.
  2. Ranking is computed at request time, so newly added providers, price changes, and observed latency shifts flow through without code changes.
  3. Fallback is strict: providers are attempted in sorted order, and the next one is used only if the higher-ranked provider is unavailable.
  4. sort combines with Zero Data Retention filtering and with the order option; pinned providers stay first, then the rest follow the chosen ranking.
  5. Each response exposes routing metadata with a sort block showing candidates, metric values, attempt order, and health-based deprioritization for debugging.
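The ordering and fallback behaviour described in the key points can be illustrated with a small local simulation. This is not the gateway's implementation: the `Candidate` shape, the metric values, and the `rank` and `route` functions are all hypothetical, and health-based deprioritization is not modelled.

```typescript
// Illustrative simulation only; every name and number here is hypothetical.
type Sort = 'cost' | 'ttft' | 'tps';

interface Candidate {
  name: string;
  cost: number;       // price per million tokens; lower is better
  ttft: number;       // time to first token in ms; lower is better
  tps: number;        // tokens per second; higher is better
  available: boolean;
}

// Pinned providers (the `order` list) go first in the caller's order,
// then the remaining candidates follow the chosen ranking.
function rank(candidates: Candidate[], sort: Sort, order: string[] = []): Candidate[] {
  const pinned = order
    .map((name) => candidates.find((c) => c.name === name))
    .filter((c): c is Candidate => c !== undefined);
  const rest = candidates
    .filter((c) => !order.includes(c.name))
    .sort((a, b) => (sort === 'tps' ? b.tps - a.tps : a[sort] - b[sort]));
  return [...pinned, ...rest];
}

// Strict fallback: walk the ranked list and take the first available provider.
function route(ranked: Candidate[]): string | undefined {
  return ranked.find((c) => c.available)?.name;
}

const providers: Candidate[] = [
  { name: 'alpha', cost: 3, ttft: 400, tps: 80, available: true },
  { name: 'beta', cost: 1, ttft: 900, tps: 50, available: false },
  { name: 'gamma', cost: 2, ttft: 250, tps: 120, available: true },
];

const byCost = rank(providers, 'cost');
console.log(byCost.map((c) => c.name).join(','));   // beta,gamma,alpha
console.log(route(byCost));                         // gamma (beta is unavailable)
console.log(route(rank(providers, 'cost', ['alpha']))); // alpha (pinned first)
```

In this toy data the cheapest provider is down, so the request falls through to the next cheapest one, while pinning `alpha` overrides the cost ranking for the first attempt.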
Original: vercel.com/changelog/sort-providers-by-cost-latency-or-throughput-on-ai-gateway
