`AI Gateway` adds request-time provider ranking controls
Routing can now optimize for price, first-token latency, or throughput at request time instead of using Vercel's blended default. This is useful when one model has many providers and the cheapest or fastest route materially changes margin or UX.
- Set `sort` on `providerOptions.gateway` to `'cost'`, `'ttft'`, or `'tps'` depending on whether margin, snappiness, or long-output speed matters most.
- Ranking is computed at request time, so newly added providers, price changes, and observed latency shifts flow through without code changes.
- Fallback is strict: providers are attempted in sorted order, and the next one is used only if the higher-ranked provider is unavailable.
- `sort` works with Zero Data Retention filtering and with `order`; pinned providers stay first, then the rest follow the chosen ranking.
- Each response exposes routing metadata with a `sort` block showing candidates, metric values, attempt order, and health-based deprioritization for debugging.
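
The ordering rules above can be modeled locally to reason about which provider a request would land on. The following is a minimal sketch, not the gateway's implementation: the `ProviderStats` fields, the `rankProviders` and `pickProvider` helpers, and the sample providers are all hypothetical. Only the `sort` values and the `order` option come from the notes above. It assumes `cost` and `ttft` rank ascending and `tps` ranks descending.

```typescript
// Sketch only: a local model of the ordering rules described above,
// not the gateway's actual ranking code.
type SortMetric = 'cost' | 'ttft' | 'tps';

// Hypothetical per-provider stats; field names are illustrative.
interface ProviderStats {
  name: string;
  cost: number; // price, lower is better
  ttft: number; // time to first token in ms, lower is better
  tps: number; // tokens per second, higher is better
  available: boolean;
}

// Request options as described in the list above.
interface GatewayOptions {
  sort?: SortMetric;
  order?: string[]; // pinned providers, tried first in this order
}

// Pinned providers come first (in `order`), then the rest ranked by `sort`.
function rankProviders(providers: ProviderStats[], opts: GatewayOptions): string[] {
  const pinned = (opts.order ?? []).filter((n) => providers.some((p) => p.name === n));
  const rest = providers.filter((p) => !pinned.includes(p.name));
  const metric = opts.sort;
  if (metric) {
    // Assumption: tps ranks descending; cost and ttft rank ascending.
    const sign = metric === 'tps' ? -1 : 1;
    rest.sort((a, b) => sign * (a[metric] - b[metric]));
  }
  return [...pinned, ...rest.map((p) => p.name)];
}

// Strict fallback: walk the ranked list and take the first available provider.
function pickProvider(providers: ProviderStats[], opts: GatewayOptions): string | undefined {
  const byName = new Map(providers.map((p) => [p.name, p]));
  return rankProviders(providers, opts).find((n) => byName.get(n)?.available);
}

// Made-up sample data for illustration.
const providers: ProviderStats[] = [
  { name: 'alpha', cost: 3.0, ttft: 200, tps: 80, available: true },
  { name: 'beta', cost: 1.5, ttft: 450, tps: 60, available: false },
  { name: 'gamma', cost: 2.0, ttft: 300, tps: 120, available: true },
];

const gateway: GatewayOptions = { sort: 'cost' };
console.log(rankProviders(providers, gateway)); // attempt order by cost
console.log(pickProvider(providers, gateway)); // first available provider in that order
```

In a real request the `gateway` object would be passed as `providerOptions: { gateway }`, and the ranking happens on the gateway side. The local model is only useful for checking expectations against the routing metadata returned with each response.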