No.: #e01f09bb
Topic: IDEA SIGNALS
Source: Simon Willison
Published: 2026-05-20 17:57:45
Importance: ★ 4/10, radar 40

`TokenSpeed` makes LLM output speed visible

Tokens-per-second claims are easier to judge when you can watch text stream at that rate, the way real model output does. Useful for product demos and for setting latency expectations, not for ranking models.

[ KEY POINTS ]
  1. Simulates 5 to 800 tokens/second, enough to compare slow local runs, normal API streams, and fast batch-like output.
  2. A quoted 30 tokens/second figure is hard to gauge from a spec sheet; visual playback makes the perceived wait time obvious.
  3. The app is plain HTML with source available, so the idea is easy to embed in docs, sales pages, or model-picker tools.
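
The arithmetic behind these points can be sketched in a few lines. This is a minimal illustration, not the TokenSpeed source: the function names `seconds_to_stream` and `play` are hypothetical, and a "token" is approximated as a whitespace-separated word, which a real tokenizer would not match exactly. At a steady 30 tokens/second, a 300-token answer takes 300 / 30 = 10 seconds to finish streaming.

```python
import sys
import time


def seconds_to_stream(token_count, tokens_per_second):
    """Return how long token_count tokens take to stream at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return token_count / tokens_per_second


def play(text, tokens_per_second, sleep=time.sleep, out=None):
    """Write text one pseudo-token at a time, pausing after each one.

    Assumption: a "token" is a whitespace-separated word. Real tokenizers
    split text differently, so this only approximates the pacing.
    Returns the number of pseudo-tokens written.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    out = sys.stdout if out is None else out
    delay = 1.0 / tokens_per_second  # seconds to wait per pseudo-token
    tokens = text.split()
    for token in tokens:
        out.write(token + " ")
        out.flush()  # show each token immediately instead of buffering
        sleep(delay)
    return len(tokens)
```

Calling `play(sample_text, 5)` and then `play(sample_text, 800)` on the same passage reproduces the two ends of the range mentioned above. The `sleep` and `out` parameters are injectable only so the pacing can be checked without actually waiting.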
Original: simonwillison.net/2026/May/20/tokens-per-second/#atom-everything
