#0001
`TokenSpeed` makes LLM output speed visible
40radar
TokenSpeed: HTML demo app that visualizes LLM token output speed
Speed claims become easier to judge when they animate like real model output. Useful for product demos and latency expectations, not model ranking.
- Simulates 5 to 800 tokens/second, enough to compare slow local runs, normal API streams, and fast batch-like output.
- A quoted `30 tokens/second` number is hard to feel in specs; visual playback makes perceived wait time obvious.
- The app is plain HTML with source available, so the idea is easy to embed in docs, sales pages, or model-picker tools.
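
To make the arithmetic behind those rates concrete, here is a minimal sketch of how a quoted tokens/second figure turns into wait time. It is not the app's source: the function names (`clamp_rate`, `total_wait`, `play`) and the clamping behavior are assumptions made for illustration, and only the 5 to 800 range comes from the summary above.

```python
import time
from typing import Callable, Iterable, Iterator

# Range quoted in the summary; everything else here is a hypothetical sketch.
MIN_RATE = 5.0    # tokens/second
MAX_RATE = 800.0  # tokens/second


def clamp_rate(rate: float) -> float:
    """Keep a requested rate inside the 5 to 800 tokens/second range."""
    return max(MIN_RATE, min(MAX_RATE, rate))


def total_wait(n_tokens: int, rate: float) -> float:
    """Seconds needed to stream n_tokens at the given (clamped) rate."""
    return n_tokens / clamp_rate(rate)


def play(
    tokens: Iterable[str],
    rate: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield tokens one at a time, pausing 1/rate seconds before each."""
    delay = 1.0 / clamp_rate(rate)
    for token in tokens:
        sleep(delay)  # injectable so the pacing can be checked without waiting
        yield token
```

Under these assumptions, a 300-token answer takes `total_wait(300, 5)` = 60 seconds at the slow end, 10 seconds at 30 tokens/second, and 0.375 seconds at 800 tokens/second, which is the gap the visual playback is meant to make felt rather than read.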
Source: simonwillison.net/2026/May/20/tokens-per-second/#atom-ev
