No.: #61cc373b
Topic: AGENTS & TOOLS
Source: Hacker News · MRR
Published: 2026-05-17 15:37:07
Importance: ★ 8/10, radar 80
`Semble`, token-light code search for agents using **98% fewer tokens** than grep

CPU-only hybrid retrieval gives coding agents a cheaper first pass before grep and full-file reads. Worth testing on large repos where context waste slows every task.

[ KEY POINTS ]
  1. Combines potion-code-16M static embeddings with BM25, RRF, and code-aware reranking; no API key, GPU, or external service required.
  2. Benchmarked on about 1,250 query/document pairs across 63 repos and 19 languages; reported NDCG@10 is 0.854.
  3. Indexing is about 250ms for a typical benchmark repo and queries run around 1.5ms on CPU; very large repos may take longer.
  4. Ships an MCP server for Claude Code, Cursor, Codex, and OpenCode, so it can slot into existing agent workflows.
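
For readers unfamiliar with the RRF step named in key point 1, here is a minimal sketch of reciprocal rank fusion, the general technique for merging a lexical (BM25) ranking with an embedding ranking. It is not Semble's code: the function name, the file names, and the constant `k=60` (the value conventionally used for RRF) are illustrative assumptions, and the project's own fusion and reranking logic may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked highly by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from two retrievers for the same query.
bm25_hits = ["auth.py", "session.py", "utils.py"]
embedding_hits = ["session.py", "tokens.py", "auth.py"]

fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
print(fused)
```

Because RRF uses only rank positions, it needs no score normalization between BM25 and embedding similarity, which is one reason it suits a CPU-only pipeline.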
Original: github.com/MinishLab/semble
