telexed ~ c / c7df03c2-d4fradar:50 · agent_toolLIVE
← back
NO.
#c7df03c2
Topic
AGENTS & TOOLS
Source
r/ClaudeAI
Published
2026-05-04 11:12:10
Importance
★ 5/10 — radar 50

Route Claude's Mechanical Work to a Cheap Side Model

Most spend came from low-value bulk chores, not reasoning-heavy work. A deny-list in CLAUDE.md plus a cheap OpenAI-compatible worker cut that layer to $0.41 for 217 calls; easy win if you review outputs anyway.

[ KEY POINTS ]
  1. Over 3 weeks, 217 formatting, extraction, classification, and skim-summary calls were offloaded for $0.41 instead of roughly $7 on Sonnet.
  2. The routing rule worked best as a deny list in CLAUDE.md: do not use Claude for JSON formatting, field extraction, file classification, or summaries you'll review anyway.
  3. This is a supervised worker, not an agent: no tool use, no file access, no chains, with 3-25s latency and human review on every output.
  4. The tool is just text in, text out; default is DeepSeek V4 Flash, but any OpenAI-compatible endpoint works, including ollama, vllm, and LM Studio.
Originalwww.reddit.com/r/ClaudeAI/comments/1t3elab/most_of_my_claude_usage_was_on_work_that_didnt/Read original →

// related