Telexed

telexed ~ home★4 and up · hourly · UTC+09LIVE

TELEXED// solo-operator signal radar · Issue 412

AI news through a solo-operator lens — only what changes your day3 of 412

FILTER[All][Agents & tools][Models & API][Generative media][Infra & SaaS][ASO & growth][Indie business][Idea signals][Other][★6+ high-signal]

r/LocalLLaMA ✕clear filters

Mon, May 181 dispatches

#0412
#0412Agents & tools r/LocalLLaMAyesterday
`SmallCode` hits 87/100 coding-agent tasks with an active 4B model
50radar
SmallCodeLocal coding agent — compound tools for small models
Reliability comes from the harness, not raw model size. The benchmark is self-reported, but the agent patterns are immediately reusable for local-first coding tools.
- Compound tools collapse search-read-edit-verify into one call, cutting the multi-step drift that breaks small models after 3+ tool calls.
- The fix loop runs compile/lint immediately after edits and feeds errors back, so the model only needs to repair concrete failures.
- On repeated failure, tasks shrink from broad file edits to line-level fixes; that is a practical recipe for weaker local models.
- Cloud escalation is scoped to the stuck task when an OpenAI or Claude key exists, keeping most work local without hard failure.
Source: www.reddit.com/r/LocalLLaMA/comments/1tgecrq/i_built_a_cRead original →
FIG-4121:1
50radar
FIG-4121:1

Sat, May 161 dispatches

#0411
#0411Agents & tools r/LocalLLaMA3 days ago
`Qwen3.6-35B-A3B` reaches **24.6%** on `Terminal-Bench 2.0`
50radar
Qwen3.6Open LLM model — listed on a terminal-agent benchmark
A smaller open model stack beat several larger agent setups on a hard terminal benchmark. Worth testing for local coding-agent loops, but still benchmark-first evidence.
- little-coder x Qwen3.6-35B-A3B scored 24.6% ±3.2, above Gemini 2.5 Pro on Gemini CLI at 19.6%.
- It also edged Qwen3-Coder-480B on Terminus 2 at 23.9%, showing scaffold choice can outweigh raw model scale.
- Qwen3.5-9B reached 9.2%; sub-10B local models now have measurable, nonzero performance on hard agentic tasks.
- This is still a leaderboard signal, not production proof. Try it on repo-specific tasks before replacing API-backed agents.
Source: www.reddit.com/r/LocalLLaMA/comments/1temio0/qwen3635ba3Read original →
50radar
PHOTO
FIG-4111:1

Wed, May 61 dispatches

#0410
#0410Agents & tools r/LocalLLaMA2 weeks ago
`llama.cpp` MTP makes `Qwen 3.6 27B` far more usable for local coding agents
50radar
llama.cppInference engine — lightweight local LLM serving
A custom llama.cpp build stacks MTP, turbo4 KV cache, and 262K context on 48GB Macs. Still a manual setup, but local agentic coding just moved from hobbyist tweak to viable option.
- --spec-type mtp --spec-draft-n-max 5 delivered 2.5x faster generation, reaching 28 tok/s on an M2 Max 96GB.
- turbo4 KV cache cuts KV memory to roughly one quarter, which is the real unlock for long-context local use.
- A 262K context window reportedly fits on 48GB Apple Silicon with Q5_K_M plus turbo4, making repo-scale sessions more realistic.
- The package also ships fixed chat templates and llama-server OpenAI/Anthropic-compatible endpoints, so existing agent stacks need less glue code.
Source: www.reddit.com/r/LocalLLaMA/comments/1t57xuu/25x_faster_Read original →
50radar
PHOTO
FIG-4101:1