`SmallCode` hits 87/100 coding-agent tasks with a 4B-active-parameter model

Reliability comes from the harness, not raw model size. The benchmark is self-reported, but the agent patterns are immediately reusable for local-first coding tools.

[ KEY POINTS ]

- Compound tools collapse search-read-edit-verify into one call, cutting the multi-step drift that breaks small models after 3+ tool calls.
- The fix loop runs compile/lint immediately after each edit and feeds the errors back, so the model only has to repair concrete failures.
- On repeated failure, the task shrinks from a broad file edit to a line-level fix, a practical recipe for weaker local models.
- When an OpenAI or Claude API key is configured, cloud escalation is limited to the stuck task, so most work stays local and a stuck task does not become a hard failure.
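The patterns above can be sketched in a few lines. This is a minimal illustration, not SmallCode's actual code: `run_fix_loop`, `Attempt`, `check_python`, `flaky_model`, and the parameters `max_local_attempts` and `shrink_after` are all hypothetical names, and Python's built-in `compile()` stands in for whatever compile/lint step the real harness runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    scope: str         # "file" (broad edit) or "line" (narrow fix)
    backend: str       # "local" or "cloud"
    errors: List[str]  # checker output after this attempt


def check_python(source: str) -> List[str]:
    """Stand-in verifier: compile the patched source and report syntax errors."""
    try:
        compile(source, "<patch>", "exec")
        return []
    except SyntaxError as exc:
        return [f"line {exc.lineno}: {exc.msg}"]


def run_fix_loop(
    task: str,
    propose: Callable[[str, str, List[str]], str],  # local model: (task, scope, errors) -> patch
    check: Callable[[str], List[str]],              # verifier: patch -> list of errors
    escalate: Optional[Callable[[str, List[str]], str]] = None,  # None when no cloud key is set
    max_local_attempts: int = 4,  # assumed value, not from the source
    shrink_after: int = 2,        # assumed value, not from the source
) -> Tuple[bool, List[Attempt]]:
    history: List[Attempt] = []
    errors: List[str] = []
    for attempt in range(max_local_attempts):
        # After repeated failures, narrow the request from a file edit to a line fix.
        scope = "file" if attempt < shrink_after else "line"
        patch = propose(task, scope, errors)
        # Edit and verify happen in the same harness step, so errors return immediately.
        errors = check(patch)
        history.append(Attempt(scope, "local", errors))
        if not errors:
            return True, history
    if escalate is None:
        return False, history
    # Escalation covers only this stuck task, with the last concrete errors attached.
    errors = check(escalate(task, errors))
    history.append(Attempt("line", "cloud", errors))
    return not errors, history


def flaky_model(task: str, scope: str, errors: List[str]) -> str:
    # Hypothetical local model: the first draft is broken, the repair succeeds
    # once the checker's errors are fed back.
    return "total = sum([1, 2, 3])\n" if errors else "total = sum([1, 2, 3]\n"


ok, history = run_fix_loop("add total", flaky_model, check_python)
print(ok, [(a.scope, a.backend) for a in history])
```

Passing `escalate=None` models the case where no cloud key exists: the loop reports failure for that one task instead of raising, which keeps the rest of the session local.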

Original: www.reddit.com/r/LocalLLaMA/comments/1tgecrq/i_built_a_coding_agent_that_gets_87_on_benchmarks/

// related

#0001 · Agents & tools · GitHub Changelog · 23 hours ago
GitHub adds REST API auditing for `Copilot` cloud agent repo config
60 radar
Repo-level agent settings can now be checked by API instead of manual UI review. Useful for keeping automation permissions visible before cloud agents touch production code.
- New endpoint: Get Copilot cloud agent configuration for a repository, currently in public preview.
- Best fit is policy drift checks across repos: scan whether agent access and configuration match your expected defaults.
- This is governance plumbing, not a coding-speed feature. Worth adopting if Copilot agents run on real repos.
Source: github.blog/changelog/2026-05-18-audit-repository-copilo
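A policy drift check of the kind described above could look roughly like this. It is a sketch under stated assumptions: the configurations are taken as already fetched, the endpoint call itself is omitted, and the field names (`agent_enabled`, `firewall_enabled`) and repository names are invented for illustration rather than taken from GitHub's response schema.

```python
from typing import Any, Dict, Tuple

_MISSING = "<missing>"  # marker for a setting absent from the fetched config


def find_drift(expected: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (expected, actual)} for every expected setting that differs."""
    return {
        key: (want, actual.get(key, _MISSING))
        for key, want in expected.items()
        if actual.get(key, _MISSING) != want
    }


def scan_repos(
    expected: Dict[str, Any], configs: Dict[str, Dict[str, Any]]
) -> Dict[str, Dict[str, Tuple[Any, Any]]]:
    """Map each drifting repo to its differing settings; compliant repos are omitted."""
    report = {}
    for repo, actual in configs.items():
        drift = find_drift(expected, actual)
        if drift:
            report[repo] = drift
    return report


# Hypothetical defaults and fetched configs (field names are assumptions).
expected_defaults = {"agent_enabled": True, "firewall_enabled": True}
fetched = {
    "org/api": {"agent_enabled": True, "firewall_enabled": True},
    "org/web": {"agent_enabled": True, "firewall_enabled": False},
}
report = scan_repos(expected_defaults, fetched)
print(report)
```

Only settings named in the expected defaults are compared, so extra fields in a fetched configuration do not count as drift.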
#0002 · Agents & tools · GitHub Changelog · 24 hours ago
`Copilot Spaces API` is now generally available
70 radar
Copilot Spaces: GitHub Copilot feature that manages task-specific context spaces.
Spaces can now be managed from your own apps via API. Useful for wiring repo context into internal tools or repeatable agent workflows.
- The API supports create, read, update, and delete for Spaces, so context setup no longer has to stay inside the GitHub UI.
- Good fit for templates: bootstrap a project space per repo, customer, or feature branch and keep agent context consistent.
- This is more an automation surface than an end-user feature. Its value depends on whether Copilot Spaces is already part of the coding workflow.
Source: github.blog/changelog/2026-05-18-copilot-spaces-api-now-
#0003 · Agents & tools · r/ClaudeAI · 24 hours ago
11 Claude Habits That Compound Over Daily Use
50 radar
The useful part is not prompt tricks, but persistent context: Projects, CLAUDE.md, styles, skills, and subagents. Worth turning into a default setup before long coding sessions.
- Put codebase context, style guides, and prior PRs into Projects once. Re-pasting the same background is pure context tax.
- A custom style such as "skeptical senior engineer" changes review quality by forcing pushback instead of agreeable code comments.
- In Claude Code, CLAUDE.md carries more weight than session prompts. Around 80 lines of project context can remove repeated stack explanations.
- Use cheaper/faster models by task: Sonnet as default, Opus for architecture, Haiku for batch cleanup like tickets or emails.
- Subagents fit parallel chores: run tests, inspect files, or summarize docs while the main coding thread keeps moving.
Source: www.reddit.com/r/ClaudeAI/comments/1tgqnsl/11_claude_thi
