← All Agents & tools

Agents & tools

50 items

FILTER[All][Agents & tools][Models & API][Generative media][Infra & SaaS][ASO & growth][Indie business][Idea signals][Other][★6+ high-signal]

clear filters

Today11 dispatches

#0050
#0050Agents & tools GeekNews4 hours ago
GitHub confirms 3,800 repositories compromised via malicious `VS Code` extension
60radar
A single developer workstation became the entry point. VS Code extension trust is now part of supply-chain security, so extension audits are worth doing now.
- About 3,800 internal repositories were affected after one employee installed a trojanized VS Code extension.
- GitHub’s current assessment limits exposure to internal repositories, but compromised developer endpoints can still leak secrets and code context.
- The extension was removed from VS Code Marketplace, infected endpoints were isolated, and incident response started immediately.
- Practical takeaway: review installed IDE extensions, publisher names, permissions, and disable unused tools before they become build-chain risk.
Source: news.hada.io/topic?id=29731Read original →
FIG-0501:1
60radar
FIG-0501:1
#0049
#0049Agents & tools Claude Code Releases5 hours ago
`Claude Code` `v2.1.146` tightens code review and background sessions
50radar
Claude CodeTerminal coding agent — automates code edits with Claude
Small release, but it removes several annoying agent-run failures. Windows, MCP pagination, and multi-agent env handling all get more reliable.
- /simplify is now /code-review with optional effort levels like high, making review intent clearer in repeatable workflows.
- MCP resources/list, resources/templates/list, and prompts/list no longer drop results after page 1. Tooling backed by large MCP servers becomes safer.
- Windows fixes cover pwsh launch failures, terminal strobing, NTFS junction cleanup, and GNOME paste behavior. Cross-platform CLI friction drops.
- CLAUDE_CODE_SUBAGENT_MODEL now reaches child processes in multi-agent sessions. Model routing gets less brittle for delegated coding runs.
- Auto-updater retries transient network failures, and large diff rendering is faster. Not flashy, but daily-use reliability improved.
Source: github.com/anthropics/claude-code/releases/tag/v2.1.146Read original →
FIG-0491:1
50radar
FIG-0491:1
#0048
#0048Agents & tools GeekNews6 hours ago
Google Cloud revamps agent development with `Antigravity 2.0`
70radar
AntigravityAgent dev tool — links local prototypes to cloud execution
Google is packaging local prototyping and cloud deployment into one agent stack. If Managed Agents API removes hosting glue, it is worth tracking now.
- Antigravity 2.0 and Managed Agents API are framed as an integrated dev kit, not separate demos.
- The flow targets local prototyping first, then managed cloud execution. Less custom orchestration if the API is usable.
- The available text is short, so pricing, lock-in, and runtime limits remain unknown. Treat it as watchlist, not migration trigger.
Source: news.hada.io/topic?id=29718Read original →
FIG-0481:1
70radar
FIG-0481:1
#0047
#0047Agents & tools r/ClaudeAI7 hours ago
Rules for running phone-first vibe coding with `Claude Code`
50radar
Claude CodeAI coding agent — automates terminal-based code edits
The useful part is the operating system: plan review, scoped chunks, commits, tests, and backups. Treat agents like junior implementers with guardrails, not magic.
- Plan mode is the control point. Bad decisions compound, so unclear sections should be challenged before code changes begin.
- If a plan cannot fit in your head, shrink the job. Smaller chunks reduce review burden and make rollback cleaner.
- After each completed plan, commit with git. It creates a code rollback point, but does not cover database state.
- Test cases should be readable in the plan: positives, negatives, missing inputs, and regressions before trusting generated code.
- For complex changes, use subagents for plan critique, security review, and testing audit; DB work needs backups first.
Source: www.reddit.com/r/ClaudeAI/comments/1tj2i90/im_a_softwareRead original →
50radar
PHOTO
FIG-0471:1
#0046
#0046Agents & tools opencode_releases9 hours ago
`opencode` `v1.15.6` adds TUI diff review and shell mode
70radar
opencodeOpen-source coding agent CLI — TUI-first workflow
Change review now happens inside the TUI, and run prompts can drop into shell mode. This is a practical upgrade for terminal-first agent workflows.
- The TUI adds a diff viewer with auto-focus on the first file and collapsed single-child directories, cutting review friction before accepting edits.
- Run now gets shell mode and replaces subagent tabs with an on-demand picker, making agent sessions less crowded during longer tasks.
- Plugin failures are better isolated: file load errors and missing tool args no longer break the rest of plugin loading.
- v2 HTTP API now exposes structured public error schemas and preserves endpoint error responses in the OpenAPI spec.
Source: github.com/anomalyco/opencode/releases/tag/v1.15.6Read original →
FIG-0461:1
70radar
FIG-0461:1
#0045
#0045Agents & tools Google AI Forum9 hours ago
`Antigravity 2.0` upgrade breaks IDE workflow for existing users
60radar
AntigravityCoding agent IDE — Google’s agent-first dev tool
A forced product split turned one workflow into IDE, agent-only app, and CLI. Co-install bugs, session hijacking, blank marketplaces, and faster credit burn make this upgrade risky.
- The old app was split into Antigravity IDE, agent-first Antigravity, and Antigravity CLI; users were pushed into the agent-only path instead of the matching IDE upgrade.
- Antigravity 2.0 and Antigravity IDE 2.0 reportedly cannot coexist, creating a packaging failure that blocks normal migration.
- Session hijacking prevents the IDE from opening after install, while Gemini support gives 1.x-era fixes and can burn credits before solving anything.
- Marketplace loading can fail after install because requests trigger rate limits, leaving extensions blank or throwing unknown errors.
Source: discuss.ai.google.dev/t/my-antigravity-is-broken-the-2-0Read original →
FIG-0451:1
60radar
FIG-0451:1
#0044
#0044Agents & tools GeekNews9 hours ago
`Gemini CLI` will stop working on June 18, 2026
80radar
Gemini CLITerminal AI CLI — runs Gemini from the command line
Google is folding terminal AI workflows into Antigravity CLI. A popular CLI with 100k+ GitHub stars now has a hard migration deadline, so scripts and habits need cleanup soon.
- Gemini CLI grew to millions of users, 100k+ GitHub stars, and 6,000+ merged PRs; this is not a small side-tool shutdown.
- Google is consolidating capability into Antigravity CLI, pointing its agent tooling toward multi-agent workflows rather than a standalone Gemini terminal client.
- The deadline is June 18, 2026. Any local aliases, CI helpers, docs, or onboarding snippets using Gemini CLI should be replaced before then.
Source: news.hada.io/topic?id=29711Read original →
FIG-0441:1
80radar
FIG-0441:1
#0043
#0043Agents & tools GeekNews9 hours ago
Better generated branch names with `jj`
40radar
jjGit-compatible VCS — change-based workflow with anonymous branches
Default push branch names are change-ID centric and awkward in CLI flows. A naming tweak can make Git interop cleaner; useful if jj is already in your workflow.
- jj encourages anonymous branches, but pushing to a Git repo still needs a bookmark, effectively a Git branch name.
- The default jj git push --change xyz creates names like push-xyz; machine-friendly, human-hostile in day-to-day CLI work.
- Better generated names reduce friction around PRs, remote branches, and cleanup. Low impact unless your Git workflow already runs through jj.
Source: news.hada.io/topic?id=29710Read original →
FIG-0431:1
40radar
FIG-0431:1
#0042
#0042Agents & tools Google AI Forum12 hours ago
`Google Antigravity` shifts toward weekly quotas, hurting long coding sessions
60radar
Google AntigravityAI coding agent — developer tool powered by Gemini
Short-reset Flash access is gone, pushing heavy use into 7-day quotas. The tool now fits burst work better than daily agentic coding.
- Gemini 3.0 Flash with roughly 5-hour resets was the practical daily driver; its removal breaks predictable iteration loops.
- Paid Ultra users are also hitting tighter limits, so this is not just a free-tier downgrade.
- Weekly cooldowns turn debugging, refactoring, review, and test loops into prompt budgeting. Keep a fallback agent ready.
Source: discuss.ai.google.dev/t/google-antigravity-has-come-to-aRead original →
FIG-0421:1
60radar
FIG-0421:1
#0041
#0041Agents & tools Simon Willison15 hours ago
`Gemini Spark`, Google’s hosted agent tied to Workspace apps
60radar
Gemini SparkHosted AI agent — native Google app connections
Google is packaging app-connected agents around Workspace, but much is still coming soon. Security and credential handling decide whether it becomes useful or risky.
- Gemini Spark connects natively to Gmail, Calendar, Drive, Docs, Sheets, Slides, YouTube, and Maps. That makes it closer to a work agent than a chat UI.
- FAQ says it runs on Gemini 3.5 Flash and Antigravity. The Antigravity stack spans a desktop app, CLI, SDK, and VS Code fork.
- Enterprise notes mention fresh isolated ephemeral VMs, Agent Gateway DLP, and encrypted credentials. That is the right threat model, not proof it is solved.
- Since the product is not broadly testable yet, treat it as a roadmap signal. Do not build critical automations around it until GA behavior is clear.
Source: simonwillison.net/2026/May/20/google-io/#atom-everythingRead original →
60radar
PHOTO
FIG-0411:1
#0040
#0040Agents & tools GitHub Changelog15 hours ago
`GitHub Copilot` in VS Code gets task-based auto model routing
70radar
Model choice moves from manual picking to routing by task, utilization, and health. Useful for lower-friction coding, but less control over exact model behavior.
- GitHub Copilot now selects a model using task fit, utilization, and model health metrics, aiming for reliable and token-efficient runs.
- The change matters most in VS Code, where model switching interrupts small coding loops; default auto mode should reduce that friction.
- Tradeoff: less explicit model control. Keep manual model selection for debugging, refactors, or prompts where output variance matters.
Source: github.blog/changelog/2026-05-20-auto-model-selection-noRead original →
FIG-0401:1
70radar
FIG-0401:1

Yesterday20 dispatches

#0039
#0039Agents & tools GitHub Changelog16 hours ago
`GitHub Copilot Chat` Adds Semantic Issue Search
80radar
Natural-language issue triage now works inside web chat. It reduces manual filtering across noisy repos and is worth trying for backlog cleanup.
- Queries can find, group, and analyze issues with a semantic issue index, so exact-label hygiene matters less.
- The feature is available in GitHub Copilot Chat on the web, keeping triage inside the existing GitHub workflow.
- Best fit: duplicate detection, bug clustering, and release-scope checks before planning a small sprint.
Source: github.blog/changelog/2026-05-20-semantic-issue-search-iRead original →
FIG-0391:1
80radar
FIG-0391:1
#0038
#0038Agents & tools GeekNews17 hours ago
`Codex Relay` Adds Mobile Terminal, Browser, Git, File Viewer, and Markdown for Codex
50radar
Codex RelayMobile Codex companion — adds terminal, Git, and file viewer
A free OSS companion fills gaps around mobile Codex use. The overlap with official remote access caps urgency, but the extra tools make it worth a quick trial.
- Includes Terminal, Browser, Git, File Viewer, and Markdown in one mobile-focused Codex companion.
- Official Codex Remote already covers the core use case, so this is a convenience layer rather than a must-migrate tool.
- Open-source and free lowers trial cost; check auth, repo access, and maintenance before using it on private work.
Source: news.hada.io/topic?id=29706Read original →
FIG-0381:1
50radar
FIG-0381:1
#0037
#0037Agents & tools Google AI Forum20 hours ago
`Antigravity` users push back on hidden compute quotas
60radar
AntigravityAI coding IDE — agent workflow powered by Google models
Paid usage turned unpredictable after quota accounting moved from requests to hidden compute. A cheaper plan can now fail mid-workflow; budget risk matters more than model choice.
- A $20 Pro user reports 2-3 weeks of HTTP 429 lockouts despite visible quota remaining; reliability became the real blocker.
- The May 19 change replaced request limits with hidden compute-used accounting, making agent background scans and micro-queries harder to budget.
- Gemini 3.5 Flash is described as more verbose and weaker at coding, burning quota through long explanations instead of useful edits.
- Upgrade pressure jumps to $100-$200 Ultra or a reported 5-day ban after quota exhaustion; keep fallback IDE and agent paths ready.
Source: discuss.ai.google.dev/t/how-antigravity-became-cursor-2-Read original →
FIG-0371:1
60radar
FIG-0371:1
#0036
#0036Agents & tools Google AI Forum21 hours ago
`Antigravity IDE` 2.0.1 macOS report: fatal DI crash disables agents and marketplace
60radar
Antigravity IDEAI coding IDE — built-in Google agent workflows
A clean install can still hit aae depends on UNKNOWN service agentSessions. Treat the 2.0 update as risky on macOS until a fix lands.
- Environment is specific: Antigravity IDE 2.0.1, VSCode OSS 1.107.0, macOS Darwin arm64 25.5.0 on Apple M4 Max.
- Core failure is dependency injection: aae cannot resolve agentSessions, so the AI Agent Manager never starts.
- Marketplace also breaks with open-vsx.org extension manifest fetches returning 429, leaving plugins unavailable.
- Cache wipes across ~/Library/Application Support/Antigravity IDE, ~/.antigravity, and cache folders did not recover it.
Source: discuss.ai.google.dev/t/bug-fatal-di-crash-on-clean-instRead original →
FIG-0361:1
60radar
FIG-0361:1
#0035
#0035Agents & tools Google AI Forum21 hours ago
`Antigravity 2.0` Windows update can hijack the IDE launcher
60radar
AntigravityAI coding IDE — Google-backed agentic dev tool
Default install places the new app.asar in the old IDE folder. Rename it to recover the IDE, then copy config folders back; useful if your setup vanished after updating.
- Electron loads resources by directory, so resources\app.asar from 2.0 takes over the original IDE executable when both land in the same install path.
- Rollback is simple: rename app.asar to app.asar.bak under %LOCALAPPDATA%\Programs\Antigravity\resources; restore the name to switch back to 2.0.
- Settings split because product names differ: Antigravity keeps old config, while restored Antigravity IDE creates empty Roaming and extension folders.
- Recover by copying Roaming\Antigravity into Roaming\Antigravity IDE, and .antigravity into .antigravity-ide; use mklink /J if extension paths exceed Windows limits.
Source: discuss.ai.google.dev/t/fix-for-antigravity-2-0-hijackinRead original →
FIG-0351:1
60radar
FIG-0351:1
#0034
#0034Agents & tools Google AI Forum23 hours ago
`Antigravity` backlash: editor removed, agent command center takes over
60radar
AntigravityAI coding IDE — Google’s agent-first dev tool
The update shifts from a full IDE into an agent hub. Losing file-first editing weakens it as a daily driver; worth watching before switching workflows.
- The core complaint is concrete: files, editor, terminal, and change tracking no longer feel like one controlled workspace.
- A CLI is framed as a poor replacement for a full IDE. For app debugging, terminal-first flow adds friction.
- Cursor and Windsurf were named as competitors that benefit if Antigravity drops its integrated editor.
- Only 5 posts from 3 participants, so this is an early friction signal, not broad market proof.
Source: discuss.ai.google.dev/t/you-did-not-upgrade-antigravity-Read original →
FIG-0341:1
60radar
FIG-0341:1
#0033
#0033Agents & tools GeekNewsyesterday
GitHub Internal Repos Accessed After Employee Device Compromise
40radar
A poisoned VS Code extension became the entry point. Treat editor extensions as supply-chain risk, not convenienceware.
- Attack path: a compromised employee endpoint via malicious VS Code extension, followed by access to internal repositories.
- GitHub removed the malicious extension version and isolated the endpoint. Extension version pinning and review matter for dev machines.
- No concrete customer-impact detail is available in the provided text. Actionable takeaway stays limited to workstation hardening.
Source: news.hada.io/topic?id=29703Read original →
FIG-0331:1
40radar
FIG-0331:1
#0032
#0032Agents & tools Cline Releasesyesterday
`Cline CLI` `v3.0.9` speeds up plugin startup and config toggles
50radar
ClineCoding agent CLI — plugin-based automation support
Plugin-heavy CLI sessions start faster. Optimistic TUI updates and cached tool descriptors reduce friction, worth updating if Cline CLI is in daily use.
- Sandboxed plugins now load concurrently, with tool descriptors cached per plugin, provider, and model. Startup latency should drop most in plugin-heavy setups.
- Plugin and tool config toggles update the TUI optimistically and persist without full config reloads. Less waiting while switching tools on and off.
- The @ mention file picker restores fuzzy ranking, so relevant files surface first again. Small fix, but it cuts prompt setup friction.
- Cancelled tasks no longer tear down the interactive session, and abort cleanup failures no longer crash the runtime host.
Source: github.com/cline/cline/releases/tag/cli-v3.0.9Read original →
FIG-0321:1
50radar
FIG-0321:1
#0031
#0031Agents & tools r/ClaudeAIyesterday
Claude Code workflow bottleneck: automate `Connect`, `Encode`, `Teach`, `Parallelize`
50radar
The bottleneck shifts from typing code to spotting repeated friction. A weekly friction log can turn small annoyances into scripts, skills, MCP connectors, or parallel agent runs.
- Connect covers copy-paste between tools; the fix is giving the agent source access through an MCP server or CLI.
- Encode targets repeated step sequences. Turn recurring deploy, debug, or cleanup flows into scripts or reusable skills.
- Teach means repeated context is leaking into prompts. Move durable instructions into CLAUDE.md or a skill.
- Parallelize is the strongest claim: watching one agent run wastes attention, so multiple sessions beat one supervised session.
Source: www.reddit.com/r/ClaudeAI/comments/1ti8cwr/after_a_year_Read original →
50radar
PHOTO
FIG-0311:1
#0030
#0030Agents & tools GeekNewsyesterday
`Cursor Composer 2.5` becomes Cursor's most-selected model, with **10x** usage bonus
80radar
CursorAI coding IDE — agentic code writing and edits
The in-house coding model is overtaking third-party defaults inside the IDE. Test it today while the usage cap is temporarily loose.
- CEO Michael Truell said Composer 2.5 became the most-selected model in Cursor; adoption moved fast right after launch.
- All users get 10x usage for one day, making it a low-cost window for real project testing instead of toy prompts.
- This is pressure on Claude and OpenAI workflows: IDE-native models win when latency, quota, and UX beat raw benchmark trust.
Source: news.hada.io/topic?id=29691Read original →
FIG-0301:1
80radar
FIG-0301:1
#0029
#0029Agents & tools GeekNewsyesterday
`Mirage`, a unified virtual filesystem for AI agents
70radar
MirageVirtual filesystem for AI agents — mounts SaaS as one tree
Different SaaS backends become one filesystem tree. Agents can use Unix tools instead of learning every SDK or MCP, so cross-service automation gets simpler.
- Mounts S3, Google Drive, Slack, Gmail, and Redis into one filesystem tree — fewer integration surfaces for agents.
- Agents can work through Unix-style bash tools, reducing the need to teach each agent service-specific SDKs or MCP interfaces.
- Cross-service pipelines become file operations. Moving data between storage, chat, mail, and cache can be scripted more cheaply.
Source: news.hada.io/topic?id=29681Read original →
FIG-0291:1
70radar
FIG-0291:1
#0028
#0028Agents & tools Cursor Changelogyesterday
`Cursor Automations` Now Works Inside the Agents Window
80radar
CursorAI code editor — built-in agentic development workflows
Scheduled agent work is moving closer to the main coding surface. Multi-repo and repo-less runs make Cursor more useful for maintenance, audits, and non-code ops.
- Automations now appears in the Agents Window, reducing the gap between scheduled jobs and active agent work.
- A single automation can attach multiple repos, useful for cross-repo refactors, dependency checks, and shared package updates.
- Repo-less automations expand the use case beyond codebases: reminders, issue triage, release notes, or research queues can run without a project attached.
Source: cursor.com/changelog/05-20-26Read original →
FIG-0281:1
80radar
FIG-0281:1
#0027
#0027Agents & tools zed_blogyesterday
`Zed` Adds Terminal Threads for Coding Agents
70radar
ZedCode editor — fast collaborative AI workflow focus
Terminal agents can now live as sidebar threads instead of loose shell sessions. Useful if you run Claude Code or Amp beside code all day.
- Claude Code, Amp, and other terminal agents can run as threads in Zed's sidebar; agent work becomes easier to revisit and separate.
- The feature turns terminal-agent sessions into IDE-native context, reducing tab/window juggling during multi-step refactors or bug hunts.
- This is a workflow upgrade, not a new model capability. Worth trying if Zed is already your editor or agent cockpit.
Source: zed.dev/blog/terminal-threadsRead original →
FIG-0271:1
70radar
FIG-0271:1
#0026
#0026Agents & tools GitHub Changelogyesterday
`GitHub Copilot` Code Review Adds `Fix with Copilot` Dialog
60radar
Review suggestions now open with more control before applying changes. It reduces PR cleanup friction, but the impact stays tactical unless the agent handles multi-file edits well.
- Implement suggestion is now Fix with Copilot; the naming pushes code review fixes into the cloud-agent workflow.
- A new UI dialog adds control over how suggestions are applied, useful when review comments are too risky for one-click patches.
- Best fit is small PR cleanup: lint fixes, narrow refactors, and reviewer nits. Architectural feedback still needs manual judgment.
Source: github.blog/changelog/2026-05-19-easily-apply-copilot-coRead original →
FIG-0261:1
60radar
FIG-0261:1
#0025
#0025Agents & tools Claude Code Releasesyesterday
`Claude Code` `v2.1.145` adds JSON session listing and richer agent telemetry
70radar
Claude CodeCoding agent CLI — automates code work with Claude in terminal
Live sessions are now scriptable via claude agents --json. Better OTEL parenting, PR-aware status lines, and safer Bash approval make multi-agent workflows easier to monitor.
- claude agents --json exposes live sessions for tmux restore, status bars, and custom session pickers — useful for long-running local agent setups.
- agent_id and parent_agent_id landed in OTEL spans, with fixed trace parenting so background subagents nest under the dispatching Agent span.
- Status-line JSON now includes detected GitHub repo and PR data, tightening the loop between CLI work and pull-request state.
- A Bash permission bypass for bare non-allowlisted env assignments was fixed — upgrade promptly if command approval boundaries matter.
- Plugin discovery now shows commands, agents, skills, hooks, and MCP/LSP servers before install, reducing blind marketplace installs.
Source: github.com/anthropics/claude-code/releases/tag/v2.1.145Read original →
FIG-0251:1
70radar
FIG-0251:1
#0024
#0024Agents & tools Google AI Forumyesterday
`Gemini Code Assist` sunset points to `Antigravity CLI` migration
60radar
Antigravity CLICoding agent CLI — replacement path for Code Assist
The June 18 cutoff turns PR review automation into a migration task. Broken docs raise execution risk, so audit hooks and CI usage before relying on it.
- Docs put the cutoff at June 18. Any Gemini Code Assist PR-review flow needs a replacement path before then.
- Antigravity CLI is positioned as the follow-up tool, shifting review automation from hosted assist to CLI-driven workflow.
- Broken documentation links are a real adoption risk. Budget time for setup friction instead of treating this as a drop-in swap.
Source: discuss.ai.google.dev/t/gemini-code-assist-replaced-withRead original →
FIG-0241:1
60radar
FIG-0241:1
#0023
#0023Agents & tools Simon Willisonyesterday
`llm-gemini` `0.32a0` adds reasoning-token streaming
40radar
llm-geminiLLM CLI plugin — runs Gemini models from `llm`
Gemini reasoning output can now stream through the llm CLI alpha path. Small alpha release, but useful when inspecting long reasoning traces live.
- Requires llm>=0.32a0, so this is tied to the alpha CLI line rather than the stable release.
- The new behavior streams reasoning tokens, reducing the blind wait while Gemini works through longer prompts.
- Scope is narrow: no pricing, model, or workflow change. Worth testing only if llm is already in your CLI stack.
Source: simonwillison.net/2026/May/19/llm-gemini/#atom-everythinRead original →
40radar
PHOTO
FIG-0231:1
#0022
#0022Agents & tools GitHub Changelog2 days ago
`Gemini 3.5 Flash` is now GA in `GitHub Copilot`
80radar
A faster, cheaper coding model option is landing in the IDE. Near-Pro quality at Flash-tier cost makes it worth testing for routine implementation loops.
- GitHub Copilot is rolling out Google’s latest Flash-tier model as a generally available option, not a preview-only experiment.
- Early testing claims near-Pro coding quality with Flash-tier speed and cost, useful for high-volume autocomplete and edit cycles.
- The practical play is model routing: keep premium models for hard design calls, use Gemini 3.5 Flash for repetitive coding throughput.
Source: github.blog/changelog/2026-05-19-gemini-3-5-flash-is-genRead original →
FIG-0221:1
80radar
FIG-0221:1
#0021
#0021Agents & tools Cline Releases2 days ago
`Cline CLI` `v3.0.8` fixes plugin diagnostics, Bedrock setup, and token counts
50radar
ClineOpen-source coding agent — runs agent workflows in IDE and CLI
This is a maintenance release, but the fixes hit real workflow costs. Cleaner broken-plugin diagnostics and accurate token accounting make local agent setups easier to trust.
- Failed plugins now stay visible in the config UI with load/setup phase and error details, so broken definitions are faster to debug.
- AgentRuntime.execute() now resets usage between calls, fixing inflated token counts from local runtime double-counting.
- AWS Bedrock onboarding now detects region/profile correctly and exposes bearer-token plus extra Bedrock config fields.
- Create Session Fork moved from Opt+F to Opt+R, restoring terminal word-right navigation.
Source: github.com/cline/cline/releases/tag/cli-v3.0.8Read original →
FIG-0211:1
50radar
FIG-0211:1
#0020
#0020Agents & tools Cline Releases2 days ago
`Cline` `v3.84.0` adds SAP AI Core hosted model support
40radar
ClineVS Code coding agent — MCP and multi-model support
More hosted model options landed, but this is a narrow integration release. Useful only if your workflow already touches SAP AI Core.
- SAP AI Core support expands hosted model choices inside Cline; direct value is limited outside SAP-backed environments.
- The MCP Restart Server button is disabled when a server is toggled off, reducing accidental server actions in agent setups.
- The startup flow drops the Cline Kanban launch modal and bundled demo media, making the VS Code extension open cleaner.
Source: github.com/cline/cline/releases/tag/v3.84.0Read original →
FIG-0201:1
40radar
FIG-0201:1

Tue, May 1919 dispatches

#0019
#0019Agents & tools GeekNews2 days ago
`Goal Setter`, an agent skill for writing safer `Codex` goals
50radar
Goal SetterCodex agent skill — interviews users to define done states
Long-running agent work now needs sharper stop conditions. This skill turns vague requests into explicit done states before Goal burns time and tokens.
- Goal Setter interviews the user before creating a goal, reducing drift in long Codex runs.
- The core check is what exact state counts as done. Without that, Goal can waste tokens fast.
- Useful for large refactors, test work, or migration tasks where the agent needs persistence but clear boundaries.
Source: news.hada.io/topic?id=29661Read original →
FIG-0191:1
50radar
FIG-0191:1
#0018
#0018Agents & tools r/LocalLLaMA2 days ago
Agent Shell Access Hit the `rm -rf /` Failure Mode
40radar
An agent tried rm -rf / while testing a shell-command block. The block worked, but sandboxing must come before shell access.
- The whitelist blocked the harmful command, so damage was zero, aside from operational panic.
- bubblewrap isolation came after the whitelist; that ordering is backward for any agent with shell execution.
- Command allowlists help, but they are a second layer. Filesystem isolation and disposable workspaces should be default.
Source: www.reddit.com/r/LocalLLaMA/comments/1thosnt/got_my_firsRead original →
40radar
PHOTO
FIG-0181:1
#0017
#0017Agents & tools vercel_blog2 days ago
`Nuxt MCP Toolkit` adds support for MCP apps
70radar
Nuxt MCP ToolkitMCP toolkit for Nuxt — Vue SFC-based tool UIs
Agent tools can now return inline interactive HTML, not just text. Useful for richer tool flows inside Claude or ChatGPT.
- Tools declared with defineMcpApp can render interactive HTML responses in MCP clients such as Claude and ChatGPT.
- useMcpApp lets the UI read pre-hydrated data, trigger follow-up prompts, or call other tools from inside the response.
- Vue SFCs are bundled into self-contained HTML at build time and served from the MCP endpoint, reducing custom UI plumbing.
Source: vercel.com/changelog/nuxt-mcp-toolkit-mcp-appsRead original →
FIG-0171:1
70radar
FIG-0171:1
#0016
#0016Agents & tools cloudflare_ai2 days ago
`Claude Managed Agents` Run on `Cloudflare`
80radar
Claude Managed AgentsCoding agent platform — isolated execution and custom tools
Agent workflows get isolated, globally distributed execution without opening private backends too widely. Useful once coding agents move from local experiments to repeatable delivery pipelines.
- Cloudflare provides a fast, isolated execution environment for autonomous code delivery, reducing the need to run agent workers on your own servers.
- Access control over private backends is the practical hook. Agents can operate near production systems without broad credentials floating around.
- Custom tools and runtimes are supported, so Claude Managed Agents can fit repo-specific deploy, test, and data workflows.
Source: blog.cloudflare.com/claude-managed-agents/Read original →
FIG-0161:1
80radar
FIG-0161:1
#0015
#0015Agents & tools Hacker News · AI2 days ago
`Forge` Pushes Local 8B Agent Reliability Near Frontier APIs
70radar
ForgeLLM guardrail runtime — improves local tool-call reliability
Guardrails, not bigger weights, drive the jump. The useful takeaway is architectural: retries, recovery, and serving backend choice can matter more than model size.
- Ministral 8B with Forge hit 99.3%, versus Claude Sonnet with guardrails at 100% across the reported eval setup.
- Without retry nudges, scores dropped 24-49 points. Reliability work belongs in the agent runtime, not only in model selection.
- Serving backend changed the same Mistral-Nemo 12B weights from 7% on llama-server native function calling to 83% on Llamafile prompt mode.
- Error recovery scored 0% for every tested model without retry logic. Tool agents need explicit recovery paths before production use.
Source: github.com/antoinezambelli/forgeRead original →
FIG-0151:1
70radar
FIG-0151:1
#0014
#0014Agents & tools Hacker News · LLM2 days ago
`Forge` pushes local LLM tool-calling reliability with guardrail retries
70radar
ForgeLLM tool-calling layer — guardrails for local models
Guardrails, not model size, drive most of the gain. Useful if you want always-on agents without frontier API spend.
- Ministral 8B reached 99.3% with Forge; Claude Sonnet with the same layer hit 100%.
- Without guardrails, Claude Sonnet scored 87.2%, so orchestration beat raw model strength in this eval.
- Retry nudges caused 24-49 point drops when removed; error recovery added about 10 points across tested models.
- Backend choice changed results hard: the same Mistral-Nemo 12B weights scored 7% on llama-server vs 83% on Llamafile.
Source: github.com/antoinezambelli/forgeRead original →
FIG-0141:1
70radar
FIG-0141:1
#0013
#0013Agents & tools Hacker News · MRR2 days ago
`Forge` raises local 8B agent task success from 53% to 99% with guardrails
70radar
ForgeLLM tool-calling guardrails — retries and recovery for local models
Reliability came from orchestration, not a bigger model. Forge makes local tool-calling viable when cloud agent costs are the bottleneck.
- Ministral 8B with Forge hit 99.3% across multi-step workflows; Claude Sonnet with the same guardrails reached 100%.
- Without retry handling, error recovery scored 0% across every tested local and frontier model. The missing layer is architectural.
- Backend choice changed results sharply: the same Mistral-Nemo 12B weights scored 7% on llama-server native calling and 83% on Llamafile prompt mode.
- Ablation points to retry nudges and error recovery as the useful parts. Rescue parsing and context compaction stayed for rare production failures.
Source: github.com/antoinezambelli/forgeRead original →
FIG-0131:1
70radar
FIG-0131:1
#0012
#0012Agents & tools Hacker News · AI Agent2 days ago
`Forge` brings local 8B agent workflows near frontier reliability
70radar
ForgeLLM guardrail layer — improves local tool-calling reliability
Guardrails, not bigger weights, drive the result. Retry nudges and error recovery make local always-on agents cheaper to test now.
- Ministral 8B reached 99.3% with Forge; Claude Sonnet with guardrails hit 100%, leaving less than 1 point between local and frontier.
- Without guardrails, the same comparison flips: local 8B plus framework support beat unguarded Claude Sonnet at 87.2%.
- Ablations put most value in retry nudges and error recovery. Disabling retry nudges caused 24-49 point drops.
- Serving backend changed outcomes sharply: Mistral-Nemo 12B scored 7% on llama-server native function calling vs 83% on Llamafile prompt mode.
Source: github.com/antoinezambelli/forgeRead original →
FIG-0121:1
70radar
FIG-0121:1
#0011
#0011Agents & tools Hacker News · Show HN AI2 days ago
`Forge` adds reproducible guardrails for local LLM agents
70radar
ForgeLocal LLM reliability layer — retry and recovery guardrails
Local tool-calling reliability is framed as a system problem, not a model-size problem. If the evals hold, always-on agents get much cheaper.
- Ministral 8B reached 99.3% with guardrails; Claude Sonnet with the same layer hit 100%.
- Without retry handling, error recovery scored 0% across local and frontier models. The missing piece is architecture.
- Ablations put most lift on retry nudges and error recovery; context compaction helped less in the benchmark.
- Serving backend changed Mistral-Nemo 12B from 7% to 83% accuracy, so deployment stack is part of model quality.
Source: github.com/antoinezambelli/forgeRead original →
FIG-0111:1
70radar
FIG-0111:1
#0010
#0010Agents & tools r/ClaudeAI2 days ago
Anthropic acquires `Stainless`, the major MCP server generator
80radar
StainlessSDK generation platform — builds SDKs and MCP servers from OpenAPI
The strongest OpenAPI-to-MCP pipeline is now closed to new users. Better standard templates are likely, but vendor concentration just became a real stack risk.
- Stainless generated official SDKs for OpenAI, Google, Meta, Cloudflare, and Anthropic, then extended that compiler to MCP servers.
- MCP reached about 97M monthly SDK downloads by Dec 2025 and roughly 10,000 production servers by early 2026.
- New signups and new SDK/MCP generations stopped on Monday; existing customers keep generated code, but the pipeline is closed.
- Cloudflare's MCP framework, Pulse MCP, and open-source generators now matter more as practical alternatives to Anthropic-owned tooling.
Source: www.reddit.com/r/ClaudeAI/comments/1thkkrb/anthropic_jusRead original →
80radar
PHOTO
FIG-0101:1
#0009
#0009Agents & tools r/ClaudeAI2 days ago
100 Practical Rules for Building a Persistent Personal AI Agent
50radar
A six-week build distilled into operating rules for a real agent: constitution, identity, capability maps, and local automation. Useful as an agent design checklist, not a product update.
- Start with a constitution, not just a system prompt. It gives the agent a basis for edge cases instead of brittle command-following.
- Separate hard rules from behavioral guidelines. Mixing them makes the agent treat everything as either negotiable or frozen.
- Keep a Capability Map and Component Map apart: what it can do vs. how it is wired. That keeps Claude Code setups maintainable after month three.
- The cloud-to-local move added file access, git tracking, shell hooks, and scheduled headless tasks. Serious agents need tool surfaces, not chat only.
Source: www.reddit.com/r/ClaudeAI/comments/1thi6nh/100_tips_tricRead original →
50radar
PHOTO
FIG-0091:1
#0008
#0008Agents & tools r/ClaudeAI2 days ago
Using `Power Automate` Webhooks as an MCP Bridge for Microsoft 365
50radar
Power AutomateWorkflow automation SaaS — runs M365 connectors via webhooks
Power Automate can turn existing M365 permissions into callable agent tools without Graph admin approval. Useful for personal ops automation, but webhook hygiene is the real risk.
- Each M365 action becomes one Power Automate flow with an HTTP trigger, then a small FastMCP server exposes it as a Claude tool.
- The setup covered 22 flows: email, calendar, OneDrive notes, Planner tasks, Excel rows, and Word templates.
- Signed webhook URLs act like passwords. A duplicated URL already caused the wrong action to run, so config review matters more than code size.
Source: www.reddit.com/r/ClaudeAI/comments/1thabze/i_gave_claudeRead original →
50radar
PHOTO
FIG-0081:1
#0007
#0007Agents & tools GeekNews2 days ago
Anthropic Acquires `Stainless` to Expand Agent Tooling
50radar
StainlessAPI tooling SaaS — Generates SDKs and MCP servers
Agent value now depends on how many real systems it can reach. This is not an immediate product change, but it signals broader Claude tool integration ahead.
- Stainless builds SDK and MCP server tooling, so the acquisition targets the connection layer between APIs and agents.
- The move shifts focus from model answers to action-capable agents that can touch data, tools, and workflows.
- No pricing, release date, or developer-facing feature is included yet. Treat it as a roadmap signal, not something to adopt today.
Source: news.hada.io/topic?id=29647Read original →
FIG-0071:1
50radar
FIG-0071:1
#0006
#0006Agents & tools GeekNews2 days ago
`Project Glasswing`: What `Mythos` Demonstrated
60radar
MythosSecurity agent — proves exploit chains automatically
The bar moved from spotting suspicious code to proving a working exploit path. This is early, but it hints at security agents that can validate bugs before a human review.
- Mythos Preview ran across 50+ Cloudflare repos, linking multiple primitives into exploit chains instead of flagging isolated bugs.
- It wrote trigger code, compiled and executed temporary tests, then revised failed hypotheses. That closes the gap between static finding and proof.
- The practical signal is security automation shifting toward reproducible evidence. Expect fewer raw alerts, more agent-generated repro cases.
Source: news.hada.io/topic?id=29645Read original →
FIG-0061:1
60radar
FIG-0061:1
#0005
#0005Agents & tools GeekNews2 days ago
Using Git `--author` to Block AI Bot Spam in GitHub Repos
40radar
AI-generated PR and issue noise can bury real maintainer discussion fast. A lightweight Git identity gate is a practical abuse filter for bounty issues.
- An Archestra bounty issue reached 253 comments after AI-bot replies crowded out contributor discussion.
- The reported failure mode was not just volume: meaningless comments, PRs, and aggressive replies raised maintainer cost.
- git --author points to a cheap screening layer: filter suspicious commit identities before review time gets spent.
Source: news.hada.io/topic?id=29642Read original →
FIG-0051:1
40radar
FIG-0051:1
#0004
#0004Agents & tools Claude Code Releases2 days ago
`Claude Code` `v2.1.144` improves background sessions, MCP tools, and terminal stability
80radar
Claude CodeCoding agent CLI — automates code work with Claude in terminal
Background agents are easier to resume and diagnose. The bigger win is fewer broken long sessions: MCP pagination, terminal corruption, startup hangs, and bad image files now fail less often.
- /resume now lists sessions started with claude --bg or agent view, marked as bg; background work is easier to recover after context switching.
- Background subagent completion notifications include elapsed time like 3h 2m 5s, useful for spotting expensive or slow automation runs.
- /model now changes only the current session; press d in the picker to set defaults, reducing accidental model drift across sessions.
- MCP tools/list pagination now returns more than the first page. Tool-heavy setups should stop losing capabilities silently.
- Startup hangs when api.anthropic.com is unreachable now time out after 15s, not up to 75s; bad networks hurt less.
Source: github.com/anthropics/claude-code/releases/tag/v2.1.144Read original →
FIG-0041:1
80radar
FIG-0041:1
#0003
#0003Agents & tools GeekNews2 days ago
Using `Codex` `Goals` for Long-Running Tasks
50radar
CodexCoding agent — continues multi-turn work toward a goal
Goals keeps multi-turn work moving toward a defined outcome. Useful for profiling, patches, benchmarks, flaky tests, and evidence-based audits.
- Goals is a persistent objective for a Codex thread, so work can continue across multiple turns toward a defined result.
- Best fit is work that breaks a single prompt: profiling, patching, benchmarking, flaky test reproduction, and audits.
- The leverage comes from clear end conditions. Vague goals turn into extra turns without better output.
Source: news.hada.io/topic?id=29639Read original →
FIG-0031:1
50radar
FIG-0031:1
#0002
#0002Agents & tools Product Growth2 days ago
`Claude Code` Workflow for Non-Technical PMs
50radar
Claude CodeCoding agent CLI — automates code edits and runs in terminal
A no-code-to-agent path: start with Lovable, then move into multi-agent work in Claude Code. Useful as an adoption pattern, but thin without code, metrics, or failure cases.
- The flow starts from Lovable-style builders and ends at multi-agent systems in Claude Code; good migration framing for prototype-to-automation work.
- The target user is non-technical PMs, so the value is workflow scaffolding rather than deep engineering detail.
- No numbers, benchmarks, or concrete output quality claims are given; treat it as a light tutorial signal, not a tool launch.
Source: www.news.aakashg.com/p/claude-code-non-technical-pmsRead original →
FIG-0021:1
50radar
FIG-0021:1
#0001
#0001Agents & tools GitHub Changelog2 days ago
GitHub adds one-click Action failure fixes with `Copilot cloud agent`
70radar
Copilot cloud agentCoding agent — automates GitHub tasks in the cloud
Failed CI can now be handed to an agent from the Actions UI. Useful for paid teams, but availability is limited to Business and Enterprise.
- A failed GitHub Actions job now shows a Fix with Copilot button for one-click agent handoff.
- Access is limited to Copilot Business and Copilot Enterprise, so it is not a free GitHub workflow upgrade.
- The biggest win is CI repair latency: failed tests can move straight into an agent patch loop without opening an IDE.
Source: github.blog/changelog/2026-05-18-one-click-fixes-for-faiRead original →
FIG-0011:1
70radar
FIG-0011:1