
2026-05-02

Saturday, May 2, 2026 · 13 items · ★ avg 5.5

#0013 · Agents & tools · r/LocalLLaMA · 3 weeks ago
`LDR` pushes local deep research to **95.7%** `SimpleQA` on one `RTX 3090`
60 · radar
LDR · Local deep research agent: 95.7% on one RTX 3090
Agentic search, not closed-book recall, is doing the heavy lifting here. A fully local stack is now close to hosted deep-research scores, so private research workflows on prosumer GPUs look practical right now.
- The stack uses Ollama, qwen3.6:27b, and langgraph_agent with tool-calling, parallel subtopic splits, and up to 50 iterations; orchestration quality matters as much as model size.
- Reported scores are 95.7% on SimpleQA and 77.0% on xbench-DeepSearch, versus 91.2% / 59.0% for Qwen3.5-9B; newer Qwen gains show up strongly in tool-heavy loops.
- This is benchmarked with search enabled, so it competes more directly with Perplexity Deep Research and Tavily than with pure closed-book QA.
- Caveats are non-trivial: small sample sizes, self-grading noise, possible SimpleQA contamination, and a Chinese-language benchmark that may favor Qwen.
- LDR also adds practical infra: journal-quality grading via OpenAlex/DOAJ, per-user SQLCipher encryption, and zero telemetry.
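The loop shape described in the first bullet (parallel subtopic splits, a hard iteration cap) can be sketched roughly as follows. This is a minimal illustration, not LDR's code: `split_into_subtopics`, `fake_search`, and the refinement step are hypothetical stand-ins, and only the cap of 50 iterations comes from the post.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

MAX_ITERATIONS = 50  # iteration cap reported in the post


def split_into_subtopics(question: str) -> list[str]:
    # Hypothetical planner; a real agent would ask the model for subtopics.
    return [f"{question} (background)", f"{question} (recent sources)"]


def research(question: str,
             search: Callable[[str], list[str]],
             is_sufficient: Callable[[list[str]], bool]) -> list[str]:
    """Search all subtopics in parallel; stop when evidence suffices or the cap is hit."""
    evidence: list[str] = []
    queries = split_into_subtopics(question)
    for _ in range(MAX_ITERATIONS):
        with ThreadPoolExecutor(max_workers=4) as pool:
            # pool.map preserves query order, so results stay deterministic.
            for hits in pool.map(search, queries):
                evidence.extend(hits)
        if is_sufficient(evidence):
            break
        # Hypothetical refinement: narrow each query for the next round.
        queries = [q + " details" for q in queries]
    return evidence


def fake_search(query: str) -> list[str]:
    # Stand-in for a real search tool call.
    return [f"note about: {query}"]


notes = research("Who designed the Eiffel Tower?", fake_search,
                 lambda ev: len(ev) >= 4)
```

With the stand-ins above the loop stops after two rounds; in a real stack the stopping rule would be the model's own judgment that the question is answered.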
Source: www.reddit.com/r/LocalLLaMA/comments/1t1n6o8/we_are_fina…
#0012 · Idea signals · Hacker News · Show HN AI · 3 weeks ago
`SimplePDF Copilot`: AI PDF form filling with client-side tool calling
60 · radar
SimplePDF Copilot · PDF copilot: fills and edits forms in-browser
The useful twist is not PDF chat but browser-executed actions on live forms. Strong privacy framing plus BYOK/local model support makes this a solid pattern for document workflows worth copying.
- The PDF never leaves the browser; parsing, rendering, and field detection run client-side, which sharply reduces PII exposure.
- Actions go beyond retrieval: fill fields, focus inputs, add fields, and delete pages, turning the LLM into an editor operator.
- Model traffic is separable from document handling: default proxy exists, but BYOK and local setups like LM Studio are already supported.
- Distribution signal is real: the underlying editor already serves 200k+ monthly users, so this is attached to an existing workflow, not a lab demo.
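The "editor operator" idea in the bullets above comes down to mapping model-issued tool calls onto local mutations. Here is a minimal sketch of that dispatch pattern, assuming a hypothetical `FormState` model and tool names that mirror the listed actions. The real product runs in the browser, so this Python version only models the pattern and is not SimplePDF's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FormState:
    # Hypothetical in-memory stand-in for a form loaded in the browser.
    fields: dict[str, str] = field(default_factory=dict)
    pages: list[int] = field(default_factory=lambda: [1, 2, 3])
    focused: Optional[str] = None


def add_field(state: FormState, name: str) -> str:
    state.fields.setdefault(name, "")
    return "ok"


def fill_field(state: FormState, name: str, value: str) -> str:
    if name not in state.fields:
        return f"error: unknown field {name!r}"
    state.fields[name] = value
    return "ok"


def focus_field(state: FormState, name: str) -> str:
    if name not in state.fields:
        return f"error: unknown field {name!r}"
    state.focused = name
    return "ok"


def delete_page(state: FormState, page: int) -> str:
    if page not in state.pages:
        return f"error: no page {page}"
    state.pages.remove(page)
    return "ok"


TOOLS: dict[str, Callable[..., str]] = {
    "add_field": add_field,
    "fill_field": fill_field,
    "focus_field": focus_field,
    "delete_page": delete_page,
}


def dispatch(state: FormState, call: dict) -> str:
    """Run one model-issued tool call against local state; return a result string."""
    handler = TOOLS.get(call.get("tool", ""))
    if handler is None:
        return f"error: unknown tool {call.get('tool')!r}"
    try:
        return handler(state, **call.get("args", {}))
    except TypeError as exc:  # wrong or missing arguments from the model
        return f"error: bad arguments ({exc})"
```

Returning error strings instead of raising lets the model see what went wrong and retry, which is a common choice in tool-calling loops.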
Source: copilot.simplepdf.com/?share=a7d00ad073c75a75d493228e6ff…
#0011 · Agents & tools · Hacker News · Show HN AI · 3 weeks ago
`agent-desktop`: structured native desktop automation CLI for AI agents
60 · radar
agent-desktop · CLI tool: native UI automation via accessibility trees
Desktop agents can skip pixel-click loops and operate on the accessibility tree instead. That cuts token cost by 78% to 96% in heavy apps and makes local app automation practical enough to test now.
- Instead of screenshot -> coordinate prediction -> click, it uses OS accessibility APIs on macOS, Windows, and Linux, so actions target real UI elements.
- The CLI ships as a single ~15 MB Rust binary with 53 commands and JSON output, making it easy to wire into agent loops and scripts.
- Progressive skeleton traversal avoids dumping the full UI tree; Slack-sized trees can exceed 50,000 tokens, so shallow snapshots plus subtree fetches matter.
- Element refs like @e12 and subtree re-querying let agents act, then refresh only the changed region, which is faster than full re-snapshots.
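The shallow-snapshot-plus-subtree-fetch idea from the last two bullets can be illustrated with a toy tree. This is a sketch under assumptions, not the tool's actual output format: the `Node` type, the `collapsed_children` key, and the sample tree are hypothetical, and only the `@e12`-style refs come from the source.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    ref: str    # element ref in the style of "@e12"
    role: str
    name: str = ""
    children: list["Node"] = field(default_factory=list)


def snapshot(node: Node, depth: int) -> dict:
    """Serialize the tree down to `depth`; deeper levels collapse to a child count."""
    out: dict = {"ref": node.ref, "role": node.role, "name": node.name}
    if depth > 0:
        out["children"] = [snapshot(c, depth - 1) for c in node.children]
    elif node.children:
        out["collapsed_children"] = len(node.children)
    return out


def find(node: Node, ref: str) -> Optional[Node]:
    """Depth-first lookup of a node by its ref."""
    if node.ref == ref:
        return node
    for child in node.children:
        hit = find(child, ref)
        if hit is not None:
            return hit
    return None


root = Node("@e1", "window", "Chat", [
    Node("@e2", "toolbar", "", [Node("@e3", "button", "New"),
                                Node("@e4", "button", "Search")]),
    Node("@e5", "list", "Messages", [Node("@e6", "row", "hello"),
                                     Node("@e7", "row", "world")]),
])

shallow = snapshot(root, depth=1)              # cheap overview for the agent
subtree = snapshot(find(root, "@e5"), depth=1)  # expand only the region of interest
```

The agent reads `shallow` first, then pays tokens only for the subtree it needs, which is where the savings on large trees would come from.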
Source: github.com/lahfir/agent-desktop
#0010 · Agents & tools · GitHub Changelog · 3 weeks ago
`GitHub Copilot` to phase out `GPT-5.2` and `GPT-5.2-Codex`
60 · radar
GitHub is clearing older model defaults across Copilot surfaces. If these models are baked into your workflow, switch now and recheck output quality before the cutoff.
- Scope is broad: Copilot Chat, inline edits, ask, agent modes, and code completion are all included, so this is not a niche surface change.
- There is an exception note for GPT-5.2-Codex in part of Copilot, which suggests a staged migration rather than an immediate hard stop everywhere.
- If model names live in team docs, prompts, or eval baselines, update them now and compare speed and code quality against the replacement model.
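One way to act on the last bullet is to scan docs, prompts, and eval configs for the two model names. This is a small sketch: the file suffixes and the assumption that names are written with hyphens (as in the changelog title) are mine, so adjust both to your repo.

```python
import re
from pathlib import Path

# Names from the changelog item. The optional "-codex" suffix is tried first,
# and the word boundaries keep "gpt-5.25" from matching.
DEPRECATED = re.compile(r"\bgpt-5\.2(?:-codex)?\b", re.IGNORECASE)


def find_mentions(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_name) for each deprecated model mention."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in DEPRECATED.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_tree(root: str,
              suffixes: tuple[str, ...] = (".md", ".txt", ".yml", ".yaml", ".json")) -> dict:
    """Map each matching file under `root` to its list of mentions."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = find_mentions(path.read_text(encoding="utf-8", errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report
```

Running `scan_tree(".")` from a repo root gives a per-file list to work through before the cutoff.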
Source: github.blog/changelog/2026-05-01-upcoming-deprecation-of…
#0009 · Agents & tools · r/ClaudeAI · 3 weeks ago
Anthropic Opens `Claude Security` Public Beta for Enterprise
60 · radar
Claude Security · Security scanner: context-aware, self-verifying findings
Instead of pattern matching, it traces code, history, and logic, then challenges each finding before surfacing it. Promising direction, but AI-written fixes for critical systems still need strict human review.
- It targets high-severity bugs like memory corruption, injection, auth bypass, and logic flaws by reading cross-file context, not just signatures.
- Each finding goes through adversarial self-verification first, cutting the false-positive flood that makes many scanners easy to ignore.
- Every alert includes a proposed patch shaped to the existing code style, which shortens triage but does not remove review overhead.
- Delivery hooks include Slack, Jira, webhooks, scheduled scans, and directory scoping, so it fits existing security workflows.
Source: www.reddit.com/r/ClaudeAI/comments/1t12l3t/anthropic_jus…
#0008 · Agents & tools · Hacker News · Show HN AI · 3 weeks ago
`Adam` launches in-CAD agent beta for `Fusion` and `Onshape`
60 · radar
Adam · CAD agent: reads and edits feature trees directly
Instead of dumping out black-box STLs, it edits the existing feature tree inside the CAD tool. That makes it far more credible for real mechanical workflows; worth testing now if CAD cleanup or parametrization is a bottleneck.
- The product moved from text-to-3D demos to direct CAD integration, targeting engineers who want visibility and control over the feature tree.
- Current use cases are practical: merge redundant features, rename messy trees, add 2mm internal fillets, parametrize models, and still generate CAD end to end.
- The stack leans on FeatureScript for Onshape and Python for Fusion, which suggests a code-first approach instead of screenshot-level UI automation.
- They claim a large jump in spatial reasoning from recent frontier models and route tasks to whichever model wins internally, not a single-model stack.
Source: fusion.adam.new/install
#0007 · Idea signals · r/microsaas · 3 weeks ago
Faceless SEO automation found its first **7 paying users** through niche community posts
50 · radar
A dev who avoided SEO and personal-brand marketing still got early traction by automating search visibility for builders. The signal is clear: workflow-driven, low-exposure growth tools can sell without a big launch, so this niche is worth testing.
- The pain is specific: manual keyword research, X posting, and founder-led video marketing were bad enough to trigger building a product instead.
- Early distribution was tiny but relevant: a few honest comments in developer subreddits beat ads, Product Hunt, and discount-driven launch tactics.
- The product promise is narrow and useful: automate technical SEO structure so small projects can earn search visibility without public founder content.
- The open question shifts from validation to systemization. Getting the first users was possible; building a repeatable acquisition loop is the real gap.
Source: www.reddit.com/r/microsaas/comments/1t1pjl2/launched_las…
#0006 · Other · r/ClaudeAI · 3 weeks ago
Offload cheap work from `Claude Code` to a low-cost helper model
50 · radar
Stop spending premium context on bulk reads and boilerplate. Route low-value CLI work to a cheap model, keep core reasoning in Claude, and the savings are large enough to try now.
- A Bash-called helper handled bulk file reads and boilerplate generation, while CLAUDE.md defined when to delegate versus keep work in Claude.
- Over 3 weeks, the author stopped hitting the weekly Pro limit. The pattern reduced pressure on the primary model instead of chasing prompt tweaks.
- Doc updates reportedly dropped from about 5,000 tokens to 200 tokens. Expensive context was reserved for judgment, not mechanical edits.
- Total helper-model spend was just $0.38. If your workflow burns tokens on repetitive prep work, this routing pattern is hard to ignore.
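The routing rule and the reported doc-update saving can be made concrete in a few lines. This is a toy sketch: the task labels and the `route` function are hypothetical (the post expresses its rules in CLAUDE.md and calls the helper from Bash), and only the 5,000 and 200 token figures come from the source.

```python
# Hypothetical task labels; the post describes delegating bulk file reads
# and boilerplate generation to the helper model.
DELEGATED = {"bulk_read", "boilerplate"}


def route(task_kind: str) -> str:
    """Send mechanical work to the cheap helper, everything else to the primary model."""
    return "helper" if task_kind in DELEGATED else "primary"


def percent_saved(before: int, after: int) -> float:
    """Percentage reduction in tokens between two measurements."""
    return (before - after) * 100 / before


doc_update_saving = percent_saved(5000, 200)  # figures reported in the post
```

On the reported numbers that is a 96% reduction per doc update, or 25x fewer tokens landing in the primary model's context.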
Source: www.reddit.com/r/ClaudeAI/comments/1t1o43w/i_gave_claude…
#0005 · Agents & tools · Hacker News · Show HN AI · 3 weeks ago
`MLJAR Studio`: local AI data analyst that turns chats into notebooks
50 · radar
MLJAR Studio · AI data app: turns chat analysis into .ipynb
Natural-language data analysis now ends as a runnable .ipynb, not a dead-end chat. Local execution, auto environment setup, and built-in AutoML make it a practical bridge between Jupyter and cloud copilots, though the $199 one-time price limits impulse adoption.
- Chat-driven analysis is saved as a reproducible .ipynb; generated Python stays inspectable, editable, and rerunnable instead of disappearing in chat history.
- The app auto-creates a local Python environment and installs missing packages during the session, cutting setup friction on Mac, Windows, and Linux.
- Built-in AutoML covers tabular classification, regression, and multiclass work, so exploratory analysis and baseline modeling sit in one desktop workflow.
- Data ingress is broad: CSV, Excel, Stata, Parquet plus PostgreSQL, MySQL, SQL Server, Snowflake, Databricks, and Supabase.
- Model choice is flexible with local Ollama, user-supplied OpenAI API keys, or the vendor add-on; zero-egress local mode is the clearest differentiator.
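To show what "chat saved as a reproducible .ipynb" means mechanically, here is a minimal sketch that turns chat turns into a notebook using only the standard library. It is not MLJAR's implementation: the `turns` structure with `prompt` and `code` keys is an assumption, and the output follows the public nbformat 4 JSON layout.

```python
import json


def chat_to_notebook(turns: list[dict]) -> dict:
    """Build an nbformat-4 notebook dict: each prompt becomes a markdown cell,
    each generated snippet becomes a code cell."""
    cells = []
    for turn in turns:
        cells.append({"cell_type": "markdown", "metadata": {},
                      "source": turn["prompt"]})
        cells.append({"cell_type": "code", "execution_count": None,
                      "metadata": {}, "outputs": [],
                      "source": turn["code"]})
    # Minor version 4 predates required cell ids, which keeps the sketch short.
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = chat_to_notebook([
    {"prompt": "Show the first rows of the sales data.",
     "code": "import pandas as pd\ndf = pd.read_csv('sales.csv')\ndf.head()"},
])
notebook_json = json.dumps(notebook, indent=1)  # write this string to a .ipynb file
```

Because the generated code lands in ordinary code cells, it stays editable and rerunnable in any Jupyter-compatible tool.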
Source: mljar.com/
#0004 · Idea signals · Hacker News · MCP Server · 3 weeks ago
`SimplePDF Copilot`: AI PDF form filling with client-side tool calling
50 · radar
SimplePDF Copilot · PDF AI tool: edits forms and pages in-browser
The useful part here is not another chat UI but a browser-executed action loop on top of PDFs. Keeping parsing, rendering, and field ops client-side while swapping LLM backends makes this pattern worth copying now.
- SimplePDF says it already serves 200k+ monthly users, so this is attached to an existing workflow rather than a toy demo.
- The PDF never leaves the browser; only extracted text and messages go to the model. That is a cleaner privacy split for forms, healthcare, and PII-heavy flows.
- Tools do real mutations: fill fields, add fields, focus fields, delete pages. This is closer to an in-browser agent than typical 'chat with PDF' retrieval.
- Backend choice is flexible: default proxy, BYOK to any cloud model, or local via LM Studio. That reduces lock-in and gives a reusable architecture pattern.
- The interesting implementation detail is iframe postMessage for client-side tool execution. Same structure can fit canvases, editors, dashboards, and other embedded SaaS UIs.
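The postMessage detail in the last bullet implies a small request/reply protocol between the host page and the embedded editor. The sketch below models that protocol in Python, as JSON envelopes with a correlation id. The envelope fields and function names are assumptions rather than SimplePDF's actual message format, and in a browser the two sides would exchange these strings via `window.postMessage`.

```python
import itertools
import json

_ids = itertools.count(1)


def make_request(tool: str, args: dict) -> str:
    """Host side: wrap a tool call in an envelope with a correlation id."""
    return json.dumps({"type": "tool_call", "id": next(_ids),
                       "tool": tool, "args": args})


def handle_message(raw: str, handlers: dict) -> str:
    """Embedded side: run the named handler and echo the id back in the reply."""
    msg = json.loads(raw)
    handler = handlers.get(msg["tool"])
    if handler is None:
        reply = {"ok": False, "error": f"unknown tool {msg['tool']}"}
    else:
        reply = {"ok": True, "result": handler(**msg["args"])}
    return json.dumps({"type": "tool_result", "id": msg["id"], **reply})
```

Echoing the id lets the host match replies to pending calls even when several are in flight, which is what makes the same structure reusable for other embedded UIs.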
Source: copilot.simplepdf.com/?share=a7d00ad073c75a75d493228e6ff…
#0003 · Agents & tools · r/ClaudeAI · 3 weeks ago
`/graphify` turned from code memory into a general-purpose knowledge graph layer
50 · radar
/graphify · Knowledge graph tool: query code, docs, and images
Usage escaped code almost immediately: people query SQL schemas, notes, papers, meetings, and even whiteboard photos through one graph. The real product looks less like repo memory and more like a cross-format retrieval layer worth testing now.
- /graphify ingests an entire repo, builds a graph with Leiden community detection, and reportedly cuts query tokens 71x versus raw file reads.
- Adoption spiked fast: 450k+ PyPI downloads in 26 days, about 40k GitHub stars, and a brief #2 global GitHub rank.
- Behavior shifted from code assistance to broad knowledge retrieval; users feed in schemas, Obsidian vaults, paper corpora, transcripts, and photos.
- The center of gravity moved to the /graphify query "..." flow, suggesting search and memory beat one-time code indexing as the sticky feature.
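To see why a graph query can be cheaper than raw file reads, here is a toy sketch: group related items once, then answer a query by returning only the matching group. It swaps in plain connected components as a simple stand-in for Leiden community detection, and the edge list and function names are hypothetical, so treat it as an illustration of the idea rather than how /graphify works.

```python
from collections import defaultdict


def build_graph(edges: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Undirected adjacency map from (source, target) reference pairs."""
    graph: dict[str, set[str]] = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def communities(graph: dict[str, set[str]]) -> list[set[str]]:
    """Connected components, used here as a stand-in for Leiden communities."""
    seen: set[str] = set()
    groups: list[set[str]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        group: set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


def query(groups: list[set[str]], term: str) -> list[str]:
    """Return members of every group containing a node whose name includes `term`."""
    hits: set[str] = set()
    for group in groups:
        if any(term in node for node in group):
            hits |= group
    return sorted(hits)


graph = build_graph([("api.py", "db.py"), ("db.py", "schema.sql"),
                     ("notes.md", "paper.pdf")])
groups = communities(graph)
related = query(groups, "schema")
```

Only the files in the matching group need to reach the model, which is the intuition behind the reported token reduction.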
Source: www.reddit.com/r/ClaudeAI/comments/1t18eeh/i_built_graph…
#0002 · Other · r/ClaudeAI · 3 weeks ago
`Serno` pivots from AI debate toy to research canvas for cross-model verification
50 · radar
Serno · AI research canvas: cross-checks models via debate
People used the original two-model chat to check which model was hallucinating, not for entertainment. Serno rebuilt it as a canvas where multiple models investigate, debate, and expose assumptions, making it worth trying for high-stakes research workflows.
- Users turned the earlier Roundtable chat into a hallucination-checking tool, proving cross-model verification is a real workflow pain.
- The team concluded chat is a bad UI for big questions: long threads bury context, side-by-side chats add overhead, and deep research demands too much reading.
- The new canvas mode splits a question into angles, assigns multiple models to investigate, then stages a debate so assumptions are surfaced instead of hidden in one polished answer.
- Regular chat remains for lightweight tasks, while canvas is positioned for questions where answer quality and traceability matter more than speed.
- Claude handles most of the heavy lifting, and the product is offered with free starter credits, lowering the cost to test the workflow immediately.
Source: www.reddit.com/r/ClaudeAI/comments/1t144z1/i_got_tired_o…
#0001 · Agents & tools · Cline Releases · 3 weeks ago
`Cline` `v3.82.0`, VS Code Terminal Restore and Model List Refresh
50 · radar
Cline · AI coding tool: strong terminal and model integration
Foreground terminal support in VS Code is back, and the model catalog now includes newer OpenAI, SAP AI Core, and Z AI options. Useful if Cline is in daily rotation, but this is a maintenance release rather than a must-upgrade moment.
- VS Code foreground terminal support and related settings were restored, removing a workflow break for terminal-heavy use.
- The release adds newer models from OpenAI, SAP AI Core, and Z AI, so provider coverage expands without manual patching.
- Hook template JSON escaping was fixed, which reduces avoidable failures in automated tool or workflow hooks.
- ripgrep file search error handling improved, so file discovery should fail more cleanly instead of derailing the session.
- Docs no longer ship hardcoded model lists, a small but good sign that model support is moving toward faster updates.
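The hook template fix in the third bullet belongs to a familiar class of bug: values pasted into a JSON string without escaping. The sketch below shows the failure and the usual remedy. It is a generic illustration with hypothetical function names, not Cline's code.

```python
import json


def render_naive(message: str) -> str:
    # String interpolation: breaks as soon as the value contains quotes or newlines.
    return '{"message": "%s"}' % message


def render_safe(message: str) -> str:
    # Let the JSON encoder escape the value.
    return json.dumps({"message": message})


def is_valid_json(payload: str) -> bool:
    try:
        json.loads(payload)
        return True
    except json.JSONDecodeError:
        return False


tricky = 'say "hi"\nbye'
naive_ok = is_valid_json(render_naive(tricky))
safe_ok = is_valid_json(render_safe(tricky))
```

The naive template produces invalid JSON for this input, while the encoder-based version round-trips it unchanged.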
Source: github.com/cline/cline/releases/tag/v3.82.0