`Qwen3.7-Max`: Agent-First Proprietary Model

A proprietary model is being positioned for coding, office automation, and very long autonomous runs. Strong benchmark numbers make it worth testing for agent workflows, though API cost and access still decide adoption.

[ KEY POINTS ]

Targets coding, debugging, office automation, and hundreds to thousands of autonomous steps; this is agent runtime territory, not simple chat.
Scores 69.7 on Terminal Bench 2.0-Terminus and 92.4 on GPQA Diamond; useful signal for coding plus reasoning evals.
The reported 35-hour autonomous run matters for long workflows, but real value depends on reliability, tool use, and pricing.

Original: news.hada.io/topic?id=29716

// related

#0001 | Models & API | r/LocalLLaMA | 6 hours ago
Cohere launches `Command A+`, an Apache 2.0 MoE open-weight model
80 radar
Command A+: open-weight LLM, Apache 2.0 MoE model
A practical open-weight model enters the agent stack. Apache 2.0 plus strong quantization makes local or self-hosted experiments cheaper to justify.
- Command A+ is Cohere’s first MoE model; top-line performance still needs work, but speed and responsiveness are the claimed edge.
- The model is released under Apache 2.0, so commercial use and product integration have fewer license traps.
- Quantization is positioned as a core feature: it runs well on 1-2 GPUs, making self-hosted agent backends more realistic.
- Cohere frames it as the kind of model behind its enterprise agents, not just a benchmark artifact.
Source: www.reddit.com/r/LocalLLaMA/comments/1tizmar/re_what_eve
#0002 | Models & API | r/LocalLLaMA | 20 hours ago
`Qwen3.7 Max` hits 5th on Artificial Analysis; 27B/35B still pending
60 radar
Qwen3.7 Max: large language model, high-end Alibaba Qwen variant
Artificial Analysis puts it near GPT 5.4 xhigh and above Gemini 3.5 Flash. Strong benchmark signal, but migration waits on API price and smaller-model results.
- Ranked 5th on Artificial Analysis, roughly tied with GPT 5.4 xhigh; credible enough to add to model eval lists.
- Gemini 3.5 Flash sits one step lower in the cited ranking, so latency/price will decide the practical winner.
- Qwen3.6 27B trails Max by 6 points; the 27B/35B Qwen3.7 results matter for local or cheaper deployment.
Source: www.reddit.com/r/LocalLLaMA/comments/1tie6gy/qwen37_max_
#0003 | Models & API | vercel_blog | 21 hours ago
`Grok Build 0.1` lands on `Vercel AI Gateway`
60 radar
Grok Build 0.1: agentic coding model that powers the Grok Build CLI
A beta agentic coding model is now callable through the AI SDK. Useful for quick experiments, but early access and fixed reasoning keep it from being a default choice.
- Set the model to `xai/grok-build-0.1` in the AI SDK; integration cost is low if the app already uses AI Gateway.
- Reasoning effort is not configurable, and there is no non-reasoning mode. Latency and cost controls are limited for production agents.
- AI Gateway adds usage tracking, cost reporting, retries, failover, BYOK, and provider routing around the model call.
Source: vercel.com/changelog/grok-build-0-1-now-available-on-ver
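
A minimal sketch of what the integration might look like, assuming the AI SDK's `generateText` accepts a gateway model string and returns an object with a `text` field. Only the model ID `xai/grok-build-0.1` comes from the item above; `GenerateFn`, `buildRequest`, and `askGrokBuild` are hypothetical names used for illustration. Because reasoning effort cannot be configured, the sketch sends no reasoning option and records wall-clock latency on the caller side instead.

```typescript
// Model ID exactly as given in the item above.
const GROK_BUILD_MODEL = "xai/grok-build-0.1";

// Minimal shape of the call this sketch relies on. The AI SDK's
// `generateText` is assumed to be compatible with it; verify against
// the SDK version you have installed.
type GenerateFn = (opts: { model: string; prompt: string }) => Promise<{ text: string }>;

// Hypothetical helper: builds the request options. There is deliberately
// no reasoning-effort field, since the model does not expose one.
function buildRequest(prompt: string): { model: string; prompt: string } {
  return { model: GROK_BUILD_MODEL, prompt };
}

// Hypothetical helper: runs one prompt and measures elapsed time, so
// latency can at least be observed even though it cannot be tuned.
async function askGrokBuild(
  generate: GenerateFn,
  prompt: string,
): Promise<{ text: string; elapsedMs: number }> {
  const started = Date.now();
  const { text } = await generate(buildRequest(prompt));
  return { text, elapsedMs: Date.now() - started };
}
```

With the AI SDK installed and gateway credentials configured, the call would presumably be `askGrokBuild(generateText, "...")`, with `generateText` imported from `ai`. Passing the generate function in as a parameter keeps the helper testable with a stub and makes it easy to swap models later.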
