`Qwen3.6 27B` pure `Q4_K_M` GGUF fits in 16GB VRAM

Pure quantization trims enough size to keep the whole model on a consumer GPU. Useful for local agent tests, but quality loss is real and benchmark depth is thin.

[ KEY POINTS ]

- `Q4_K_M` with MTP (multi-token prediction) is 15.4GB and without MTP is 15.1GB; comparable builds listed at 16.5-18GB often spill past 16GB cards.
- MTP reaches 40 tok/s token generation (tg) but only 195 tok/s prompt processing (pp); non-MTP flips the trade-off at 715 tok/s pp and 24 tok/s tg.
- Perplexity delta is larger than for Unsloth's quant: +0.1707 vs +0.0553 on MTP, so the size win buys speed and fit at some quality cost.
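
The pp/tg trade-off above depends on how much of a request is prompt versus output. A rough sketch of that arithmetic follows, assuming request time is simply prompt tokens divided by pp plus output tokens divided by tg. That model is an assumption: it ignores prompt caching, batching, and load time, and the two workloads are hypothetical examples, not measurements from the original post.

```python
# Throughput figures quoted in the key points above (tokens per second).
MTP = {"pp": 195.0, "tg": 40.0}
NON_MTP = {"pp": 715.0, "tg": 24.0}


def request_seconds(build, prompt_tokens, output_tokens):
    """Estimated wall time: prompt processing plus generation, nothing else."""
    return prompt_tokens / build["pp"] + output_tokens / build["tg"]


def break_even_ratio(a, b):
    """Output/prompt token ratio at which builds a and b take equal time."""
    return (1 / a["pp"] - 1 / b["pp"]) / (1 / b["tg"] - 1 / a["tg"])


ratio = break_even_ratio(MTP, NON_MTP)  # roughly 0.22 with the figures above

# Hypothetical workloads, chosen only for illustration.
for prompt, output in [(8000, 500), (1000, 1000)]:
    mtp_s = request_seconds(MTP, prompt, output)
    non_s = request_seconds(NON_MTP, prompt, output)
    winner = "MTP" if mtp_s < non_s else "non-MTP"
    print(f"{prompt} in / {output} out: MTP {mtp_s:.1f}s, non-MTP {non_s:.1f}s -> {winner}")
```

Under this simplified model, MTP comes out ahead once the output is longer than roughly 22% of the prompt. Long-context agent loops with short replies would lean toward non-MTP, while chat-style use with short prompts would lean toward MTP.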

Original: www.reddit.com/r/LocalLLaMA/comments/1tkzk9e/qwen36_27b_pure_quant_40_toks_on_16_gb_vram/

// related

#0001 Other GeekNews yesterday
Use HTML `<dl>` for Name-Value UI Patterns
40 radar
<dl> is the semantic HTML element for name-value pairs. It fits amenities, invoices, specs, and glossary UIs, so it is a small but useful accessibility and markup habit.
- <dl> represents a list of name-value pairs, not just a dictionary-style glossary.
- Good fits include amenities, billing line items, technical terms, and product spec rows where labels and values repeat.
- Using <dt> for names and <dd> for values keeps UI markup semantic without adding custom div-heavy structure.
Source: news.hada.io/topic?id=29821
#0002 Other r/MachineLearning 2 days ago
Vision LLMs vs. OCR for PDF Q&A: OCR Still Wins on Cost and Accuracy
60 radar
For complex PDF Q&A, vision LLMs are pricier and less accurate than OCR pipelines. Stick with OCR for better cost-performance and reliability.
- The native vision LLM approach was the most expensive at $0.2552/query and ranked 5th in accuracy (52.0%) out of six methods tested.
- Vision models underperformed on chart- and table-heavy pages, the very area they were expected to excel in. Premium OCR handled these better.
- The vision LLM had a 7% intrinsic failure rate that persisted after retries, while OCR-based pipelines showed 0% failure, indicating higher reliability.
Source: www.reddit.com/r/MachineLearning/comments/1tm0cqg/vision
#0003 Other GeekNews 2 days ago
`Electrobun 2.0` to Split from `Bun` After Rust Rewrite
40 radar
Electrobun: desktop app framework that packages native apps with web tech
Desktop app runtime dependencies are changing, with less reliance on Bun. Useful only if you are evaluating Electron alternatives; otherwise low urgency.
- Electrobun 2.0 is moving toward a Rust rewrite and reduced Bun dependency, changing its runtime structure.
- The split was influenced by concerns that Anthropic lacks sufficient human review, staged rollouts, and a stabilization process.
- This is a niche signal for desktop app stacks: wait for migration notes before building production workflows on it.
Source: news.hada.io/topic?id=29815
