`gemma-4-31B-it-DFlash` released, but blocked on `llama.cpp` support
Weights are up on Hugging Face, but local testing is still blocked by unmerged llama.cpp PR #22105. Useful only for tracking right now; wait for merge before judging real usability.
- The model is already published at huggingface.co/z-lab/gemma-4-31B-it-DFlash, so distribution started before runtime support landed.
- Testing is gated by ggml-org/llama.cpp PR #22105; until that PR merges, local inference is effectively blocked.
- This is a release you bookmark, not one you deploy today. The next real checkpoint is the PR merge, followed by compatibility and performance checks.
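
Since the checkpoint to watch is the PR merge, a small script can do the tracking. The following is a minimal sketch, assuming the public GitHub REST endpoint for a single pull request and its `state` and `merged` fields; the helper names `describe_pr` and `fetch_pr` and the `User-Agent` string are illustrative, not part of any project tooling.

```python
import json
import urllib.request

# GitHub REST endpoint for a single pull request.
API_URL = "https://api.github.com/repos/{repo}/pulls/{number}"


def describe_pr(payload: dict) -> str:
    """Classify a pull-request payload as 'merged', 'closed', or 'open'."""
    if payload.get("merged"):
        return "merged"
    # A closed PR that was not merged was abandoned or superseded.
    return "closed" if payload.get("state") == "closed" else "open"


def fetch_pr(repo: str, number: int) -> dict:
    """Fetch the pull-request JSON (unauthenticated, so rate limits apply)."""
    req = urllib.request.Request(
        API_URL.format(repo=repo, number=number),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "pr-merge-check",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Calling `describe_pr(fetch_pr("ggml-org/llama.cpp", 22105))` returns one of the three status strings. A `merged` result only means the code has landed upstream; a local build still has to include that commit before the model can be tested.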