NVFP4 GGUF?
#3
by andrew-stanton - opened
My understanding is that NVIDIA trained it end to end in NVFP4, similar to how GPT-OSS-120B/20B did for MXFP4. I looked at the MXFP4_MOE quants you provided and it appears the majority of the tensors are actually in F32 and Q8. Any plans to release the natively trained NVFP4 model in GGUF?
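For anyone who wants to reproduce the tensor-type check described above, here is a minimal stdlib-only sketch that parses a GGUF header and counts how many tensors use each quant type. It is a simplified reader, not a full implementation: it assumes a file with zero metadata key-value pairs, skips alignment and tensor data entirely, and only maps a few well-known ggml type IDs (F32=0, F16=1, Q8_0=8); the `fake_header` helper is purely illustrative. A real inspection would use the `gguf` Python package that ships with llama.cpp.

```python
import io
import struct

# Subset of ggml tensor type IDs (from ggml.h); unknown IDs are
# reported as "type_<n>". NVFP4's ID is not included here since it
# was only just merged.
GGML_TYPES = {0: "F32", 1: "F16", 8: "Q8_0"}

GGUF_MAGIC = 0x46554747  # little-endian "GGUF"


def _read_str(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")


def tensor_type_counts(f):
    """Count tensors per quant type from a GGUF stream.

    Simplification: assumes the metadata KV count is zero, so the
    tensor-info records follow the header directly.
    """
    magic, version = struct.unpack("<II", f.read(8))
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    n_tensors, n_kv = struct.unpack("<QQ", f.read(16))
    if n_kv != 0:
        raise NotImplementedError("metadata parsing omitted in this sketch")
    counts = {}
    for _ in range(n_tensors):
        _read_str(f)                                  # tensor name
        (n_dims,) = struct.unpack("<I", f.read(4))
        f.read(8 * n_dims)                            # shape (uint64 each)
        (ttype,) = struct.unpack("<I", f.read(4))
        f.read(8)                                     # data offset
        key = GGML_TYPES.get(ttype, f"type_{ttype}")
        counts[key] = counts.get(key, 0) + 1
    return counts


def fake_header(tensors):
    """Build a header-only GGUF blob for demo purposes (hypothetical data)."""
    out = struct.pack("<IIQQ", GGUF_MAGIC, 3, len(tensors), 0)
    for name, dims, tid in tensors:
        nb = name.encode("utf-8")
        out += struct.pack("<Q", len(nb)) + nb
        out += struct.pack("<I", len(dims))
        for d in dims:
            out += struct.pack("<Q", d)
        out += struct.pack("<IQ", tid, 0)
    return out


demo = fake_header([
    ("blk.0.attn_q.weight", (4096, 4096), 8),  # Q8_0
    ("output_norm.weight", (4096,), 0),        # F32
])
print(tensor_type_counts(io.BytesIO(demo)))  # → {'Q8_0': 1, 'F32': 1}
```

Pointing a reader like this at the MXFP4_MOE quants is how you would confirm the F32/Q8 mix described above, once the metadata-skipping gap is filled in.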
It looks like llama.cpp support for NVFP4 was merged today?
We'll see what we can do. The llama.cpp team is always cooking.
Any update on this?
Even models that were not trained natively in NVFP4 would be of great use in this format for Blackwell users.