Good news: llama.cpp seems to be close to supporting MTP on Qwen models. Bad news: every single GGUF will have to be redone when it is.
I'll never understand why people merge reasoning models with non-reasoning models. The result is worse every time. You have to train reasoning models on reasoning data, and merge reasoning models with reasoning models.
Nemotron
nvidia/Llama-3_3-Nemotron-Super-49B-v1 — Text Generation • 50B • Updated Oct 15, 2025 • 38.6k • 323
nvidia/Llama-3.1-Nemotron-Nano-4B-v1.1 — Text Generation • 5B • Updated Oct 15, 2025 • 5.09k • 114
TheDrummer/Valkyrie-49B-v1 — Updated May 25, 2025 • 40 • 58
Best for RP on mobile dGPU
Models without twee romantic language, absurd erotica clichés, or low coherence. These models are top of their weight class.
RichardErkhov/shadowml_-_DareBeagel-2x7B-gguf — Updated Aug 31, 2024 • 471
mradermacher/IceLemonTeaRP-32k-7b-GGUF — 7B • Updated May 6, 2024 • 101 • 12
bartowski/L3-8B-Lunaris-v1-GGUF — Text Generation • Updated Jun 29, 2024 • 2.95k • 26