view article Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 164
view article Article From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output MaziyarPanahi • Feb 7 • 22
view article Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 burtenshaw, SaylorTwift, kramp, merve, davanstrien, nielsr, julien-c • Feb 4 • 90
view article Article RexRerankers: SOTA Rankers for Product Discovery and AI Assistants thebajajra • Jan 24 • 44
view article Article How We Built a Semantic Highlight Model To Save Token Cost for RAG zilliz • Jan 15 • 67
view article Article Open Responses: What you need to know +2 evalstate, burtenshaw, merve, pcuenq • Jan 15 • 112
view article Article The Optimal Architecture for Small Language Models codelion • Dec 26, 2025 • 121
view article Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 396
view article Article Diffusers welcomes FLUX-2 +6 YiYiXu, dg845, sayakpaul, OzzyGT, dn6, ariG23498, linoyts, multimodalart • Nov 25, 2025 • 191
view article Article Introducing RTEB: A New Standard for Retrieval Evaluation +4 fzliu, KennethEnevoldsen, Samoed, isaacchung, tomaarsen, fzoll • Oct 1, 2025 • 144
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published Sep 10, 2025 • 130
Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective Paper • 2509.22921 • Published Sep 26, 2025 • 12
view article Article Gaia2 and ARE: Empowering the community to study agents +9 clefourrier, gregmialz, mlcu, mortimerp9, XciD, tfrere, evijit, RomainFroger, dheeraj7596, CarolinePascal, upiter • Sep 22, 2025 • 134
view article Article RexBERT: Encoders for a brave new world of E-Commerce thebajajra • Sep 20, 2025 • 50