SlopCodeBench: Benchmarking How Coding Agents Degrade Over Long-Horizon Iterative Tasks Paper • 2603.24755 • Published 5 days ago • 25
Running on CPU Upgrade Featured 67 Cohere Multilingual ASR 🎙 67 Transcribe audio clips to text in many languages
view article Article How I contributed a new model to the Transformers library using Codex about 12 hours ago • 15
RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation Paper • 2603.25804 • Published 4 days ago • 19
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published 13 days ago • 91
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation Paper • 2603.22117 • Published 8 days ago • 27
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published 5 days ago • 34
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate Paper • 2504.19874 • Published Apr 28, 2025 • 23