From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 7 days ago • 145
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published 16 days ago • 224
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 353
Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 Feb 4 • 89
Article Introducing Daggr: Chain apps programmatically, inspect visually +3 Jan 29 • 107
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs Paper • 2601.17058 • Published Jan 22 • 190
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents Paper • 2601.16746 • Published Jan 23 • 91
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published Jan 14 • 92
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 214
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 159