iMaC: Translating Actions into Motion and Contact Images for Embodied World Models Paper • 2606.09813 • Published 17 days ago • 13
DRIFT: Decoupled Rollouts and Importance-Weighted Fine-Tuning for Efficient Multi-Turn Optimization Paper • 2605.31455 • Published 27 days ago • 6
Code-Guided Reasoning for Small Language Models: Evaluating Executable MCQA Scaffolds Paper • 2605.18827 • Published May 12 • 7
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published May 13 • 59
Missing Old Logits in Asynchronous Agentic RL: Semantic Mismatch and Repair Methods for Off-Policy Correction Paper • 2605.12070 • Published May 12 • 17
Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding Paper • 2605.09271 • Published May 10 • 8
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published Apr 15 • 62
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 328
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published Apr 9 • 116