Qwen-AgentWorld: Language World Models for General Agents • Paper • 2606.24597 • Published 4 days ago • 121
AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction • Paper • 2606.23449 • Published 5 days ago • 28
Go-with-the-Track: Video Compositing and Motion Control with Point Tracking • Paper • 2606.20891 • Published 9 days ago • 3
DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams • Paper • 2606.21337 • Published 8 days ago • 70
Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention • Paper • 2606.20945 • Published 9 days ago • 64
PAIWorld: A 3D-Consistent World Foundation Model for Robotic Manipulation • Paper • 2606.18375 • Published 11 days ago • 11
MaineCoon: Pursuing A Real-Time Audio-Visual Social World Model • Paper • 2606.17800 • Published 11 days ago • 13
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data • Paper • 2606.13432 • Published 16 days ago • 109
Avatar V: Scaling Video-Reference Avatar Video Generation • Paper • 2606.13872 • Published 16 days ago • 9
Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation • Paper • 2606.17030 • Published 12 days ago • 30
PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory • Paper • 2606.16449 • Published 12 days ago • 5
Beyond Monolingual Deep Research: Evaluating Agents and Retrievers with Cross-Lingual BrowseComp-Plus • Paper • 2606.15345 • Published 14 days ago • 16
ActWorld: From Explorable to Interactive World Model via Action-Aware Memory • Paper • 2606.17730 • Published 11 days ago • 8
iMaC: Translating Actions into Motion and Contact Images for Embodied World Models • Paper • 2606.09813 • Published 19 days ago • 13