FORGE:Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 6 days ago • 75
Communicating about Space: Language-Mediated Spatial Integration Across Partial Views Paper • 2603.27183 • Published 16 days ago • 20
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 19 days ago • 96
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 19 days ago • 96 • 5
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 19 days ago • 96
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published about 1 month ago • 148
Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30, 2025 • 13
LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs Paper • 2602.00462 • Published Jan 31 • 19
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published about 1 month ago • 148
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published about 1 month ago • 148
LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs Paper • 2602.00462 • Published Jan 31 • 19