Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published Apr 6 • 235
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning Paper • 2510.15444 • Published Oct 17, 2025 • 151
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation Paper • 2604.03922 • Published Apr 5 • 53
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation Paper • 2604.03922 • Published Apr 5 • 53
OneTrans: Unified Feature Interaction and Sequence Modeling with One Transformer in Industrial Recommender Paper • 2510.26104 • Published Oct 30, 2025
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation Paper • 2604.03922 • Published Apr 5 • 53