Papers - an antr0x Collection

Papers

updated about 4 hours ago

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published Jun 26, 2025 • 40
Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3, 2025 • 25
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published Jul 2, 2025 • 36
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Paper • 2507.00951 • Published Jul 1, 2025 • 24
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1, 2025 • 257
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published Jul 8, 2025 • 45
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19, 2025 • 137
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10, 2025 • 99
MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs

Paper • 2508.05257 • Published Aug 7, 2025 • 13
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts

Paper • 2508.07785 • Published Aug 11, 2025 • 30
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28, 2025 • 120
Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models

Paper • 2508.21365 • Published Aug 29, 2025 • 29
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 517
Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13, 2025 • 171
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 276
Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published Oct 13, 2025 • 33
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Paper • 2510.12693 • Published Oct 14, 2025 • 28
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents

Paper • 2510.14967 • Published Oct 16, 2025 • 34
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Paper • 2510.18855 • Published Oct 21, 2025 • 73
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts

Paper • 2510.19363 • Published Oct 22, 2025 • 63
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

Paper • 2511.06805 • Published Nov 10, 2025 • 13
DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Paper • 2601.03559 • Published Jan 7 • 14
Self-Hinting Language Models Enhance Reinforcement Learning

Paper • 2602.03143 • Published Feb 3 • 31
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Paper • 2602.02488 • Published Feb 2 • 36
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Paper • 2602.04634 • Published Feb 4 • 100
Memory Intelligence Agent

Paper • 2604.04503 • Published Apr 6 • 58
Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

Paper • 2605.15301 • Published May 14 • 22
Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning

Paper • 2606.11087 • Published 25 days ago • 3
Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2606.15007 • Published 22 days ago • 16
Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs

Paper • 2606.32032 • Published 4 days ago • 21
Morphing into Hybrid Attention Models

Paper • 2606.30562 • Published 5 days ago • 33