Language Technology Lab at Alibaba DAMO Academy

company

Recent Activity

veggiebird authored a paper 18 days ago

ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

veggiebird authored a paper 18 days ago

Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

veggiebird authored a paper 18 days ago

Retrieving Multimodal Information for Augmented Generation: A Survey

authored 10 papers 18 days ago

ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

Paper • 2311.16989 • Published Nov 28, 2023

Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

Paper • 2305.03268 • Published May 5, 2023 • 3

Retrieving Multimodal Information for Augmented Generation: A Survey

Paper • 2303.10868 • Published Mar 20, 2023

How Much are LLMs Contaminated? A Comprehensive Survey and the LLMSanitize Library

Paper • 2404.00699 • Published Mar 31, 2024

Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks

Paper • 2410.01428 • Published Oct 2, 2024 • 1

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

Paper • 2504.00993 • Published Apr 1, 2025 • 3

Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 96

LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling

Paper • 2511.20785 • Published Nov 25, 2025 • 189

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

Paper • 2603.15726 • Published 21 days ago • 184

authored a paper 27 days ago

MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier

Paper • 2603.03756 • Published Mar 4 • 89

authored a paper 28 days ago

Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

Paper • 2603.06569 • Published about 1 month ago • 118

authored 7 papers about 1 month ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22, 2025 • 91

What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

Paper • 2502.14914 • Published Feb 19, 2025

MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Paper • 2509.21268 • Published Sep 25, 2025 • 104

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published Dec 18, 2025 • 20

Chain of Ideas: Revolutionizing Research in Novel Idea Development with LLM Agents

Paper • 2410.13185 • Published Oct 17, 2024 • 5

Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition

Paper • 2407.05562 • Published Jul 8, 2024

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 46

authored a paper about 2 months ago

RynnBrain: Open Embodied Foundation Models

Paper • 2602.14979 • Published Feb 13 • 45