Sports Video Understanding Benchmarks
AI & ML interests
Computer Vision; Video Understanding; Action Recognition
Recent Activity
Papers
SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation
SAM 2++: Tracking Anything at Any Granularity
- MCG-NJU/LongVPO-Stage1-InternVL3-8B
  Video-Text-to-Text • 8B • Updated • 29
- MCG-NJU/LongVPO-Stage2-InternVL3-8B
  Video-Text-to-Text • 8B • Updated • 35
- MCG-NJU/LongVPO-Training-Data
  Viewer • Updated • 14.5k • 99
- LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
  Paper • 2602.02341 • Published • 1
- MCG-NJU/SteadyDancer-14B
  Image-to-Video • Updated • 473 • 69
- SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation
  Paper • 2511.19320 • Published • 43
- MCG-NJU/X-Dance
  Viewer • Updated • 36 • 411 • 18
- MCG-NJU/SteadyDancer-GGUF
  Image-to-Video • 16B • Updated • 661 • 24
VideoMAE Pre-trained Models
- VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
  Paper • 2203.12602 • Published • 3
- MCG-NJU/videomae-base
  Video Classification • 94.2M • Updated • 55.4k • 53
- MCG-NJU/videomae-base-finetuned-kinetics
  Video Classification • 86.5M • Updated • 98.5k • 48
- MCG-NJU/videomae-base-finetuned-ssv2
  Video Classification • Updated • 4.75k • 7
Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning
Learning Human Skill Generators at Key-Step Levels
CaReBench data, CaRe models, and all the contrastively trained MLLMs (including InternVL2, MiniCPM-V 2.6, LLaVA NeXT Video, Qwen2-VL, and Tarsier).