A Unified Multimodal Data Quality Classifier that generates quality scores for both image-text caption data and interleaved document data
Weizhi Wang
weizhiwang
models 12
weizhiwang/UniFilter-Qwen3-0.6B
Image-Text-to-Text • 1B • Updated • 2
weizhiwang/UniFilter-Qwen2.5-1.5B
Image-Text-to-Text • 2B • Updated • 4
weizhiwang/Open-Qwen2VL
Image-Text-to-Text • Updated • 37 • 21
weizhiwang/mlm-filter-qwen2.5-1.5b-gpt4o
Text Generation • 2B • Updated • 5 • 3
weizhiwang/Open-Qwen2VL-base
Image-Text-to-Text • Updated • 10
weizhiwang/unifilter_mllm_pretrain_checkpoints
Updated
weizhiwang/unifilter_mllm_sft_checkpoints
Updated
weizhiwang/LLaVA-Video-Llama-3.1-8B
8B • Updated • 57 • 5
weizhiwang/llava-video-llama-3.1-8b-siglip-so-384-aapool-144-projector
Updated • 3
weizhiwang/mlm-filter-llava-13b-gpt4v
Text Generation • Updated • 12 • 6
datasets 11
weizhiwang/unifilter_train_data
Updated • 69
weizhiwang/OBELICS_HQ_5M_UniFilter
Viewer • Updated • 5.06M • 858
weizhiwang/cnsi-chatbot
Updated • 53
weizhiwang/mlm_filter_instructions
Updated • 21 • 5
weizhiwang/agent_eval
Viewer • Updated • 851 • 90
weizhiwang/Open-Qwen2VL-Data
Viewer • Updated • 13M • 3.17k • 24
weizhiwang/Open-Qwen2VL-Data-Interleaved
Viewer • Updated • 23.3M • 1.09k • 3
weizhiwang/mmc4_fewer_faces
Updated • 6
weizhiwang/datacomp-hq
Updated • 81
weizhiwang/llava_v15_instruction_images
Preview • Updated • 151 • 6