-
Qiuchen-Wang/Qwen2.5-VL-7B-VRAG
Image-Text-to-Text • 8B • Updated • 1.82k • 8 -
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
Paper • 2502.18017 • Published • 21 -
VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
Paper • 2602.12735 • Published • 4 -
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
Paper • 2505.22019 • Published • 11
Collections
Collections including paper arxiv:2502.18017
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 34 -
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 27 -
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 126 -
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 22
-
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
Paper • 2402.15506 • Published • 17 -
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
Paper • 2404.03648 • Published • 29 -
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
Paper • 2405.19893 • Published • 33 -
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Paper • 2405.19888 • Published • 7
-
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models
Paper • 2502.14802 • Published • 13 -
A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models
Paper • 2501.13958 • Published • 1 -
RAGAR, Your Falsehood RADAR: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models
Paper • 2404.12065 • Published • 3 -
A Survey on Retrieval-Augmented Text Generation for Large Language Models
Paper • 2404.10981 • Published • 1
-
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 55 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 82 -
Table-GPT: Table-tuned GPT for Diverse Table Tasks
Paper • 2310.09263 • Published • 40 -
Context-Aware Meta-Learning
Paper • 2310.10971 • Published • 17
-
Specialized Language Models with Cheap Inference from Limited Domain Data
Paper • 2402.01093 • Published • 47 -
Attention Heads of Large Language Models: A Survey
Paper • 2409.03752 • Published • 92 -
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Paper • 2409.01704 • Published • 83 -
jina-embeddings-v3: Multilingual Embeddings With Task LoRA
Paper • 2409.10173 • Published • 34