Jiaqi Wang

CRAG -- Comprehensive RAG Benchmark

Jun 07, 2024
Via arXiv

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Jun 06, 2024
Via arXiv

Bootstrap3D: Improving 3D Content Creation with Synthetic Data

May 31, 2024
Via arXiv

Streaming Long Video Understanding with Large Language Models

May 25, 2024
Via arXiv

ReasonPix2Pix: Instruction Reasoning Dataset for Advanced Image Editing

May 18, 2024
Via arXiv

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Apr 29, 2024
Via arXiv

Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials

Apr 29, 2024
Via arXiv

LMFNet: An Efficient Multimodal Fusion Approach for Semantic Segmentation in High-Resolution Remote Sensing

Apr 21, 2024
Via arXiv

SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

Apr 20, 2024
Via arXiv

Unified Scene Representation and Reconstruction for 3D Large Language Models

Apr 19, 2024
Via arXiv