Bo Zhao

Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking

Jun 06, 2024
Via arXiv

VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval

Jun 06, 2024
Via arXiv

MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding

Jun 06, 2024
Via arXiv

The SkatingVerse Workshop & Challenge: Methods and Results

May 27, 2024
Via arXiv

VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

May 22, 2024
Via arXiv

Efficient Multimodal Large Language Models: A Survey

May 17, 2024
Via arXiv

Understanding the Difficulty of Solving Cauchy Problems with PINNs

May 04, 2024
Via arXiv

Advances and Open Challenges in Federated Learning with Foundation Models

Apr 29, 2024
Via arXiv

Tele-FLM Technical Report

Apr 25, 2024
Via arXiv

M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models

Mar 31, 2024
Via arXiv