Yixiao Ge

Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

May 13, 2024

SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

May 07, 2024

SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension

Apr 25, 2024

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

Apr 22, 2024

ST-LLM: Large Language Models Are Effective Temporal Learners

Mar 30, 2024

YOLO-World: Real-Time Open-Vocabulary Object Detection

Feb 02, 2024

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

Jan 25, 2024

Supervised Fine-tuning in turn Improves Visual Foundation Models

Jan 18, 2024

Towards A Better Metric for Text-to-Video Generation

Jan 15, 2024

LLaMA Pro: Progressive LLaMA with Block Expansion

Jan 04, 2024