Yuan Gao

VideoTetris: Towards Compositional Text-to-Video Generation

Jun 06, 2024
Via arXiv

Near Optimal Decentralized Optimization with Compression and Momentum Tracking

May 30, 2024
Via arXiv

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

May 24, 2024
Via arXiv

Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost

May 09, 2024
Via arXiv

Dual Relation Mining Network for Zero-Shot Learning

May 06, 2024
Via arXiv

Vision-Language Model-based Physical Reasoning for Robot Liquid Perception

Apr 10, 2024
Via arXiv

Anchor-based Robust Finetuning of Vision-Language Models

Apr 09, 2024
Via arXiv

Convergence of Continuous Normalizing Flows for Learning Probability Distributions

Mar 31, 2024
Via arXiv

Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation

Mar 28, 2024
Via arXiv

MEDBind: Unifying Language and Multimodal Medical Data Embeddings

Mar 20, 2024
Via arXiv