Mike Zheng Shou

Multi-Modal Generative Embedding Model

May 29, 2024

LOVA3: Learning to Visual Question Answering, Asking and Assessment

May 23, 2024

Hallucination of Multimodal Large Language Models: A Survey

Apr 29, 2024

Learning Long-form Video Prior via Generative Pre-Training

Apr 24, 2024

RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification

Apr 23, 2024

Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models

Apr 03, 2024

Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation

Mar 19, 2024

DragAnything: Motion Control for Anything using Entity Representation

Mar 15, 2024

Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters

Feb 21, 2024

Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models

Feb 12, 2024