Difei Gao

Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces

Jan 24, 2024
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou

Via arXiv

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

Jan 01, 2024
Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou

Via arXiv

ViT-Lens-2: Gateway to Omni-modal Intelligence

Nov 27, 2023
Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou

Via arXiv

CVPR 2023 Text Guided Video Editing Competition

Oct 24, 2023
Jay Zhangjie Wu, Xiuyu Li, Difei Gao, Zhen Dong, Jinbin Bai, Aishani Singh, Xiaoyu Xiang, Youzeng Li, Zuwei Huang, Yuanxi Sun, Rui He, Feng Hu, Junhua Hu, Hai Huang, Hanyu Zhu, Xu Cheng, Jie Tang, Mike Zheng Shou, Kurt Keutzer, Forrest Iandola

Via arXiv

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Sep 27, 2023
David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou

Via arXiv

Recap: Detecting Deepfake Video with Unpredictable Tampered Traces via Recovering Faces and Mapping Recovered Faces

Aug 19, 2023
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou

Via arXiv

UniVTG: Towards Unified Video-Language Temporal Grounding

Aug 18, 2023
Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou

Via arXiv

AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn

Jun 28, 2023
Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou

Via arXiv

GroundNLQ @ Ego4D Natural Language Queries Challenge 2023

Jun 27, 2023
Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou

Via arXiv

Affordance Grounding from Demonstration Video to Target Image

Mar 26, 2023
Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou

Via arXiv