Yongfei Liu

ViTAR: Vision Transformer with Any Resolution

Mar 28, 2024
Qihang Fan, Quanzeng You, Xiaotian Han, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding

Mar 03, 2024
Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang

Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning

Jan 18, 2024
Yiqi Wang, Wentao Chen, Xiaotian Han, Xudong Lin, Haiteng Zhao, Yongfei Liu, Bohan Zhai, Jianbo Yuan, Quanzeng You, Hongxia Yang

InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

Dec 04, 2023
Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang

Improving In-Context Learning in Diffusion Models with Visual Context-Modulated Prompts

Dec 03, 2023
Tianqi Chen, Yongfei Liu, Zhendong Wang, Jianbo Yuan, Quanzeng You, Hongxia Yang, Mingyuan Zhou

Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis

Nov 28, 2023
Xiaohui Chen, Yongfei Liu, Yingxiang Yang, Jianbo Yuan, Quanzeng You, Li-Ping Liu, Hongxia Yang

Grounded Image Text Matching with Mismatched Relation Reasoning

Aug 04, 2023
Yu Wu, Yana Wei, Haozhe Wang, Yongfei Liu, Sibei Yang, Xuming He

HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models

Mar 29, 2023
Shan Ning, Longtian Qiu, Yongfei Liu, Xuming He

Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning

Mar 02, 2023
Bo Wan, Yongfei Liu, Desen Zhou, Tinne Tuytelaars, Xuming He

VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers

Mar 30, 2022
Estelle Aflalo, Meng Du, Shao-Yen Tseng, Yongfei Liu, Chenfei Wu, Nan Duan, Vasudev Lal
