Wenyi Yu

M$^3$AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset

Mar 21, 2024
Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Oct 20, 2023
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

Oct 10, 2023
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Connecting Speech Encoder and Large Language Model for ASR

Sep 26, 2023
Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang
