Tong Lu

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Apr 29, 2024
Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang

Via arXiv

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Mar 14, 2024
Guo Chen, Yifei Huang, Jilan Xu, Baoqi Pei, Zhe Chen, Zhiqi Li, Jiahao Wang, Kunchang Li, Tong Lu, Limin Wang

Via arXiv

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Mar 07, 2024
Yuchen Duan, Weiyun Wang, Zhe Chen, Xizhou Zhu, Lewei Lu, Tong Lu, Yu Qiao, Hongsheng Li, Jifeng Dai, Wenhai Wang

Via arXiv

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal

Feb 04, 2024
Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, Ming-Hsuan Yang

Via arXiv

MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer

Jan 18, 2024
Changyao Tian, Xizhou Zhu, Yuwen Xiong, Weiyun Wang, Zhe Chen, Wenhai Wang, Yuntao Chen, Lewei Lu, Tong Lu, Jie Zhou, Hongsheng Li, Yu Qiao, Jifeng Dai

Via arXiv

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Jan 15, 2024
Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai

Via arXiv

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

Jan 11, 2024
Yuwen Xiong, Zhiqi Li, Yuntao Chen, Feng Wang, Xizhou Zhu, Jiapeng Luo, Wenhai Wang, Tong Lu, Hongsheng Li, Yu Qiao, Lewei Lu, Jie Zhou, Jifeng Dai

Via arXiv

CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers

Jan 03, 2024
Yi Rong, Haoran Zhou, Lixin Yuan, Cheng Mei, Jiahao Wang, Tong Lu

Via arXiv

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

Dec 05, 2023
Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, Jose M. Alvarez

Via arXiv

Deep Video Restoration for Under-Display Camera

Sep 09, 2023
Xuanxi Chen, Tao Wang, Ziqian Shao, Kaihao Zhang, Wenhan Luo, Tong Lu, Zikun Liu, Tae-Kyun Kim, Hongdong Li

Via arXiv