Siyuan Huang

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Apr 16, 2024
Peiyuan Zhi, Zhiyuan Zhang, Muzhi Han, Zeyu Zhang, Zhitian Li, Ziyuan Jiao, Baoxiong Jia, Siyuan Huang

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI

Apr 15, 2024
Yandan Yang, Baoxiong Jia, Peiyuan Zhi, Siyuan Huang

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

Apr 01, 2024
Weifeng Lin, Xinyu Wei, Ruichuan An, Peng Gao, Bocheng Zou, Yulin Luo, Siyuan Huang, Shanghang Zhang, Hongsheng Li

Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance

Mar 26, 2024
Zan Wang, Yixin Chen, Baoxiong Jia, Puhao Li, Jinlu Zhang, Jingze Zhang, Tengyu Liu, Yixin Zhu, Wei Liang, Siyuan Huang

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Mar 19, 2024
Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang

ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models

Mar 17, 2024
Siyuan Huang, Iaroslav Ponomarenko, Zhengkai Jiang, Xiaoqi Li, Xiaobin Hu, Peng Gao, Hongsheng Li, Hao Dong

Scaling Up Dynamic Human-Scene Interaction Modeling

Mar 13, 2024
Nan Jiang, Zhiyuan Zhang, Hongjie Li, Xiaoxuan Ma, Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, Siyuan Huang

Graph Parsing Networks

Feb 22, 2024
Yunchong Song, Siyuan Huang, Xinbing Wang, Chenghu Zhou, Zhouhan Lin

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models

Feb 22, 2024
Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models

Feb 08, 2024
Peng Gao, Renrui Zhang, Chris Liu, Longtian Qiu, Siyuan Huang, Weifeng Lin, Shitian Zhao, Shijie Geng, Ziyi Lin, Peng Jin, Kaipeng Zhang, Wenqi Shao, Chao Xu, Conghui He, Junjun He, Hao Shao, Pan Lu, Hongsheng Li, Yu Qiao
