Zhaowei Zhang

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024
Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger

Via arXiv

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects

Mar 01, 2024
Zhaowei Zhang, Fengshuo Bai, Mingzhi Wang, Haoyang Ye, Chengdong Ma, Yaodong Yang

Via arXiv

CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents

Jan 19, 2024
Siyuan Qi, Shuo Chen, Yexin Li, Xiangyu Kong, Junqi Wang, Bangcheng Yang, Pring Wong, Yifan Zhong, Xiaoyuan Zhang, Zhaowei Zhang, Nian Liu, Wei Wang, Yaodong Yang, Song-Chun Zhu

Via arXiv

AI Alignment: A Comprehensive Survey

Nov 01, 2023
Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao

Via arXiv

Measuring Value Understanding in Language Models through Discriminator-Critique Gap

Oct 19, 2023
Zhaowei Zhang, Fengshuo Bai, Jun Gao, Yaodong Yang

Via arXiv

ProAgent: Building Proactive Cooperative AI with Large Language Models

Aug 28, 2023
Ceyao Zhang, Kaijie Yang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, Yaodong Yang

Via arXiv

Heterogeneous Value Evaluation for Large Language Models

Jun 01, 2023
Zhaowei Zhang, Nian Liu, Siyuan Qi, Ceyao Zhang, Ziqi Rong, Song-Chun Zhu, Shuguang Cui, Yaodong Yang

Via arXiv

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning

Apr 15, 2023
Sirui Chen, Zhaowei Zhang, Yali Du, Yaodong Yang

Via arXiv

Contextual Transformer for Offline Meta Reinforcement Learning

Nov 15, 2022
Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang

Via arXiv

Continuous Decomposition of Granularity for Neural Paraphrase Generation

Sep 16, 2022
Xiaodong Gu, Zhaowei Zhang, Sang-Woo Lee, Kang Min Yoo, Jung-Woo Ha

Via arXiv