Zeming Wei

Boosting Jailbreak Attack with Momentum

May 02, 2024
Yihao Zhang, Zeming Wei

Via arXiv

Exploring the Robustness of In-Context Learning with Noisy Labels

May 01, 2024
Chen Cheng, Xinzhi Yu, Haodong Wen, Jingsong Sun, Guanzhang Yue, Yihao Zhang, Zeming Wei

Via arXiv

Towards General Conceptual Model Editing via Adversarial Representation Engineering

Apr 21, 2024
Yihao Zhang, Zeming Wei, Jun Sun, Meng Sun

Via arXiv

On the Duality Between Sharpness-Aware Minimization and Adversarial Training

Feb 23, 2024
Yihao Zhang, Hangzhou He, Jingyu Zhu, Huanran Chen, Yifei Wang, Zeming Wei

Via arXiv

Studious Bob Fight Back Against Jailbreaking via Prompt Adversarial Tuning

Feb 09, 2024
Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang

Via arXiv

Jatmo: Prompt Injection Defense by Task-Specific Finetuning

Jan 08, 2024
Julien Piet, Maha Alrashed, Chawin Sitawarin, Sizhe Chen, Zeming Wei, Elizabeth Sun, Basel Alomair, David Wagner

Via arXiv

Architecture Matters: Uncovering Implicit Mechanisms in Graph Contrastive Learning

Nov 05, 2023
Xiaojun Guo, Yifei Wang, Zeming Wei, Yisen Wang

Via arXiv

Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations

Oct 10, 2023
Zeming Wei, Yifei Wang, Yisen Wang

Via arXiv

Weighted Automata Extraction and Explanation of Recurrent Neural Networks for Natural Language Tasks

Jun 24, 2023
Zeming Wei, Xiyue Zhang, Yihao Zhang, Meng Sun

Via arXiv

On the Relation between Sharpness-Aware Minimization and Adversarial Robustness

May 09, 2023
Zeming Wei, Jingyu Zhu, Yihao Zhang

Via arXiv