Fake Alignment: Are LLMs Really Aligned Well?

Nov 14, 2023
Yixu Wang, Yan Teng, Kexin Huang, Chengqi Lyu, Songyang Zhang, Wenwei Zhang, Xingjun Ma, Yu-Gang Jiang, Yu Qiao, Yingchun Wang
