Self-Play Preference Optimization for Language Model Alignment

May 01, 2024
Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu

View paper on arXiv
