Alert button
Picture for Seliem El-Sayed

Seliem El-Sayed

Alert button

A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI

Add code
Bookmark button
Alert button
Apr 23, 2024
Seliem El-Sayed, Canfer Akbulut, Amanda McCroskery, Geoff Keeling, Zachary Kenton, Zaria Jalan, Nahema Marchal, Arianna Manzini, Toby Shevlane, Shannon Vallor, Daniel Susser, Matija Franklin, Sophie Bridgers, Harry Law, Matthew Rahtz, Murray Shanahan, Michael Henry Tessler, Arthur Douillard, Tom Everitt, Sasha Brown

Viaarxiv icon

Evaluating Frontier Models for Dangerous Capabilities

Add code
Bookmark button
Alert button
Mar 20, 2024
Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Gregoire Deletang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca Dragan, Rohin Shah, Allan Dafoe, Toby Shevlane

Figure 1 for Evaluating Frontier Models for Dangerous Capabilities
Figure 2 for Evaluating Frontier Models for Dangerous Capabilities
Figure 3 for Evaluating Frontier Models for Dangerous Capabilities
Figure 4 for Evaluating Frontier Models for Dangerous Capabilities
Viaarxiv icon