Christopher Potts

ReFT: Representation Finetuning for Language Models

Apr 08, 2024
Zhengxuan Wu, Aryaman Arora, Zheng Wang, Atticus Geiger, Dan Jurafsky, Christopher D. Manning, Christopher Potts

Via arXiv

Mapping the Increasing Use of LLMs in Scientific Papers

Apr 01, 2024
Weixin Liang, Yaohui Zhang, Zhengxuan Wu, Haley Lepp, Wenlong Ji, Xuandong Zhao, Hancheng Cao, Sheng Liu, Siyu He, Zhi Huang, Diyi Yang, Christopher Potts, Christopher D. Manning, James Y. Zou

Via arXiv

pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

Mar 12, 2024
Zhengxuan Wu, Atticus Geiger, Aryaman Arora, Jing Huang, Zheng Wang, Noah D. Goodman, Christopher D. Manning, Christopher Potts

Via arXiv

RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations

Feb 27, 2024
Jing Huang, Zhengxuan Wu, Christopher Potts, Mor Geva, Atticus Geiger

Via arXiv

CommVQA: Situating Visual Question Answering in Communicative Contexts

Feb 22, 2024
Nandita Shankar Naik, Christopher Potts, Elisa Kreiss

Via arXiv

CausalGym: Benchmarking causal interpretability methods on linguistic tasks

Feb 19, 2024
Aryaman Arora, Dan Jurafsky, Christopher Potts

Via arXiv

A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments

Jan 23, 2024
Zhengxuan Wu, Atticus Geiger, Jing Huang, Aryaman Arora, Thomas Icard, Christopher Potts, Noah D. Goodman

Via arXiv

In-Context Learning for Extreme Multi-Label Classification

Jan 22, 2024
Karel D'Oosterlinck, Omar Khattab, François Remy, Thomas Demeester, Chris Develder, Christopher Potts

Via arXiv

Mission: Impossible Language Models

Jan 12, 2024
Julie Kallini, Isabel Papadimitriou, Richard Futrell, Kyle Mahowald, Christopher Potts

Via arXiv

I am a Strange Dataset: Metalinguistic Tests for Language Models

Jan 10, 2024
Tristan Thrush, Jared Moore, Miguel Monares, Christopher Potts, Douwe Kiela

Via arXiv