Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Apr 02, 2024
David Raposo, Sam Ritter, Blake Richards, Timothy Lillicrap, Peter Conway Humphreys, Adam Santoro
