Efficient Memory Management for Large Language Model Serving with PagedAttention

Sep 12, 2023
Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica

