
Towards Pareto Optimal Throughput in Small Language Model Serving

Apr 04, 2024
Pol G. Recasens, Yue Zhu, Chen Wang, Eun Kyung Lee, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Ll. Berral


