Alireza Sanaee

Queen Mary University of London

IPA: Inference Pipeline Adaptation to Achieve High Accuracy and Cost-Efficiency

Aug 24, 2023
Saeid Ghafouri, Kamran Razavi, Mehran Salmani, Alireza Sanaee, Tania Lorido-Botran, Lin Wang, Joseph Doyle, Pooyan Jamshidi

Via arXiv

Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems

Apr 24, 2023
Mehran Salmani, Saeid Ghafouri, Alireza Sanaee, Kamran Razavi, Max Mühlhäuser, Joseph Doyle, Pooyan Jamshidi, Mohsen Sharifi

Via arXiv