Shang Yang

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

May 07, 2024
Yujun Lin, Haotian Tang, Shang Yang, Zhekai Zhang, Guangxuan Xiao, Chuang Gan, Song Han

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Jun 01, 2023
Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Xingyu Dang, Song Han

FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer

Jan 20, 2023
Zhijian Liu, Xinyu Yang, Haotian Tang, Shang Yang, Song Han
