Lingxiao Ma

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Feb 27, 2024
Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei

BitNet: Scaling 1-bit Transformers for Large Language Models
Oct 17, 2023
Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement
Apr 08, 2023
Xiaonan Nie, Xupeng Miao, Zilong Wang, Zichao Yang, Jilong Xue, Lingxiao Ma, Gang Cao, Bin Cui

SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation
Jan 26, 2023
Ningxin Zheng, Huiqiang Jiang, Quanlu Zhang, Zhenhua Han, Yuqing Yang, Lingxiao Ma, Fan Yang, Lili Qiu, Mao Yang, Lidong Zhou

Dense-to-Sparse Gate for Mixture-of-Experts
Dec 29, 2021
Xiaonan Nie, Shijie Cao, Xupeng Miao, Lingxiao Ma, Jilong Xue, Youshan Miao, Zichao Yang, Zhi Yang, Bin Cui

Architectural Implications of Graph Neural Networks
Sep 02, 2020
Zhihui Zhang, Jingwen Leng, Lingxiao Ma, Youshan Miao, Chao Li, Minyi Guo

Towards Efficient Large-Scale Graph Neural Network Computing
Oct 19, 2018
Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, Yafei Dai
