Tianle Cai

SnapKV: LLM Knows What You are Looking for Before Generation

Apr 22, 2024
Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

Apr 11, 2024
Yikang Shen, Zhen Guo, Tianle Cai, Zengyi Qin

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Mar 07, 2024
Muyang Li, Tianle Cai, Jiaxin Cao, Qinsheng Zhang, Han Cai, Junjie Bai, Yangqing Jia, Ming-Yu Liu, Kai Li, Song Han

Accelerating Greedy Coordinate Gradient via Probe Sampling

Mar 02, 2024
Yiran Zhao, Wenyue Zheng, Tianle Cai, Xuan Long Do, Kenji Kawaguchi, Anirudh Goyal, Michael Shieh

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Feb 28, 2024
James Liu, Guangxuan Xiao, Kai Li, Jason D. Lee, Song Han, Tri Dao, Tianle Cai

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Jan 19, 2024
Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, Deming Chen, Tri Dao

REST: Retrieval-Based Speculative Decoding

Nov 14, 2023
Zhenyu He, Zexuan Zhong, Tianle Cai, Jason D. Lee, Di He

Scaling In-Context Demonstrations with Structured Attention

Jul 05, 2023
Tianle Cai, Kaixuan Huang, Jason D. Lee, Mengdi Wang

Reward Collapse in Aligning Large Language Models

May 28, 2023
Ziang Song, Tianle Cai, Jason D. Lee, Weijie J. Su

Large Language Models as Tool Makers

May 26, 2023
Tianle Cai, Xuezhi Wang, Tengyu Ma, Xinyun Chen, Denny Zhou
