Huy Nguyen

Statistical Advantages of Perturbing Cosine Router in Sparse Mixture of Experts

May 23, 2024
Via arXiv

Mixture of Experts Meets Prompt-Based Continual Learning

May 23, 2024
Via arXiv

Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts

May 22, 2024
Via arXiv

On Parameter Estimation in Deviated Gaussian Mixture of Experts

Feb 07, 2024
Via arXiv

FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion

Feb 05, 2024
Via arXiv

On Least Squares Estimation in Softmax Gating Mixture of Experts

Feb 05, 2024
Via arXiv

CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition

Feb 04, 2024
Via arXiv

Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts?

Jan 25, 2024
Via arXiv

AG-ReID.v2: Bridging Aerial and Ground Views for Person Re-identification

Jan 05, 2024
Via arXiv

A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts

Oct 22, 2023
Via arXiv