MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts
Neural Information Processing Systems
We theoretically prove and numerically demonstrate that MomentumSMoE is more stable and robust than SMoE.
Nov-15-2025, 14:06:53 GMT
- Country:
- Asia
- Middle East > Jordan (0.04)
- Russia (0.04)
- Singapore (0.04)
- Europe
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Russia (0.04)
- North America > United States
- Minnesota > Hennepin County > Minneapolis (0.14)
- South America > Colombia
- Meta Department > Villavicencio (0.04)
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Government (0.67)
- Information Technology (0.67)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (1.00)
- Statistical Learning (0.94)
- Natural Language > Chatbot (0.68)
- Representation & Reasoning > Optimization (1.00)
- Vision (1.00)
- Communications (0.93)