An Alternative Model for Mixtures of Experts
Lei Xu, Michael I. Jordan, Geoffrey E. Hinton
Neural Information Processing Systems
We propose an alternative model for mixtures of experts which uses a different parametric form for the gating network. The modified model is trained by the EM algorithm. In comparison with earlier models (trained by either EM or gradient ascent), there is no need to select a learning stepsize. We report simulation experiments which show that the new architecture yields faster convergence. We also apply the new model to two problem domains: piecewise nonlinear function approximation and the combination of multiple previously trained classifiers.

1 INTRODUCTION

For the mixtures of experts architecture (Jacobs, Jordan, Nowlan & Hinton, 1991), the EM algorithm decouples the learning process in a manner that fits well with the modular structure and yields a considerably improved rate of convergence (Jordan & Jacobs, 1994).
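The text above does not spell out the alternative parametric form of the gating network, so the following is only an illustrative sketch under stated assumptions. It assumes a gate built from one-dimensional Gaussian densities normalized across experts, g_j(x) = a_j p(x | j) / sum_i a_i p(x | i), and it assumes the E-step posteriors are already available. The function names `gaussian_pdf`, `gate`, and `update_gate` are hypothetical. The point it illustrates is the claim that no learning stepsize is needed: with a gate of this kind, the gate parameters can be re-estimated in closed form from weighted statistics.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gate(x, priors, means, variances):
    """Assumed gate: g_j(x) = a_j p(x | j) / sum_i a_i p(x | i)."""
    scores = [a * gaussian_pdf(x, m, v)
              for a, m, v in zip(priors, means, variances)]
    total = sum(scores)
    return [s / total for s in scores]


def update_gate(xs, posteriors):
    """Closed-form re-estimate of the gate parameters (no step size).

    posteriors[n][j] is the responsibility of expert j for input xs[n];
    here it is taken as given (assumed to come from an E-step).
    """
    n_experts = len(posteriors[0])
    priors, means, variances = [], [], []
    for j in range(n_experts):
        weight = sum(h[j] for h in posteriors)
        mean = sum(h[j] * x for h, x in zip(posteriors, xs)) / weight
        var = sum(h[j] * (x - mean) ** 2 for h, x in zip(posteriors, xs)) / weight
        priors.append(weight / len(xs))
        means.append(mean)
        variances.append(max(var, 1e-6))  # floor to avoid a degenerate variance
    return priors, means, variances


# Toy data: two well-separated groups with hard responsibilities.
xs = [-2.0, -1.0, 1.0, 2.0]
posteriors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
priors, means, variances = update_gate(xs, posteriors)
weights_at_zero = gate(0.0, priors, means, variances)
```

In this toy setup each update is a weighted average, so there is no stepsize to tune. A full mixture-of-experts trainer would also need the expert models and an E-step that combines the gate with each expert's likelihood, both of which are omitted here.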
Dec-31-1995