Statistical Mechanics of the Mixture of Experts

Kang, Kukjin, Oh, Jong-Hoon

Neural Information Processing Systems 

The mixture of experts [1, 2] is a well known example which implements the philosophy of divide-and-conquer elegantly. Whereas this model are gaining more popularity in various applications, there have been little efforts to evaluate generalization capability of these modular approaches theoretically. Here we present the first analytic study of generalization in the mixture of experts from the statistical 184 K. Kang and 1. Oh physics perspective. Use of statistical mechanics formulation have been focused on the study of feedforward neural network architectures close to the multilayer perceptron[5, 6], together with the VC theory[8]. We expect that the statistical mechanics approach can also be effectively used to evaluate more advanced architectures including mixture models.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found