Evaluation of Adaptive Mixtures of Competing Experts

Nowlan, Steven J., Hinton, Geoffrey E.

Neural Information Processing Systems 

We compare the performance of the modular architecture, composed of competing expert networks, suggested by Jacobs, Jordan, Nowlan and Hinton (1991) to the performance of a single back-propagation network on a complex, but low-dimensional, vowel recognition task. Simulations reveal that this system is capable of uncovering interesting decompositions in a complex task. The type of decomposition is strongly influenced by the nature of the input to the gating network that decides which expert to use for each case. The modular architecture also exhibits consistently better generalization on many variations of the task. 1 Introduction If back-propagation is used to train a single, multilayer network to perform different subtasks on different occasions, there will generally be strong interference effects which lead to slow learning and poor generalization. If we know in advance that a set of training cases may be naturally divideJ into subsets that correspond to distinct subtasks, interference can be reduced by using a system (see Figure 1) composed of several different "expert" networks plus a gating network that decides which of the experts should be used for each training case. Systems of this type have been suggested by a number of authors (Hampshire and Waibel, 1989; Jacobs, Jordan and Barto, 1990; Jacobs et al., 1991) (see also the paper by Jacobs and Jordan in this volume (1991».

Similar Docs  Excel Report  more

TitleSimilaritySource
None found