Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts

Neural Information Processing Systems 

A mixture of experts aggregates multiple sub-models, called experts, through a gating network. The experts can be formulated as neural networks, each specializing in a different aspect of the data.
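To make the aggregation concrete, the following is a minimal sketch, not the paper's formulation. It assumes a one-dimensional input, affine gate scores, and affine maps standing in for neural-network experts. All names (`softmax_gate`, `sigmoid_gate`, `linear_expert`, `moe_output`) and all parameter values are hypothetical and chosen for illustration only. The two gates differ in one respect: softmax weights compete and sum to 1, while sigmoid weights are computed independently for each expert.

```python
import math


def softmax_gate(scores):
    """Normalize scores into positive weights that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def _sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    z = math.exp(s)
    return z / (1.0 + z)


def sigmoid_gate(scores):
    """Squash each score independently into [0, 1]; weights need not sum to 1."""
    return [_sigmoid(s) for s in scores]


def linear_expert(weight, bias):
    """Hypothetical expert: an affine map standing in for a neural network."""
    return lambda x: weight * x + bias


def moe_output(x, gate_params, experts, gate_fn):
    """Weighted sum of expert outputs; weights come from a gate over affine scores."""
    scores = [a * x + b for a, b in gate_params]
    weights = gate_fn(scores)
    return sum(w * f(x) for w, f in zip(weights, experts))


# Illustrative values only (assumptions, not taken from the paper).
experts = [linear_expert(1.0, 0.0), linear_expert(-2.0, 1.0)]
gate_params = [(0.5, 0.0), (-0.5, 0.0)]

y_softmax = moe_output(2.0, gate_params, experts, softmax_gate)
y_sigmoid = moe_output(2.0, gate_params, experts, sigmoid_gate)
```

With the same experts and gate parameters, the two outputs generally differ, because the sigmoid gate does not force the experts to share a fixed total weight.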
