Mixture of Many Zero-Compute Experts: A High-Rate Quantization Theory Perspective
This paper uses classical high-rate quantization theory to provide new insights into mixture-of-experts (MoE) models for regression tasks. Our MoE is defined by a segmentation of the input space into regions, each with a single-parameter expert that acts as a constant predictor with zero compute at inference. Motivated by high-rate quantization theory assumptions, we assume that the number of experts is sufficiently large to make their input-space regions very small. This lets us study the approximation error of our MoE model class: (i) for one-dimensional inputs, we formulate the test error and its minimizing segmentation and experts; (ii) for multidimensional inputs, we formulate an upper bound on the test error and study its minimization. Moreover, we consider the learning of the expert parameters from a training dataset, given an input-space segmentation, and formulate their statistical learning properties. This leads us to show, theoretically and empirically, how the tradeoff between approximation and estimation errors in MoE learning depends on the number of experts.
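To make the setup concrete, here is a minimal sketch (not the authors' code) of such a zero-compute MoE for 1-D regression: the input space is segmented into K equal-width regions, and each region's expert is a single constant fit as the mean of the training targets falling in that region (the least-squares optimal constant). The uniform segmentation, the toy sine target, and the function names `fit_experts` and `predict` are all illustrative assumptions; the loop over K only gestures at the approximation/estimation tradeoff the abstract describes.

```python
import numpy as np

# Illustrative sketch, not the paper's implementation: a "zero-compute"
# MoE for 1-D regression. Each expert is one constant per region, so
# inference is just a region lookup followed by a table read.

def fit_experts(x_train, y_train, edges):
    """One constant parameter per region (the regional target mean);
    empty regions fall back to the global mean."""
    K = len(edges) - 1
    experts = np.full(K, y_train.mean())
    idx = np.clip(np.searchsorted(edges, x_train, side="right") - 1, 0, K - 1)
    for k in range(K):
        mask = idx == k
        if mask.any():
            experts[k] = y_train[mask].mean()
    return experts

def predict(x, edges, experts):
    """Zero-compute inference: locate the region, return its constant."""
    K = len(experts)
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, K - 1)
    return experts[idx]

# Toy demonstration: test error first falls with K (approximation error
# shrinks) and then rises (estimation error dominates as regions empty out).
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, 2000)
y_train = np.sin(2 * np.pi * x_train) + 0.3 * rng.standard_normal(x_train.size)
x_test = rng.uniform(0.0, 1.0, 5000)
y_test_clean = np.sin(2 * np.pi * x_test)

for K in (4, 32, 256, 2048):
    edges = np.linspace(0.0, 1.0, K + 1)
    experts = fit_experts(x_train, y_train, edges)
    mse = np.mean((predict(x_test, edges, experts) - y_test_clean) ** 2)
    print(f"K={K:5d}  test MSE={mse:.4f}")
```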
arXiv.org Artificial Intelligence
Oct-6-2025
- Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report (0.40)
- Industry:
- Education > Educational Setting > Online (0.34)
- Technology: