Long-Tail Theory under Gaussian Mixtures

Bolatov, Arman; Tezekbayev, Maxat; Melnykov, Igor; Pak, Artur; Nikoulina, Vassilina; Assylbekov, Zhenisbek

Jul-24-2023–arXiv.org Artificial Intelligence

We propose a simple Gaussian mixture model for data generation that is consistent with Feldman's long-tail theory (2020). We demonstrate that, in the proposed model, a linear classifier cannot reduce the generalization error below a certain level, whereas a nonlinear classifier with memorization capacity can. This confirms that for long-tailed distributions, rare training examples must be taken into account for optimal generalization to new data. Finally, we show that the performance gap between linear and nonlinear models narrows as the tail of the subpopulation frequency distribution becomes shorter, as confirmed by experiments on synthetic and real data.
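The abstract's claim can be illustrated with a toy simulation. The sketch below is not the authors' model, which this page does not reproduce. It rests on several assumptions: subpopulation centers placed on a unit circle with alternating class labels, Zipf-like subpopulation frequencies controlled by a hypothetical `zipf_exponent` parameter (a larger exponent is treated here as a shorter tail), a least-squares linear classifier, and a 1-nearest-neighbor classifier standing in for a nonlinear model that memorizes its training data. All function names are illustrative.

```python
import numpy as np


def make_subpopulations(n_sub, zipf_exponent):
    """Centers on the unit circle, alternating labels, Zipf-like frequencies (assumed setup)."""
    angles = 2.0 * np.pi * np.arange(n_sub) / n_sub
    centers = np.column_stack([np.cos(angles), np.sin(angles)])
    labels = np.where(np.arange(n_sub) % 2 == 0, 1.0, -1.0)
    freqs = 1.0 / np.arange(1, n_sub + 1) ** float(zipf_exponent)
    return centers, labels, freqs / freqs.sum()


def sample(n, centers, labels, freqs, sigma, rng):
    """Pick a subpopulation by frequency, then add isotropic Gaussian noise."""
    idx = rng.choice(len(centers), size=n, p=freqs)
    x = centers[idx] + sigma * rng.standard_normal((n, centers.shape[1]))
    return x, labels[idx]


def linear_error(x_tr, y_tr, x_te, y_te):
    """Test error of a least-squares linear classifier with a bias term."""
    a_tr = np.hstack([x_tr, np.ones((len(x_tr), 1))])
    a_te = np.hstack([x_te, np.ones((len(x_te), 1))])
    w, *_ = np.linalg.lstsq(a_tr, y_tr, rcond=None)
    return float(np.mean(np.sign(a_te @ w) != y_te))


def one_nn_error(x_tr, y_tr, x_te, y_te):
    """Test error of 1-nearest-neighbor, which memorizes every training point."""
    d2 = ((x_te[:, None, :] - x_tr[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(y_tr[d2.argmin(axis=1)] != y_te))


def run(zipf_exponent, n_sub=20, n_train=1000, n_test=1000, sigma=0.05, seed=0):
    """Return the test error of both classifiers for one tail shape."""
    rng = np.random.default_rng(seed)
    centers, labels, freqs = make_subpopulations(n_sub, zipf_exponent)
    x_tr, y_tr = sample(n_train, centers, labels, freqs, sigma, rng)
    x_te, y_te = sample(n_test, centers, labels, freqs, sigma, rng)
    return {
        "linear": linear_error(x_tr, y_tr, x_te, y_te),
        "nn": one_nn_error(x_tr, y_tr, x_te, y_te),
    }


if __name__ == "__main__":
    for exponent in (1.0, 4.0):  # heavier tail vs. lighter tail (assumed values)
        print(exponent, run(exponent))
```

In this toy setup, calling `run(1.0)` and `run(4.0)` compares a heavier and a lighter tail. The alternating labels around the circle mean no single hyperplane can fit the rare subpopulations, so the linear error should stay well above the 1-nearest-neighbor error when the tail is heavy, and the gap should shrink as mass concentrates on the head subpopulations.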

artificial intelligence, classifier, machine learning

Country:
- Asia
  - Kazakhstan > Akmola Region
    - Astana (0.04)
  - Middle East > Jordan (0.04)
- Europe
  - France (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - United Kingdom > England
    - Greater London > London (0.04)
- North America
  - Canada > Ontario
    - Toronto (0.04)
  - United States
    - Illinois > Cook County
      - Chicago (0.04)
    - Indiana > Allen County
      - Fort Wayne (0.04)
    - Minnesota
      - Saint Louis County > Duluth (0.14)
    - Washington > King County
      - Seattle (0.04)
- South America > Colombia
  - Meta Department > Villavicencio (0.04)

Genre:
- Research Report > New Finding (0.93)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Inductive Learning (0.87)
  - Learning Graphical Models > Directed Networks
    - Bayesian Learning (0.46)
  - Neural Networks (0.94)
  - Statistical Learning (1.00)
