Rethinking Fano's Inequality in Ensemble Learning

Morishita, Terufumi, Morio, Gaku, Horiguchi, Shota, Ozaki, Hiroaki, Nukaga, Nobuo

Nov-16-2023–arXiv.org Machine Learning

We propose a fundamental theory on ensemble learning that answers the central question: what factors make an ensemble system good or bad? Previous studies used a variant of Fano's inequality of information theory and derived a lower bound of the classification error rate on the basis of the accuracy and diversity of models. We revisit the original Fano's inequality and argue that the studies did not take into account the information lost when multiple model predictions are combined into a final prediction. To address this issue, we generalize the previous theory to incorporate the information loss, which we name ...

The central question of ensemble learning has been: what factors make an ensemble system good or bad? It has been widely believed that accurate and diverse models lead to better performance for ensemble systems. Guided by this intuition, many heuristic metrics have been proposed to measure accuracy and diversity (Kohavi et al., 1996; Skalak et al., 1996; Cunningham & Carney, 2000; Shipp & Kuncheva, 2002). However, these metrics lack theoretical grounding, and indeed, Kuncheva & Whitaker (2003) showed empirically, through a broad range of experiments, that there are no connections between the metrics and system performance. Turning to theoretical viewpoints, Geman et al. (1992) decomposed the squared error loss used in regression tasks into the bias and covariance of models. Here, bias corresponds to accuracy and covariance to diversity.
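For orientation, the kind of bound that Fano's inequality yields can be sketched numerically. The snippet below uses the textbook weakened form, P_e >= (H(Y | Y_hat) - 1) / log2(|Y|) with entropies in bits, where H(Y | Y_hat) = H(Y) - I(Y; Y_hat). This is an illustrative assumption: it is neither the variant used in the previous studies nor the generalization proposed in this paper, and the function name and numbers are hypothetical.

```python
import math


def fano_lower_bound(cond_entropy_bits: float, num_classes: int) -> float:
    """Weakened textbook Fano bound on the error rate.

    P_e >= (H(Y | Y_hat) - 1) / log2(|Y|), clipped to [0, 1].
    `cond_entropy_bits` is H(Y | Y_hat) in bits; `num_classes` is |Y|.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    bound = (cond_entropy_bits - 1.0) / math.log2(num_classes)
    # The raw expression can be negative (a vacuous bound), so clip it.
    return min(1.0, max(0.0, bound))


# Hypothetical numbers: 16 equally likely labels, and predictions that
# carry 1 bit of information about the label.
num_classes = 16
label_entropy = math.log2(num_classes)        # H(Y) = 4 bits
mutual_information = 1.0                      # I(Y; Y_hat), assumed
cond_entropy = label_entropy - mutual_information  # H(Y | Y_hat) = 3 bits

bound = fano_lower_bound(cond_entropy, num_classes)
print(f"error rate is at least {bound:.2f}")
```

The more information the predictions carry about the label, the smaller H(Y | Y_hat) and the weaker the lower bound on the error. Any information lost before the final prediction is made raises the bound again.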
