An efficient, provably exact, practical algorithm for the 0-1 loss linear classification problem

He, Xi, Rahman, Waheed Ul, Little, Max A.

Aug-2-2023–arXiv.org Artificial Intelligence

There has been an increasing trend to leverage machine learning (ML) for high-stakes prediction applications that deeply impact human lives. Many of these ML models are "black boxes" with highly complex, inscrutable functional forms. In high-stakes applications such as healthcare and criminal justice, black box ML predictions have incorrectly denied parole [Wexler, 2017], misclassified highly polluted air as safe to breathe [McGough, 2018], and suggested poor allocation of valuable, limited resources in medicine and energy reliability [Varshney and Alemzadeh, 2017]. In such high-stakes applications of ML, we always want the best possible prediction, and we want to know how the model makes these predictions so that we can be confident the predictions are meaningful [Rudin, 2022]. In short, the ideal model is simple enough to be easily understood (interpretable), and optimally accurate (exact). Hence, in high-stakes applications of ML, we always want the best possible prediction, and we want to know how the model makes these predictions so that we can be confident the predictions are meaningful. In short, the ideal model is simple enough to understand and optimally accurate, then our interpretations of the results can be faithful to what our model actually computes. Another compelling reason why simple models are preferable is because such low complexity models usually provide better statistical generality, in the sense that a classifier fit to some training dataset, will work well on another dataset drawn from the same distribution to which we do not have access (works well out-of-sample). The VC dimension is a key measure of the complexity of a classification model.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

Aug-2-2023

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - Georgia > Fulton County
    - Atlanta (0.04)
- Europe > United Kingdom
  - England > West Midlands > Birmingham (0.04)

Genre:
- Research Report > New Finding (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning
    - Optimization (1.00)
    - Mathematical & Statistical Methods (0.67)
  - Machine Learning
    - Statistical Learning (1.00)
    - Computational Learning Theory (0.66)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found