Collaborating Authors

 Chugg, Ben


A unified recipe for deriving (time-uniform) PAC-Bayes bounds

arXiv.org Machine Learning

We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times, not only for a fixed sample size. Our approach combines four tools in the following order: (a) nonnegative supermartingales or reverse submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula (or other convex duality principles), and (d) Ville's inequality. Our main result is a PAC-Bayes theorem which holds for a wide class of discrete stochastic processes. We show how this result implies time-uniform versions of well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester, Maurer, and Catoni, in addition to many recent bounds. We also present several novel bounds. Our framework also enables us to relax traditional assumptions; in particular, we consider nonstationary loss functions and non-i.i.d. data. In sum, we unify the derivation of past bounds and ease the search for future bounds: one may simply check if our supermartingale or submartingale conditions are met and, if so, be guaranteed a (time-uniform) PAC-Bayes bound.
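
To make the recipe concrete, here is a sketch of the generic argument as the abstract describes it (the precise filtration and integrability conditions are stated in the paper). Suppose that for each $\theta$, the process $(M_t(\theta))_{t \ge 0}$ is a nonnegative supermartingale with $M_0(\theta) = 1$, and let $\pi$ be a prior over $\theta$. By the method of mixtures, $\bar{M}_t = \int M_t(\theta)\, d\pi(\theta)$ is again a nonnegative supermartingale with $\bar{M}_0 = 1$. The Donsker-Varadhan formula gives, for every posterior $\rho$ and every time $t$,
$$\mathbb{E}_{\theta \sim \rho}[\log M_t(\theta)] \le \mathrm{KL}(\rho \,\|\, \pi) + \log \bar{M}_t,$$
while Ville's inequality gives $\mathbb{P}(\exists t : \bar{M}_t \ge 1/\delta) \le \delta$. Combining the two, with probability at least $1 - \delta$, simultaneously for all times $t$ and all posteriors $\rho$,
$$\mathbb{E}_{\theta \sim \rho}[\log M_t(\theta)] \le \mathrm{KL}(\rho \,\|\, \pi) + \log(1/\delta).$$
Instantiating $M_t(\theta)$ as, for instance, an exponential process in the gap between population and empirical risk then recovers classical bounds of the kind listed above.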


Time-Uniform Confidence Spheres for Means of Random Vectors

arXiv.org Machine Learning

We derive and study time-uniform confidence spheres - termed confidence sphere sequences (CSSs) - which contain the mean of random vectors with high probability simultaneously across all sample sizes. Inspired by the original work of Catoni and Giulini, we unify and extend their analysis to cover the sequential setting and to handle a variety of distributional assumptions. More concretely, our results include an empirical-Bernstein CSS for bounded random vectors (resulting in a novel empirical-Bernstein confidence interval), a CSS for sub-$\psi$ random vectors, and a CSS for heavy-tailed random vectors based on a sequentially valid Catoni-Giulini estimator. Finally, we provide a version of our empirical-Bernstein CSS that is robust to contamination by Huber noise.
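
To give a flavor of the construction, here is a minimal scalar sketch, assuming the predictable plug-in empirical-Bernstein confidence sequence of Waudby-Smith and Ramdas as the building block; the paper's CSSs extend this shape to random vectors, and the concrete constants below are illustrative, not the paper's.

```python
import numpy as np

def eb_confidence_sequence(x, alpha=0.05, c=0.5):
    """Scalar empirical-Bernstein confidence sequence for the mean of
    [0, 1]-bounded observations, valid simultaneously over all sample
    sizes. Follows the predictable plug-in construction of
    Waudby-Smith and Ramdas; the paper's confidence sphere sequences
    generalize this shape to random vectors."""
    log2a = np.log(2.0 / alpha)
    mu_pred, sig2_pred = 0.5, 0.25       # predictable mean/variance estimates
    sum_lx = sum_l = sum_vpsi = 0.0
    sum_x = sum_sq_dev = 0.0
    lower, upper = [], []
    for i, xi in enumerate(np.asarray(x, dtype=float), start=1):
        # Predictable bet size lambda_i: depends only on x_1, ..., x_{i-1}.
        lam = min(np.sqrt(2.0 * log2a / (sig2_pred * i * np.log(i + 1.0))), c)
        psi = (-np.log(1.0 - lam) - lam) / 4.0       # psi_E(lambda)
        sum_lx += lam * xi
        sum_l += lam
        sum_vpsi += 4.0 * (xi - mu_pred) ** 2 * psi  # variance-adaptive penalty
        center = sum_lx / sum_l
        width = (log2a + sum_vpsi) / sum_l
        lower.append(max(center - width, 0.0))
        upper.append(min(center + width, 1.0))
        # Update the predictable running estimates for round i + 1.
        sum_x += xi
        mu_pred = (0.5 + sum_x) / (i + 1)
        sum_sq_dev += (xi - mu_pred) ** 2
        sig2_pred = (0.25 + sum_sq_dev) / (i + 1)
    return np.array(lower), np.array(upper)
```

The intervals shrink with the running empirical variance rather than a worst-case range, which is the empirical-Bernstein advantage the abstract refers to.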


Auditing Fairness by Betting

arXiv.org Machine Learning

As algorithmic decision-making continues to increase in prevalence across both the private and public sectors [1, 2], there has been an increasing push to scrutinize the fairness of these systems. This has led to an explosion of interest in so-called "algorithmic fairness", and a significant body of work has focused on both defining fairness and training models in fair ways (e.g., [3-5]). However, preventing and redressing harms in real systems also requires the ability to audit models in order to assess their impact; such algorithmic audits are a growing area of focus for researchers and practitioners [6-9]. Auditing may begin during model development [10], but as model behavior often changes throughout real-world deployment in response to distribution shift or model updates [11, 12], it is often necessary to repeatedly audit the performance of algorithmic systems over time [8, 13]. Detecting whether deployed models continue to meet various fairness criteria is of paramount importance to deciding whether an automated decision-making system continues to act reliably and whether intervention is necessary. In this work, we consider the perspective of an auditor or auditing agency tasked with determining if a model deployed "in the wild" is fair or not. Data concerning the system's decisions are gathered over time (perhaps with the purpose of testing fairness, but perhaps for another purpose), and our goal is to determine if there is sufficient evidence to conclude that the system is unfair. If a system is in fact unfair, we want to detect this as early as possible, both in order to avert harms to users and because auditing may require expensive investment to collect or label samples [8, 13]. Following the recent work of Taskesen et al. [14] and Si et al. [15], a natural statistical framework for thinking about this problem is hypothesis testing.
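
The betting approach to such sequential audits can be sketched in a few lines. Assume the auditor observes a stream of bounded statistics whose conditional mean is zero under the fairness null, for example matched-pair differences in positive decision rates between two groups; the clipped running-mean betting strategy below is a simple illustrative choice, not necessarily the paper's.

```python
import numpy as np

def audit_by_betting(diffs, alpha=0.05):
    """Sequential audit of H0: E[d_t] = 0 for bounded differences
    d_t in [-1, 1], e.g., matched-pair differences in positive
    decision rates between two demographic groups.

    Under H0 the wealth process below is a nonnegative martingale
    (the bet lam is predictable), so by Ville's inequality, rejecting
    when wealth >= 1/alpha controls the type-I error at level alpha
    at any data-dependent stopping time."""
    wealth, sum_d = 1.0, 0.0
    for t, d in enumerate(diffs, start=1):
        lam = float(np.clip(sum_d / t, -0.5, 0.5))  # bet on the observed drift
        wealth *= 1.0 + lam * d                     # stays > 0 since |lam * d| <= 1/2
        if wealth >= 1.0 / alpha:
            return t          # reject H0: evidence of unfairness at time t
        sum_d += d
    return None               # auditing continues; no rejection yet
```

Because the guarantee is anytime-valid, the auditor may monitor the wealth continuously and stop the moment it crosses the threshold, which is exactly the "detect unfairness as early as possible" desideratum above.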


Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection

arXiv.org Artificial Intelligence

We introduce a new setting, optimize-and-estimate structured bandits. Here, a policy must select a batch of arms, each characterized by its own context, in a way that both maximizes reward and maintains an accurate (ideally unbiased) population estimate of the reward. This setting is inherent to many public and private sector applications and often requires handling delayed feedback, small data, and distribution shifts. We demonstrate its importance on real data from the United States Internal Revenue Service (IRS). The IRS performs yearly audits of the tax base. Two of its most important objectives are to identify suspected misreporting and to estimate the "tax gap" -- the global difference between the amount paid and true amount owed. Based on a unique collaboration with the IRS, we cast these two processes as a unified optimize-and-estimate structured bandit. We analyze optimize-and-estimate approaches to the IRS problem and propose a novel mechanism for unbiased population estimation that achieves rewards comparable to baseline approaches. This approach has the potential to improve audit efficacy, while maintaining policy-relevant estimates of the tax gap. This has important social consequences given that the current tax gap is estimated at nearly half a trillion dollars. We suggest that this problem setting is fertile ground for further research and we highlight its interesting challenges. The results of this and related research are currently being incorporated into the continual improvement of the IRS audit selection methods.
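
To illustrate the tension this setting resolves, here is a minimal sketch of one selection round, assuming a hypothetical Poisson-sampling mechanism with a uniform exploration floor and a Horvitz-Thompson estimator; the paper's actual mechanism may differ.

```python
import numpy as np

def select_and_estimate(scores, rewards, budget, eps=0.1, rng=None):
    """Sketch of one round: pick a batch guided by predicted scores
    while keeping the population estimate unbiased.

    Each arm i is included independently (Poisson sampling) with
    probability pi_i that mixes the greedy allocation with a uniform
    floor, so pi_i > 0 for every arm; the Horvitz-Thompson estimator
    is then exactly unbiased for the population mean reward. The
    function name, the eps floor, and the sampling scheme are all
    illustrative assumptions, not the paper's mechanism."""
    rng = rng or np.random.default_rng()
    n = len(scores)
    greedy = scores / scores.sum()
    # Inclusion probabilities: expected batch size ~= budget (before clipping).
    pi = np.clip(budget * ((1.0 - eps) * greedy + eps / n), 0.0, 1.0)
    selected = rng.random(n) < pi
    # In deployment, rewards are observed only for the selected arms;
    # E[1{i selected} / pi_i] = 1 is what makes the estimate unbiased.
    ht_mean = np.sum(rewards[selected] / pi[selected]) / n
    return selected, ht_mean
```

The uniform floor sacrifices a little reward to keep every arm's inclusion probability positive, which is the price of an unbiased tax-gap-style population estimate under a reward-maximizing selection policy.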