wealth process
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > France (0.04)
- Health & Medicine (1.00)
- Government (0.93)
- Law (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Data Science > Data Mining (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Eventually LIL Regret: Almost Sure $\ln\ln T$ Regret for a sub-Gaussian Mixture on Unbounded Data
Agrawal, Shubhada, Ramdas, Aaditya
We prove that a classic sub-Gaussian mixture proposed by Robbins in a stochastic setting actually satisfies a path-wise (deterministic) regret bound. For every path in a natural ``Ville event'' $E_α$, this regret till time $T$ is bounded by $\ln^2(1/α)/V_T + \ln (1/α) + \ln \ln V_T$ up to universal constants, where $V_T$ is a nonnegative, nondecreasing, cumulative variance process. (The bound reduces to $\ln(1/α) + \ln \ln V_T$ if $V_T \geq \ln(1/α)$.) If the data were stochastic, then one can show that $E_α$ has probability at least $1-α$ under a wide class of distributions (eg: sub-Gaussian, symmetric, variance-bounded, etc.). In fact, we show that on the Ville event $E_0$ of probability one, the regret on every path in $E_0$ is eventually bounded by $\ln \ln V_T$ (up to constants). We explain how this work helps bridge the world of adversarial online learning (which usually deals with regret bounds for bounded data), with game-theoretic statistics (which can handle unbounded data, albeit using stochastic assumptions). In short, conditional regret bounds serve as a bridge between stochastic and adversarial betting.
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > France (0.04)
- Health & Medicine (1.00)
- Government (0.93)
- Law (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Data Science > Data Mining (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
On Continuous Monitoring of Risk Violations under Unknown Shift
Timans, Alexander, Verma, Rajeev, Nalisnick, Eric, Naesseth, Christian A.
Machine learning systems deployed in the real world must operate under dynamic and often unpredictable distribution shifts. This challenges the validity of statistical safety assurances on the system's risk established beforehand. Common risk control frameworks rely on fixed assumptions and lack mechanisms to continuously monitor deployment reliability. In this work, we propose a general framework for the real-time monitoring of risk violations in evolving data streams. Leveraging the 'testing by betting' paradigm, we propose a sequential hypothesis testing procedure to detect violations of bounded risks associated with the model's decision-making mechanism, while ensuring control on the false alarm rate. Our method operates under minimal assumptions on the nature of encountered shifts, rendering it broadly applicable. We illustrate the effectiveness of our approach by monitoring risks in outlier detection and set prediction under a variety of shifts.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Middle East > Jordan (0.04)
On the near-optimality of betting confidence sets for bounded means
Shekhar, Shubhanshu, Ramdas, Aaditya
Constructing nonasymptotic confidence intervals (CIs) for the mean of a univariate distribution from independent and identically distributed (i.i.d.) observations is a fundamental task in statistics. For bounded observations, a classical nonparametric approach proceeds by inverting standard concentration bounds, such as Hoeffding's or Bernstein's inequalities. Recently, an alternative betting-based approach for defining CIs and their time-uniform variants called confidence sequences (CSs), has been shown to be empirically superior to the classical methods. In this paper, we provide theoretical justification for this improved empirical performance of betting CIs and CSs. Our main contributions are as follows: (i) We first compare CIs using the values of their first-order asymptotic widths (scaled by $\sqrt{n}$), and show that the betting CI of Waudby-Smith and Ramdas (2023) has a smaller limiting width than existing empirical Bernstein (EB)-CIs. (ii) Next, we establish two lower bounds that characterize the minimum width achievable by any method for constructing CIs/CSs in terms of certain inverse information projections. (iii) Finally, we show that the betting CI and CS match the fundamental limits, modulo an additive logarithmic term and a multiplicative constant. Overall these results imply that the betting CI~(and CS) admit stronger theoretical guarantees than the existing state-of-the-art EB-CI~(and CS); both in the asymptotic and finite-sample regimes.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Deep anytime-valid hypothesis testing
Pandeva, Teodora, Forré, Patrick, Ramdas, Aaditya, Shekhar, Shubhanshu
We propose a general framework for constructing powerful, sequential hypothesis tests for a large class of nonparametric testing problems. The null hypothesis for these problems is defined in an abstract form using the action of two known operators on the data distribution. This abstraction allows for a unified treatment of several classical tasks, such as two-sample testing, independence testing, and conditional-independence testing, as well as modern problems, such as testing for adversarial robustness of machine learning (ML) models. Our proposed framework has the following advantages over classical batch tests: 1) it continuously monitors online data streams and efficiently aggregates evidence against the null, 2) it provides tight control over the type I error without the need for multiple testing correction, 3) it adapts the sample size requirement to the unknown hardness of the problem. We develop a principled approach of leveraging the representation capability of ML models within the testing-by-betting framework, a game-theoretic approach for designing sequential tests. Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines on several tasks.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Auditing Fairness by Betting
Chugg, Ben, Cortes-Gomez, Santiago, Wilder, Bryan, Ramdas, Aaditya
As algorithmic decision-making continues to increase in prevalence across both the private and public sectors [1, 2], there has been an increasing push to scrutinize the fairness of these systems. This has lead to an explosion of interest in so-called "algorithmic fairness", and a significant body of work has focused on both defining fairness and training models in fair ways (e.g., [3-5]). However, preventing and redressing harms in real systems also requires the ability to audit models in order to assess their impact; such algorithmic audits are an increasing area of focus for researchers and practitioners [6-9]. Auditing may begin during model development [10], but as model behavior often changes over time throughout real-world deployment in response to distribution shift or model updates [11, 12], it is often necessary to repeatedly audit the performance of algorithmic systems over time [8, 13]. Detecting whether deployed models continue to meet various fairness criteria is of paramount importance to deciding whether an automated decision-making system continues to act reliably and whether intervention is necessary. In this work, we consider the perspective of an auditor or auditing agency tasked with determining if a model deployed "in the wild" is fair or not. Data concerning the system's decisions are gathered over time (perhaps with the purpose of testing fairness but perhaps for another purpose) and our goal is to determine if there is sufficient evidence to conclude that the system is unfair. If a system is in fact unfair, we want to determine so as early as possible, both in order to avert harms to users and because auditing may require expensive investment to collect or label samples [8, 13]. Following the recent work of Taskesen et al. [14] and Si et al. [15], a natural statistical framework for thinking about this problem is hypothesis testing.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Virginia (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > France (0.04)
- Health & Medicine (1.00)
- Government (1.00)
Sequential Predictive Two-Sample and Independence Testing
Podkopaev, Aleksandr, Ramdas, Aaditya
We study the problems of sequential nonparametric two-sample and independence testing. Sequential tests process data online and allow using observed data to decide whether to stop and reject the null hypothesis or to collect more data, while maintaining type I error control. We build upon the principle of (nonparametric) testing by betting, where a gambler places bets on future observations and their wealth measures evidence against the null hypothesis. While recently developed kernel-based betting strategies often work well on simple distributions, selecting a suitable kernel for high-dimensional or structured data, such as images, is often nontrivial. To address this drawback, we design prediction-based betting strategies that rely on the following fact: if a sequentially updated predictor starts to consistently determine (a) which distribution an instance is drawn from, or (b) whether an instance is drawn from the joint distribution or the product of the marginal distributions (the latter produced by external randomization), it provides evidence against the two-sample or independence nulls respectively. We empirically demonstrate the superiority of our tests over kernel-based approaches under structured settings. Our tests can be applied beyond the case of independent and identically distributed data, remaining valid and powerful even when the data distribution drifts over time.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)