AITopics | Kuchibhotla, Arun Kumar

Extrapolated cross-validation for randomized ensembles

Du, Jin-Hong, Patil, Pratik, Roeder, Kathryn, Kuchibhotla, Arun Kumar

arXiv.org Machine Learning, Dec-15-2023

Ensemble methods such as bagging and random forests are ubiquitous in various fields, from finance to genomics. Despite their prevalence, the question of the efficient tuning of ensemble parameters has received relatively little attention. This paper introduces a cross-validation method, ECV (Extrapolated Cross-Validation), for tuning the ensemble and subsample sizes in randomized ensembles. Our method builds on two primary ingredients: initial estimators for small ensemble sizes using out-of-bag errors and a novel risk extrapolation technique that leverages the structure of prediction risk decomposition. By establishing uniform consistency of our risk extrapolation technique over ensemble and subsample sizes, we show that ECV yields $\delta$-optimal (with respect to the oracle-tuned risk) ensembles for squared prediction risk. Our theory accommodates general ensemble predictors, only requires mild moment assumptions, and allows for high-dimensional regimes where the feature dimension grows with the sample size. As a practical case study, we employ ECV to predict surface protein abundances from gene expressions in single-cell multiomics using random forests. In comparison to sample-split cross-validation and $K$-fold cross-validation, ECV achieves higher accuracy by avoiding sample splitting. At the same time, its computational cost is considerably lower owing to the use of the risk extrapolation technique. Additional numerical results validate the finite-sample accuracy of ECV for several common ensemble predictors under a computational constraint on the maximum ensemble size.
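
The risk extrapolation idea in this abstract can be illustrated with a minimal sketch. It assumes that the squared prediction risk of an average of M exchangeable component predictors has the form a + b/M, so that estimates at ensemble sizes 1 and 2 determine the whole curve. The function name `extrapolate_risk` and the numbers below are hypothetical, and the out-of-bag estimation of the two inputs is not shown; this is not the authors' implementation.

```python
def extrapolate_risk(risk_1, risk_2, ensemble_size):
    """Extrapolate squared risk to a larger ensemble size.

    Assumes risk(M) = a + b / M. Solving with the two known points
    risk(1) = risk_1 and risk(2) = risk_2 gives
        a = 2 * risk_2 - risk_1   (limiting risk as M grows)
        b = 2 * (risk_1 - risk_2)
    """
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    limiting_risk = 2.0 * risk_2 - risk_1
    slope = 2.0 * (risk_1 - risk_2)
    return limiting_risk + slope / ensemble_size


# Hypothetical risk estimates for ensembles of size 1 and 2.
r1, r2 = 1.0, 0.6
curve = [extrapolate_risk(r1, r2, m) for m in (1, 2, 10, 100)]
```

Under this assumed form the curve is monotone in M, so the smallest ensemble size whose extrapolated risk is within a tolerance of the limiting value 2 * risk_2 - risk_1 can be read off without fitting large ensembles.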

artificial intelligence, machine learning, predictor, (20 more...)

doi: 10.1080/10618600.2023.2288194

arXiv: 2302.13511

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (1.00)

Bagging in overparameterized learning: Risk characterization and risk monotonization

Patil, Pratik, Du, Jin-Hong, Kuchibhotla, Arun Kumar

arXiv.org Machine Learning, Oct-24-2023

Bagging is a commonly used ensemble technique in statistics and machine learning to improve the performance of prediction procedures. In this paper, we study the prediction risk of variants of bagged predictors under the proportional asymptotics regime, in which the ratio of the number of features to the number of observations converges to a constant. Specifically, we propose a general strategy to analyze the prediction risk under squared error loss of bagged predictors using classical results on simple random sampling. Specializing the strategy, we derive the exact asymptotic risk of the bagged ridge and ridgeless predictors with an arbitrary number of bags under a well-specified linear model with arbitrary feature covariance matrices and signal vectors. Furthermore, we prescribe a generic cross-validation procedure to select the optimal subsample size for bagging and discuss its utility to eliminate the non-monotonic behavior of the limiting risk in the sample size (i.e., double or multiple descents). In demonstrating the proposed procedure for bagged ridge and ridgeless predictors, we thoroughly investigate the oracle properties of the optimal subsample size and provide an in-depth comparison between different bagging variants.

artificial intelligence, machine learning, predictor, (19 more...)

arXiv: 2210.11445

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > Alameda County > Berkeley (0.13)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation

Du, Jin-Hong, Patil, Pratik, Kuchibhotla, Arun Kumar

arXiv.org Artificial Intelligence, Jul-16-2023

We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio converges to a constant. By analyzing the squared prediction risk of ridge ensembles as a function of the explicit penalty $\lambda$ and the limiting subsample aspect ratio $\phi_s$ (the ratio of the feature size to the subsample size), we characterize contours in the $(\lambda, \phi_s)$-plane at any achievable risk. As a consequence, we prove that the risk of the optimal full ridgeless ensemble (fitted on all possible subsamples) matches that of the optimal ridge predictor. In addition, we prove strong uniform consistency of generalized cross-validation (GCV) over the subsample sizes for estimating the prediction risk of ridge ensembles. This allows for GCV-based tuning of full ridgeless ensembles without sample splitting and yields a predictor whose risk matches optimal ridge risk.
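
As a toy illustration of a subsample ridge ensemble, the sketch below averages ridge coefficients fitted on random subsamples drawn without replacement. It is deliberately reduced to a single feature with no intercept, and the names `ridge_1d` and `subsample_ridge_ensemble` are hypothetical; it does not reproduce the paper's asymptotic analysis or its GCV tuning.

```python
import random


def ridge_1d(xs, ys, lam):
    """Ridge coefficient for one feature, no intercept.

    Minimizes (1/k) * sum((y - b*x)^2) + lam * b^2 over b,
    where k is the number of points.
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + len(xs) * lam)


def subsample_ridge_ensemble(xs, ys, k, num_bags, lam, seed=0):
    """Average ridge coefficients over num_bags subsamples of size k."""
    rng = random.Random(seed)
    n = len(xs)
    coefs = []
    for _ in range(num_bags):
        idx = rng.sample(range(n), k)  # subsample without replacement
        coefs.append(ridge_1d([xs[i] for i in idx],
                              [ys[i] for i in idx], lam))
    return sum(coefs) / num_bags


# Noise-free toy data with true coefficient 2.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
ridgeless = subsample_ridge_ensemble(xs, ys, k=2, num_bags=5, lam=0.0)
shrunk = subsample_ridge_ensemble(xs, ys, k=2, num_bags=5, lam=1.0)
```

With `lam=0.0` each component is a least squares fit on its subsample; a positive `lam` shrinks every component, and hence the average, toward zero.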

artificial intelligence, equivalence and generalized, machine learning, (13 more...)

arXiv: 2304.13016

Country:

North America > United States > California (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.63)

Post-selection Inference for Conformal Prediction: Trading off Coverage for Precision

Sarkar, Siddhaarth, Kuchibhotla, Arun Kumar

arXiv.org Machine Learning, Jun-30-2023

Conformal inference has played a pivotal role in providing uncertainty quantification for black-box ML prediction algorithms with finite sample guarantees. Traditionally, conformal prediction inference requires a data-independent specification of miscoverage level. In practical applications, one might want to update the miscoverage level after computing the prediction set. For example, in the context of binary classification, the analyst might start with 95$\%$ prediction sets and see that most prediction sets contain all outcome classes. Because prediction sets containing both classes are undesirable, the analyst might prefer to consider, say, 80$\%$ prediction sets. Construction of prediction sets that guarantee coverage with data-dependent miscoverage level can be considered as a post-selection inference problem. In this work, we develop simultaneous conformal inference to account for data-dependent miscoverage levels. Under the assumption of independent and identically distributed observations, our proposed methods have a finite sample simultaneous guarantee over all miscoverage levels. This allows practitioners to freely trade coverage probability for the quality of the prediction set by any criterion of their choice (say, size of the prediction set) while maintaining finite sample guarantees similar to those of traditional conformal inference.
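
For context, the traditional fixed-level procedure that this abstract starts from can be sketched as standard split conformal prediction. The sketch assumes absolute-residual conformity scores computed on a held-out calibration set and a miscoverage level chosen before looking at the data; the function names are hypothetical, and the paper's simultaneous (data-dependent level) method is not implemented here.

```python
import math


def split_conformal_quantile(calibration_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that index exceeds n, the only valid threshold is infinity.
    """
    n = len(calibration_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(calibration_scores)[k - 1]


def prediction_interval(point_prediction, threshold):
    """Interval of all y whose absolute residual is at most threshold."""
    return (point_prediction - threshold, point_prediction + threshold)


# Hypothetical absolute residuals |y_i - f(x_i)| on a calibration set.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
q80 = split_conformal_quantile(scores, alpha=0.2)
q95 = split_conformal_quantile(scores, alpha=0.05)
interval_80 = prediction_interval(3.0, q80)
```

The coverage guarantee of this construction holds for a single `alpha` fixed in advance; choosing `alpha` after inspecting the resulting sets is exactly the post-selection problem the abstract describes.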

artificial intelligence, machine learning, prediction, (16 more...)

arXiv: 2304.06158

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Mitigating multiple descents: A model-agnostic framework for risk monotonization

Patil, Pratik, Kuchibhotla, Arun Kumar, Wei, Yuting, Rinaldo, Alessandro

arXiv.org Machine Learning, May-25-2022

Recent empirical and theoretical analyses of several commonly used prediction procedures reveal a peculiar risk behavior in high dimensions, referred to as double/multiple descent, in which the asymptotic risk is a non-monotonic function of the limiting aspect ratio of the number of features or parameters to the sample size. To mitigate this undesirable behavior, we develop a general framework for risk monotonization based on cross-validation that takes as input a generic prediction procedure and returns a modified procedure whose out-of-sample prediction risk is, asymptotically, monotonic in the limiting aspect ratio. As part of our framework, we propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting, respectively, and show that, under very mild assumptions, they provably achieve monotonic asymptotic risk behavior. Our results are applicable to a broad variety of prediction procedures and loss functions, and do not require a well-specified (parametric) model. We exemplify our framework with concrete analyses of the minimum $\ell_2$-norm and minimum $\ell_1$-norm least squares prediction procedures. As one of the ingredients in our analysis, we also derive novel additive and multiplicative forms of oracle risk inequalities for split cross-validation that are of independent interest.

artificial intelligence, machine learning, mitigating multiple descent, (2 more...)

arXiv: 2205.12937

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.73)

Improving Fairness in Criminal Justice Algorithmic Risk Assessments Using Conformal Prediction Sets

Berk, Richard A., Kuchibhotla, Arun Kumar

arXiv.org Machine Learning, Aug-26-2020

Risk assessment algorithms have been correctly criticized for potential unfairness, and there is an active cottage industry trying to make repairs. In this paper, we adopt a framework from conformal prediction sets to remove unfairness from risk algorithms themselves and the covariates used for forecasting. From a sample of 300,000 offenders at their arraignments, we construct a confusion table and its derived measures of fairness that are effectively free of any meaningful differences between Black and White offenders. We also produce fair forecasts for individual offenders coupled with valid probability guarantees that the forecasted outcome is the true outcome. We see our work as a demonstration of concept for application in a wide variety of criminal justice decisions. The procedures provided can be routinely implemented in jurisdictions with the usual criminal justice datasets used by administrators. The requisite procedures can be found in the scripting software R. However, whether stakeholders will accept our approach as a means to achieve risk assessment fairness is unknown. There are also legal issues that would need to be resolved, although we offer a Pareto improvement.

artificial intelligence, data mining, offender, (19 more...)

arXiv: 2008.11664

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (0.91)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Security & Privacy (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Near-Optimal Confidence Sequences for Bounded Random Variables

Kuchibhotla, Arun Kumar, Zheng, Qinqing

arXiv.org Artificial Intelligence, Jun-8-2020

Many inference problems, such as sequential decision problems like A/B testing and adaptive sampling schemes like bandit selection, are often online in nature. The fundamental problem for online inference is to provide a sequence of confidence intervals that are valid uniformly over the growing-into-infinity sample sizes. To address this question, we provide a near-optimal confidence sequence for bounded random variables by utilizing Bentkus' concentration results. We show that it improves on the existing approaches that use the Cramér-Chernoff technique such as the Hoeffding, Bernstein, and Bennett inequalities. The resulting confidence sequence is confirmed to be favorable in both synthetic coverage problems and an application to adaptive stopping algorithms.
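
To make the notion of a confidence sequence concrete, here is a minimal sketch of a simple baseline of the Hoeffding type that the abstract compares against, not the Bentkus-based construction of the paper. It assumes independent observations in [0, 1] with a common mean and spends the error budget over time with a union bound, using alpha_n = 6 * alpha / (pi^2 * n^2), which sums to alpha over all n. The function name is hypothetical.

```python
import math


def hoeffding_confidence_sequence(observations, alpha=0.05):
    """Intervals that cover the mean at all times with prob. >= 1 - alpha.

    At time n, Hoeffding's inequality at level alpha_n gives the radius
    sqrt(log(2 / alpha_n) / (2 * n)); a union bound over n yields
    uniform-in-time validity because the alpha_n sum to alpha.
    """
    intervals = []
    running_total = 0.0
    for n, x in enumerate(observations, start=1):
        running_total += x
        mean = running_total / n
        alpha_n = 6.0 * alpha / (math.pi ** 2 * n ** 2)
        radius = math.sqrt(math.log(2.0 / alpha_n) / (2.0 * n))
        # Clip to [0, 1], the known range of the observations.
        intervals.append((max(0.0, mean - radius), min(1.0, mean + radius)))
    return intervals


# Degenerate toy stream with mean 0.5, used only to show the widths shrink.
seq = hoeffding_confidence_sequence([0.5] * 200, alpha=0.05)
width_at_20 = seq[19][1] - seq[19][0]
width_at_200 = seq[199][1] - seq[199][0]
```

The union bound makes this baseline conservative, which is the kind of slack that sharper concentration results aim to reduce.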

artificial intelligence, confidence sequence, data mining, (20 more...)

arXiv: 2006.05022

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Data Science > Data Mining (0.67)

Exchangeability, Conformal Prediction, and Rank Tests

Kuchibhotla, Arun Kumar

arXiv.org Machine Learning, May-12-2020

Although these two concepts are very closely related, the fact that exchangeability allows for a specific type of dependence between the random variables leads to numerous implications/applications of this concept. One of the most important implications of exchangeability is that the indexing of random variables is immaterial. In technical words, this means that the ranks of real-valued exchangeable random variables are uniform over the set of all permutations. Just this one implication has pioneered two very different fields in statistics and machine learning, namely, nonparametric rank tests and conformal prediction. The main purpose of this article is to define exchangeability, discuss its implications (rigorously), and then exposit the uses of this concept for conformal prediction and rank tests. To our knowledge, conformal prediction (starting from Vovk et al. (2005)) is the first field to apply the full strength of exchangeability.
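
The rank implication stated in this abstract can be checked by brute force on a small example. The sketch below enumerates every ordering of four distinct values, treating each ordering as equally likely (which is what exchangeability gives for distinct values), and counts the rank of the last variable; it then shows the rank-based p-value used in conformal prediction and rank tests. The function names and the numbers are hypothetical.

```python
from collections import Counter
from itertools import permutations


def rank_of_last(values):
    """Rank (1 = smallest) of the final entry among all entries."""
    return 1 + sum(v < values[-1] for v in values[:-1])


def rank_p_value(reference_scores, new_score):
    """Fraction of the n + 1 scores that are at least new_score."""
    n = len(reference_scores)
    at_least = sum(1 for s in reference_scores if s >= new_score)
    return (1 + at_least) / (n + 1)


# All 24 orderings of four distinct values, each equally likely.
rank_counts = Counter(rank_of_last(p)
                      for p in permutations([0.3, 1.2, 2.5, 4.0]))

# p-values for a new score against four reference scores.
p_large = rank_p_value([1, 2, 3, 4], 5)
p_middle = rank_p_value([1, 2, 3, 4], 2.5)
```

Each rank from 1 to 4 occurs equally often across the orderings, so the p-value is uniform on its grid and falls at or below a level alpha with probability at most alpha.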

cardiology, random variable, vascular disease, (22 more...)

arXiv: 2005.06095

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.67)
Health & Medicine > Therapeutic Area > Endocrinology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Deterministic Inequalities for Smooth M-estimators

Kuchibhotla, Arun Kumar

arXiv.org Machine Learning, Sep-13-2018

Ever since the proof of asymptotic normality of the maximum likelihood estimator by Cramér (1946), it has been understood that the basic technique of Taylor series expansion suffices for the asymptotics of $M$-estimators with smooth/differentiable loss functions. Although the Taylor series expansion is a purely deterministic tool, the realization that the asymptotic normality results can also be made deterministic (and so finite sample) has received far less attention. With the advent of big data and high-dimensional statistics, the need for finite sample results has increased. In this paper, we use the (well-known) Banach fixed point theorem to derive various deterministic inequalities that lead to the classical results when studied under randomness. In addition, we provide applications of these deterministic inequalities for cross-validation/subsampling, marginal screening and uniform-in-submodel results that are very useful for post-selection inference and in the study of post-regularization estimators. Our results apply to many classical estimators, in particular, generalized linear models, non-linear regression and the Cox proportional hazards model. Extensions to non-smooth and constrained problems are also discussed.

artificial intelligence, deterministic inequality, machine learning, (15 more...)

arXiv: 1809.05172

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Moving Beyond Sub-Gaussianity in High-Dimensional Statistics: Applications in Covariance Estimation and Linear Regression

Kuchibhotla, Arun Kumar, Chakrabortty, Abhishek

arXiv.org Machine Learning, Apr-7-2018

Concentration inequalities form an essential toolkit in the study of high-dimensional statistical methods. Most of the relevant statistics literature is based on the assumptions of sub-Gaussian/sub-exponential random vectors. In this paper, we bring together various probability inequalities for sums of independent random variables under much weaker exponential type (sub-Weibull) tail assumptions. These results extract a partly sub-Gaussian tail behavior in finite samples, matching the asymptotics governed by the central limit theorem, and are compactly represented in terms of a new Orlicz quasi-norm (the Generalized Bernstein-Orlicz norm) that typifies such tail behaviors. We illustrate the usefulness of these inequalities through the analysis of four fundamental problems in high-dimensional statistics. In the first two problems, we study the rate of convergence of the sample covariance matrix in terms of the maximum elementwise norm and the maximum k-sub-matrix operator norm, which are key quantities of interest in bootstrap procedures and high-dimensional structured covariance matrix estimation. The third example concerns the restricted eigenvalue condition, required in high-dimensional linear regression, which we verify for all sub-Weibull random vectors under only marginal (not joint) tail assumptions on the covariates. To our knowledge, this is the first unified result obtained in such generality. In the final example, we consider the Lasso estimator for linear regression and establish its rate of convergence under much weaker tail assumptions (on the errors as well as the covariates) than those in the existing literature. The common feature in all our results is that the convergence rates under most exponential tails match the usual ones under sub-Gaussian assumptions. Finally, we also establish a high-dimensional CLT and tail bounds for empirical processes for sub-Weibulls.

artificial intelligence, inequality, survey article, (17 more...)

arXiv: 1804.02605

Country: North America > United States > Pennsylvania (0.14)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
