AITopics | Duchi, John C.

Collaborating Authors

Duchi, John C.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On Privately Estimating a Single Parameter

Asi, Hilal, Duchi, John C., Talwar, Kunal

arXiv.org Artificial IntelligenceMar-21-2025

We investigate differentially private estimators for individual parameters within larger parametric models. While generic private estimators exist, the estimators we provide repose on new local notions of estimand stability, and these notions allow procedures that provide private certificates of their own stability. By leveraging these private certificates, we provide computationally and statistical efficient mechanisms that release private statistics that are, at least asymptotically in the sample size, essentially unimprovable: they achieve instance optimal bounds. Additionally, we investigate the practicality of the algorithms both in simulated data and in real-world data from the American Community Survey and US Census, highlighting scenarios in which the new procedures are successful and identifying areas for future work.

artificial intelligence, machine learning, reg, (19 more...)

arXiv.org Artificial Intelligence

2503.17252

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Add feedback

Predictive Inference in Multi-environment Scenarios

Duchi, John C., Gupta, Suyash, Jiang, Kuanhao, Sur, Pragya

arXiv.org Machine LearningMar-24-2024

We address the challenge of constructing valid confidence intervals and sets in problems of prediction across multiple environments. We investigate two types of coverage suitable for these problems, extending the jackknife and split-conformal methods to show how to obtain distribution-free coverage in such non-traditional, hierarchical data-generating scenarios. Our contributions also include extensions for settings with non-real-valued responses and a theory of consistency for predictive inference in these general problems. We demonstrate a novel resizing method to adapt to problem difficulty, which applies both to existing approaches for predictive inference with hierarchical data and the methods we develop; this reduces prediction set sizes using limited information from the test environment, a key to the methods' practical performance, which we evaluate through neurochemical sensing and species classification datasets.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2403.16336

Country: North America > United States > California > Santa Clara County > Palo Alto (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

PPI++: Efficient Prediction-Powered Inference

Angelopoulos, Anastasios N., Duchi, John C., Zrnic, Tijana

arXiv.org Machine LearningNov-2-2023

We present PPI++: a computationally lightweight methodology for estimation and inference based on a small labeled dataset and a typically much larger dataset of machine-learning predictions. The methods automatically adapt to the quality of available predictions, yielding easy-to-compute confidence sets -- for parameters of any dimensionality -- that always improve on classical intervals using only the labeled data. PPI++ builds on prediction-powered inference (PPI), which targets the same problem setting, improving its computational and statistical efficiency. Real and synthetic experiments demonstrate the benefits of the proposed adaptations.

artificial intelligence, inference, machine learning, (17 more...)

arXiv.org Machine Learning

2311.01453

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

The Lifecycle of a Statistical Model: Model Failure Detection, Identification, and Refitting

Ali, Alnur, Cauchois, Maxime, Duchi, John C.

arXiv.org Machine LearningFeb-8-2022

The statistical machine learning community has demonstrated considerable resourcefulness over the years in developing highly expressive tools for estimation, prediction, and inference. The bedrock assumptions underlying these developments are that the data comes from a fixed population and displays little heterogeneity. But reality is significantly more complex: statistical models now routinely fail when released into real-world systems and scientific applications, where such assumptions rarely hold. Consequently, we pursue a different path in this paper vis-a-vis the well-worn trail of developing new methodology for estimation and prediction. In this paper, we develop tools and theory for detecting and identifying regions of the covariate space (subpopulations) where model performance has begun to degrade, and study intervening to fix these failures through refitting. We present empirical results with three real-world data sets -- including a time series involving forecasting the incidence of COVID-19 -- showing that our methodology generates interpretable results, is useful for tracking model performance, and can boost model performance through refitting. We complement these empirical results with theory proving that our methodology is minimax optimal for recovering anomalous subpopulations as well as refitting to improve accuracy in a structured normal means setting.

artificial intelligence, data mining, machine learning, (23 more...)

arXiv.org Machine Learning

2202.04166

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Epidemiology (0.48)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.35)
Health & Medicine > Therapeutic Area > Immunology (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.45)

Add feedback

Accelerated, Optimal, and Parallel: Some Results on Model-Based Stochastic Optimization

Chadha, Karan, Cheng, Gary, Duchi, John C.

arXiv.org Machine LearningJan-7-2021

We extend the Approximate-Proximal Point (aProx) family of model-based methods for solving stochastic convex optimization problems, including stochastic subgradient, proximal point, and bundle methods, to the minibatch and accelerated setting. To do so, we propose specific model-based algorithms and an acceleration scheme for which we provide non-asymptotic convergence guarantees, which are order-optimal in all problem-dependent constants and provide linear speedup in minibatch size, while maintaining the desirable robustness traits (e.g. to stepsize) of the aProx family. Additionally, we show improved convergence rates and matching lower bounds identifying new fundamental constants for "interpolation" problems, whose importance in statistical machine learning is growing; this, for example, gives a parallelization strategy for alternating projections. We corroborate our theoretical results with empirical testing to demonstrate the gains accurate modeling, acceleration, and minibatching provide.

accelerated, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2101.02696

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Large-Scale Methods for Distributionally Robust Optimization

Levy, Daniel, Carmon, Yair, Duchi, John C., Sidford, Aaron

arXiv.org Machine LearningOct-12-2020

We propose and analyze algorithms for distributionally robust optimization of convex losses with conditional value at risk (CVaR) and $\chi^2$ divergence uncertainty sets. We prove that our algorithms require a number of gradient evaluations independent of training set size and number of parameters, making them suitable for large-scale applications. For $\chi^2$ uncertainty sets these are the first such guarantees in the literature, and for CVaR our guarantees scale linearly in the uncertainty level rather than quadratically as in previous work. We also provide lower bounds proving the worst-case optimality of our algorithms for CVaR and a penalized version of the $\chi^2$ problem. Our primary technical contributions are novel bounds on the bias of batch robust risk estimation and the variance of a multilevel Monte Carlo gradient estimator due to [Blanchet & Glynn, 2015]. Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.

cvar, neural network, optimization problem, (19 more...)

arXiv.org Machine Learning

2010.05893

Country:

Europe > United Kingdom > England (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Add feedback

Robust Validation: Confident Predictions Even When Distributions Shift

Cauchois, Maxime, Gupta, Suyash, Ali, Alnur, Duchi, John C.

arXiv.org Machine LearningAug-10-2020

While the traditional viewpoint in machine learning and statistics assumes training and testing samples come from the same population, practice belies this fiction. One strategy---coming from robust statistics and optimization---is thus to build a model robust to distributional perturbations. In this paper, we take a different approach to describe procedures for robust predictive inference, where a model provides uncertainty estimates on its predictions rather than point predictions. We present a method that produces prediction sets (almost exactly) giving the right coverage level for any test distribution in an $f$-divergence ball around the training population. The method, based on conformal inference, achieves (nearly) valid coverage in finite samples, under only the condition that the training data be exchangeable. An essential component of our methodology is to estimate the amount of expected future data shift and build robustness to it; we develop estimators and prove their consistency for protection and validity of uncertainty estimates under shifts. By experimenting on several large-scale benchmark datasets, including Recht et al.'s CIFAR-v4 and ImageNet-V2 datasets, we provide complementary empirical results that highlight the importance of robust predictive validity.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Machine Learning

2008.04267

Country:

North America > United States > California (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

Arjevani, Yossi, Carmon, Yair, Duchi, John C., Foster, Dylan J., Sekhari, Ayush, Sridharan, Karthik

arXiv.org Machine LearningJun-24-2020

We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$) using $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed. We prove a lower bound which establishes that this rate is optimal and---surprisingly---that it cannot be improved using stochastic $p$th order methods for any $p\ge 2$, even when the first $p$ derivatives of the objective are Lipschitz. Together, these results characterize the complexity of non-convex stochastic optimization with second-order methods and beyond. Expanding our scope to the oracle complexity of finding $(\epsilon,\gamma)$-approximate second-order stationary points, we establish nearly matching upper and lower bounds for stochastic second-order methods. Our lower bounds here are novel even in the noiseless case.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Machine Learning

2006.13476

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

Add feedback

Unsupervised Transformation Learning via Convex Relaxations

Hashimoto, Tatsunori B., Liang, Percy S., Duchi, John C.

Neural Information Processing SystemsFeb-14-2020, 19:41:08 GMT

Our goal is to extract meaningful transformations from raw images, such as varying the thickness of lines in handwriting or the lighting in a portrait. We propose an unsupervised approach to learn such transformations by attempting to reconstruct an image from a linear combination of transformations of its nearest neighbors. On handwritten digits and celebrity portraits, we show that even with linear transformations, our method generates visually high-quality modified images. Moreover, since our method is semiparametric and does not model the data distribution, the learned transformations extrapolate off the training data and can be applied to new types of images. Papers published at the Neural Information Processing Systems Conference.

artificial intelligence, machine learning, transformation, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Add feedback

Necessary and Sufficient Conditions for Adaptive, Mirror, and Standard Gradient Methods

Levy, Daniel, Duchi, John C.

arXiv.org Machine LearningSep-23-2019

We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We further provide a converse that shows that when the constraints are not quadratically convex---for example, any $\ell_p$-ball for $p < 2$---the methods are far from optimal. Based on this, we can provide concrete recommendations for when one should use adaptive, mirror or stochastic gradient methods.

artificial intelligence, machine learning, null, (17 more...)

arXiv.org Machine Learning

1909.10455

Country:

North America > United States (0.14)
Asia > Middle East (0.14)

Genre: Research Report (0.64)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.89)

Add feedback