Block, Adam
The Sample Complexity of Approximate Rejection Sampling with Applications to Smoothed Online Learning
Block, Adam, Polyanskiy, Yury
Suppose we are given access to $n$ independent samples from a distribution $\mu$ and we wish to output one of them with the goal of making the output distributed as close as possible to a target distribution $\nu$. In this work we show that the optimal total variation distance as a function of $n$ is given by $\tilde\Theta(\frac{D}{f'(n)})$ over the class of all pairs $\nu,\mu$ with a bounded $f$-divergence $D_f(\nu\|\mu)\leq D$. Previously, this question was studied only for the case when the Radon-Nikodym derivative of $\nu$ with respect to $\mu$ is uniformly bounded. We then consider an application in the seemingly very different field of smoothed online learning, where we show that recent results on the minimax regret and the regret of oracle-efficient algorithms still hold even under relaxed constraints on the adversary (to have bounded $f$-divergence, as opposed to bounded Radon-Nikodym derivative). Finally, we also study the efficacy of importance sampling for mean estimates uniform over a function class and compare importance sampling with rejection sampling.
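To fix ideas, here is a minimal Python sketch of selecting one of $n$ draws from $\mu$ by rejection; the likelihood-ratio function `ratio`, the truncation level `cap`, and the fallback to the last sample are illustrative assumptions, not the scheme analyzed in the paper.

```python
import numpy as np

def approx_rejection_sample(samples, ratio, cap, rng=None):
    """Return one of `samples` (i.i.d. draws from mu) so that its law approximates nu.

    `ratio(x)` is assumed to evaluate the Radon-Nikodym derivative d(nu)/d(mu) at x;
    `cap` is a hypothetical truncation level standing in for the uniform bound that
    the paper dispenses with.  Purely illustrative, not the paper's procedure.
    """
    rng = np.random.default_rng() if rng is None else rng
    for x in samples:
        # Accept x with probability proportional to the (truncated) likelihood ratio.
        if rng.uniform() <= min(ratio(x), cap) / cap:
            return x
    # If every candidate is rejected, fall back to the last sample; this fallback
    # is one source of the residual total variation error.
    return samples[-1]

# Toy example: mu = N(0, 1), nu = N(0.5, 1), so d(nu)/d(mu)(x) = exp(0.5 * x - 0.125).
rng = np.random.default_rng(0)
draws = rng.standard_normal(100)
out = approx_rejection_sample(draws, lambda x: np.exp(0.5 * x - 0.125), cap=5.0, rng=rng)
```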
Oracle-Efficient Smoothed Online Learning for Piecewise Continuous Decision Making
Block, Adam, Rakhlin, Alexander, Simchowitz, Max
The online learning setting has become the most popular regime for studying sequential decision making with dependent and potentially adversarial data. While this paradigm is attractive due to its great generality and minimal set of assumptions [Cesa-Bianchi and Lugosi, 2006], the worst-case nature of the adversary creates statistical and computational challenges [Rakhlin et al., 2015, Littlestone, 1988, Hazan and Koren, 2016]. In order to mitigate these difficulties, Rakhlin et al. [2011] proposed the smoothed setting, wherein the adversary is constrained to sample data from a distribution whose likelihood ratio with respect to a fixed dominating measure is bounded above by 1/σ, which ensures that the adversary cannot choose worst-case inputs with high probability. As in other online learning settings, performance is measured via regret with respect to a best-in-hindsight comparator [Cesa-Bianchi and Lugosi, 2006]. Recent works have demonstrated strong computational-statistical tradeoffs in smoothed online learning: while there are statistically efficient algorithms that enjoy regret logarithmic in 1/σ, oracle-efficient algorithms, where the learner is assumed to have access to an Empirical Risk Minimization (ERM) oracle that can efficiently optimize functionals on the parameter space, necessarily suffer regret scaling polynomially in 1/σ [Haghtalab et al., 2022a,b, Block et al., 2022]. This gap is significant because, in many applications of interest, the natural scaling of σ is exponential in the ambient problem dimension [Block and Simchowitz, 2022]. A natural question remains: under which types of smoothing is it possible to design oracle-efficient algorithms with regret that scales polynomially in the problem dimension? A partial answer was provided by Block and Simchowitz [2022], who demonstrate an efficient algorithm based on the John Ellipsoid that attains log(T/σ) · poly(dimension) regret for noiseless linear classification, and for a suitable generalization to classification with polynomial features.
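For intuition about the oracle model referenced above, the following Python sketch shows a follow-the-leader style learner that interacts with the hypothesis class only through an ERM oracle (one call per round); the oracle signature, the squared loss, and the toy mean oracle are assumptions made for illustration, not the algorithms of the cited works.

```python
from typing import Any, Callable, List, Tuple

# An ERM oracle maps a dataset of (x, y) pairs to a hypothesis minimizing empirical risk.
ERMOracle = Callable[[List[Tuple[Any, float]]], Callable[[Any], float]]

def follow_the_leader(erm: ERMOracle, stream, default_prediction: float = 0.0):
    """Oracle-efficient online loop: predict, suffer squared loss, make one ERM call."""
    history, losses = [], []
    hypothesis = lambda x: default_prediction
    for x_t, y_t in stream:
        pred = hypothesis(x_t)            # predict with the current ERM solution
        losses.append((pred - y_t) ** 2)  # suffer loss on the revealed label
        history.append((x_t, y_t))
        hypothesis = erm(history)         # single oracle call per round
    return losses

def mean_oracle(data):
    """Toy ERM oracle over constant predictors: returns the empirical-mean predictor."""
    m = sum(y for _, y in data) / len(data)
    return lambda x: m

losses = follow_the_leader(mean_oracle, [(t, float(t % 2)) for t in range(10)])
```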
Smoothed Online Learning for Prediction in Piecewise Affine Systems
Block, Adam, Simchowitz, Max, Tedrake, Russ
The problem of piecewise affine (PWA) regression and planning is of foundational importance to the study of online learning, control, and robotics, where it provides a theoretically and empirically tractable setting to study systems undergoing sharp changes in the dynamics. Unfortunately, due to the discontinuities that arise when crossing into different ``pieces,'' learning in general sequential settings is impossible and practical algorithms are forced to resort to heuristic approaches. This paper builds on the recently developed smoothed online learning framework and provides the first algorithms for prediction and simulation in PWA systems whose regret is polynomial in all relevant problem parameters under a weak smoothness assumption; moreover, our algorithms are efficient in the number of calls to an optimization oracle. We further apply our results to the problems of one-step prediction and multi-step simulation regret in piecewise affine dynamical systems, where the learner is tasked with simulating trajectories and regret is measured in terms of the Wasserstein distance between simulated and true data. Along the way, we develop several technical tools of more general interest.
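For concreteness, a piecewise affine system applies a different affine update in each cell of a partition of the state space; the two-mode example below (with the partition given by the sign of the first coordinate) is a generic illustration and not a system from the paper.

```python
import numpy as np

def pwa_step(x, modes):
    """One step of a piecewise affine system, x_{t+1} = A_i x_t + b_i, where the mode i
    is selected by the region containing x (here: the sign of the first coordinate)."""
    A, b = modes[0] if x[0] >= 0 else modes[1]
    return A @ x + b

# Two illustrative modes; the discontinuity across x[0] = 0 is what breaks standard
# worst-case online learning guarantees.
modes = [(np.array([[0.9, 0.1], [0.0, 0.8]]), np.array([0.1, 0.0])),
         (np.array([[0.5, -0.2], [0.3, 0.7]]), np.array([-0.1, 0.2]))]
trajectory = [np.array([1.0, -0.5])]
for _ in range(10):
    trajectory.append(pwa_step(trajectory[-1], modes))
```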
Smoothed Online Learning is as Easy as Statistical Learning
Block, Adam, Dagan, Yuval, Golowich, Noah, Rakhlin, Alexander
Much of modern learning theory has been split between two regimes: the classical \emph{offline} setting, where data arrive independently, and the \emph{online} setting, where data arrive adversarially. While the former model is often both computationally and statistically tractable, the latter requires no distributional assumptions. In an attempt to achieve the best of both worlds, previous work proposed the smoothed online setting, where each sample is drawn from an adversarially chosen distribution, which is smooth, i.e., it has a bounded density with respect to a fixed dominating measure. We provide tight bounds on the minimax regret of learning a nonparametric function class, with nearly optimal dependence on both the horizon and smoothness parameters. Furthermore, we provide the first oracle-efficient, no-regret algorithms in this setting. In particular, we propose an oracle-efficient improper algorithm whose regret achieves optimal dependence on the horizon and a proper algorithm requiring only a single oracle call per round whose regret has the optimal horizon dependence in the classification setting and is sublinear in general. Both algorithms have exponentially worse dependence on the smoothness parameter of the adversary than the minimax rate. We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smoothed online learning. Finally, we apply our results to the contextual bandit setting to show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits in the case that contexts arrive in a smooth manner.
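To illustrate the kind of reduction described in the final sentence, here is a standard epsilon-greedy contextual bandit loop driven by an offline fitting oracle; the oracle interface, exploration rate, and toy per-action mean oracle are assumptions for illustration and are not the algorithm developed in the paper.

```python
import numpy as np

def epsilon_greedy_bandit(contexts, reward_fn, fit_oracle, n_actions, eps=0.1, seed=0):
    """Oracle-based contextual bandit sketch: explore with probability eps, otherwise
    play the action whose predicted reward (under the fitted model) is largest."""
    rng = np.random.default_rng(seed)
    data, total_reward = [], 0.0
    predict = lambda x, a: 0.0
    for x in contexts:
        if rng.uniform() < eps:
            action = int(rng.integers(n_actions))                        # explore
        else:
            action = max(range(n_actions), key=lambda a: predict(x, a))  # exploit
        r = reward_fn(x, action)
        total_reward += r
        data.append((x, action, r))
        predict = fit_oracle(data)        # one offline oracle call per round
    return total_reward

def per_action_mean_oracle(data):
    """Toy oracle: predict each action's average observed reward, ignoring the context."""
    sums, counts = {}, {}
    for _, a, r in data:
        sums[a] = sums.get(a, 0.0) + r
        counts[a] = counts.get(a, 0) + 1
    return lambda x, a: sums.get(a, 0.0) / counts.get(a, 1)
```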
Intrinsic Dimension Estimation
Block, Adam, Jia, Zeyu, Polyanskiy, Yury, Rakhlin, Alexander
It has long been thought that high-dimensional data encountered in many practical machine learning tasks have low-dimensional structure, i.e., the manifold hypothesis holds. A natural question, thus, is to estimate the intrinsic dimension of a given population distribution from a finite sample. We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees. We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending only on the intrinsic dimension of the data.
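As a point of reference for the estimation task, the snippet below implements the well-known TwoNN estimator (Facco et al., 2017), which fits the ratio of second- to first-nearest-neighbor distances; it is included only to illustrate the problem and is not the estimator introduced in the paper.

```python
import numpy as np

def two_nn_dimension(X):
    """TwoNN-style intrinsic dimension estimate from nearest-neighbor distance ratios."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)           # exclude each point from its own neighbor list
    D.sort(axis=1)
    mu = D[:, 1] / D[:, 0]                # ratio of 2nd to 1st nearest-neighbor distance
    return len(X) / np.sum(np.log(mu))    # MLE of the Pareto exponent governing mu

# Points on a one-dimensional curve embedded in R^3 should yield an estimate near 1.
t = np.random.default_rng(0).uniform(0.0, 1.0, size=500)
X = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), t], axis=1)
print(two_nn_dimension(X))
```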
Majorizing Measures, Sequential Complexities, and Online Learning
Block, Adam, Dagan, Yuval, Rakhlin, Sasha
We introduce the technique of generic chaining and majorizing measures for controlling sequential Rademacher complexity. We relate majorizing measures to the notion of fractional covering numbers, which we show to be dominated in terms of sequential scale-sensitive dimensions in a horizon-independent way, and, under additional complexity assumptions, establish tight control on worst-case sequential Rademacher complexity in terms of the integral of sequential scale-sensitive dimension. Finally, we establish a tight contraction inequality for worst-case sequential Rademacher complexity. The above constitutes the resolution of a number of outstanding open problems in extending the classical theory of empirical processes to the sequential case and, in turn, establishes sharp results for online learning.
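For orientation, the classical i.i.d. counterpart of such chaining bounds is Dudley's entropy integral, which, for a class $\mathcal{F}$ of $[-1,1]$-valued functions, controls empirical Rademacher complexity by an integral of metric entropy (here $\mathcal{N}(\mathcal{F},\delta,\|\cdot\|_{L_2(P_n)})$ is a covering number and the $\epsilon_i$ are Rademacher signs); the paper's contribution concerns sequential analogues of bounds of this shape, which are not reproduced here.
$$
\mathbb{E}_{\epsilon}\,\sup_{f\in\mathcal{F}}\,\frac{1}{n}\sum_{i=1}^{n}\epsilon_i f(x_i)
\;\lesssim\; \inf_{\alpha\ge 0}\left(\alpha + \frac{1}{\sqrt{n}}\int_{\alpha}^{1}\sqrt{\log \mathcal{N}\bigl(\mathcal{F},\delta,\|\cdot\|_{L_2(P_n)}\bigr)}\,d\delta\right).
$$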
Fast Mixing of Multi-Scale Langevin Dynamics under the Manifold Hypothesis
Block, Adam, Mroueh, Youssef, Rakhlin, Alexander, Ross, Jerret
Recently, the task of image generation has attracted much attention. In particular, the recent empirical successes of the Markov Chain Monte Carlo (MCMC) technique of Langevin Dynamics have prompted a number of theoretical advances; despite this, several outstanding problems remain. First, Langevin Dynamics is run in very high dimension on a nonconvex landscape; in the worst case, due to the NP-hardness of nonconvex optimization, it is thought that Langevin Dynamics mixes only in time exponential in the dimension. In this work, we demonstrate how the manifold hypothesis allows for the considerable reduction of mixing time, from exponential in the ambient dimension to depending only on the (much smaller) intrinsic dimension of the data. Second, the high dimension of the sampling space significantly hurts the performance of Langevin Dynamics; we leverage a multi-scale approach to help ameliorate this issue and observe that this multi-resolution algorithm allows for a trade-off between image quality and computational expense in generation.
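For reference, the basic (unadjusted) Langevin update underlying these samplers is sketched below; the score function, step size, and horizon are placeholders, and the paper's multi-scale/multi-resolution schedule is deliberately not reproduced.

```python
import numpy as np

def langevin_dynamics(score, x0, step_size=1e-3, n_steps=1000, seed=0):
    """Unadjusted Langevin algorithm: x_{k+1} = x_k + eta * score(x_k) + sqrt(2 * eta) * noise,
    where `score(x)` is assumed to return grad log p(x)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step_size * score(x) + np.sqrt(2 * step_size) * rng.standard_normal(x.shape)
    return x

# Toy example: the standard Gaussian has score function -x.
sample = langevin_dynamics(lambda x: -x, x0=np.zeros(2))
```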
Generative Modeling with Denoising Auto-Encoders and Langevin Sampling
Block, Adam, Mroueh, Youssef, Rakhlin, Alexander
We study convergence of a generative modeling method that first estimates the score function of the distribution using Denoising Auto-Encoders (DAE) or Denoising Score Matching (DSM) and then employs Langevin diffusion for sampling. We show that both DAE and DSM provide estimates of the score of the Gaussian-smoothed population density, allowing us to apply the machinery of Empirical Processes. We overcome the challenge of relying only on $L^2$ bounds on the score estimation error and provide finite-sample bounds in the Wasserstein distance between the population distribution and the law of this sampling scheme. We then apply our results to the homotopy method of arXiv:1907.05600 and provide theoretical justification for its empirical success.
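As a concrete reminder of the first stage, the snippet below computes the standard denoising score matching loss at a single noise level, regressing a score model onto the score of the Gaussian perturbation kernel; the model interface and the $\sigma^2$ weighting are illustrative conventions, not the paper's specific setup.

```python
import numpy as np

def dsm_loss(score_model, data, sigma, rng):
    """Denoising score matching loss at noise level sigma: perturb each point with
    Gaussian noise and regress score_model onto grad_x log N(x_tilde | x, sigma^2 I)
    = -(x_tilde - x) / sigma^2.  The score model is a placeholder callable."""
    noise = sigma * rng.standard_normal(data.shape)
    perturbed = data + noise
    target = -noise / sigma**2
    residual = score_model(perturbed) - target
    return 0.5 * sigma**2 * np.mean(np.sum(residual**2, axis=-1))
```

Minimizing such a loss over a model class yields an estimate of the score of the Gaussian-smoothed density, which can then be plugged into a Langevin sampler like the one sketched earlier in this list.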