AITopics

1906.06032

Country: North America > United States > California (0.28)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningJun-10-2019

Unlabeled Data Improves Adversarial Robustness

Carmon, Yair, Raghunathan, Aditi, Schmidt, Ludwig, Liang, Percy, Duchi, John C.

The past few years have seen an intense research interest in making models robust to adversarial examples [37]. Yet despite a wide range of proposed defenses, the state-of-the-art in adversarial robustness is far from satisfactory. Recent work points towards sample complexity as a possible reason for the small gains in robustness: Schmidt et al. [35] show that in a simple model, learning a classifier with nontrivial adversarially robust accuracy requires substantially more samples than achieving good "standard" accuracy. Furthermore, recent empirical work obtains promising gains in robustness via transfer learning of a robust classifier from a larger labeled dataset [15]. While both theory and experiments suggest that more training data leads to greater robustness, following this suggestion can be difficult due to the cost of gathering additional data and especially obtaining high-quality labels.

accuracy, deep learning, neural network, (20 more...)

1905.13736

Country: North America > Canada > Ontario > Toronto (0.14)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Machine LearningMar-20-2019

The importance of better models in stochastic optimization

Asi, Hilal, Duchi, John C.

Standard stochastic optimization methods are brittle, sensitive to stepsize choices and other algorithmic parameters, and they exhibit instability outside of well-behaved families of objectives. To address these challenges, we investigate models for stochastic minimization and learning problems that exhibit better robustness to problem families and algorithmic parameters. With appropriately accurate models---which we call the aProx family---stochastic methods can be made stable, provably convergent and asymptotically optimal; even modeling that the objective is nonnegative is sufficient for this stability. We extend these results beyond convexity to weakly convex objectives, which include compositions of convex losses with smooth functions common in modern machine learning applications. We highlight the importance of robustness and accurate modeling with a careful experimental evaluation of convergence time and algorithm sensitivity.

convergence, deep learning, neural network, (20 more...)

1903.08619

Country:

North America > United States > California (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningMar-6-2019

A Rank-1 Sketch for Matrix Multiplicative Weights

Carmon, Yair, Duchi, John C., Sidford, Aaron, Tian, Kevin

We show that a simple randomized sketch of the matrix multiplicative weight (MMW) update enjoys the same regret bounds as MMW, up to a small constant factor. Unlike MMW, where every step requires full matrix exponentiation, our steps require only a single product of the form $e^A b$, which the Lanczos method approximates efficiently. Our key technique is to view the sketch as a randomized mirror projection, and perform mirror descent analysis on the expected projection. Our sketch solves the online eigenvector problem, improving the best known complexity bounds. We also apply this sketch to a simple no-regret scheme for semidefinite programming in saddle-point form, where it matches the best known guarantees.

artificial intelligence, cosh, machine learning, (19 more...)

1903.02675

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

arXiv.org Machine LearningJan-10-2019

Mean Estimation from One-Bit Measurements

Kipnis, Alon, Duchi, John C.

We consider the problem of estimating the mean of a symmetric log-concave distribution under the following constraint: only a single bit per sample from this distribution is available to the estimator. We study the mean squared error (MSE) risk in this estimation as a function of the number of samples, and hence the number of bits, from this distribution. Under an adaptive setting in which each bit is a function of the current sample and the previously observed bits, we show that the optimal relative efficiency compared to the sample mean is the efficiency of the median. For example, in estimating the mean of a normal distribution, a constraint of one bit per sample incurs a penalty of $\pi/2$ in sample size compared to the unconstrained case. We also consider a distributed setting where each one-bit message is only a function of a single sample. We derive lower bounds on the MSE in this setting, and show that the optimal efficiency can only be attained at a finite number of points in the parameter space. Finally, we analyze a distributed setting where the bits are obtained by comparing each sample against a prescribed threshold. Consequently, we consider the threshold density that minimizes the maximal MSE. Our results indicate that estimating the mean from one-bit measurements is equivalent to estimating the sample median from these measurements. In the adaptive case, this estimate can be done with vanishing error for any point in the parameter space. In the distributed case, this estimate can be done with vanishing error only for a finite number of possible values for the unknown mean.

artificial intelligence, estimator, machine learning, (17 more...)

1901.03403

Country: North America > United States (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)

Generalizing to Unseen Domains via Adversarial Data Augmentation

Volpi, Riccardo, Namkoong, Hongseok, Sener, Ozan, Duchi, John C., Murino, Vittorio, Savarese, Silvio

We are concerned with learning models that generalize well to different unseen domains. We consider a worst-case formulation over data distributions that are near the source domain in the feature space. Only using training data from a single source distribution, we propose an iterative procedure that augments the dataset with examples from a fictitious target domain that is "hard" under the current model. We show that our iterative scheme is an adaptive data augmentation method where we append adversarial examples at each iteration. For softmax losses, we show that our method is a data-dependent regularization scheme that behaves differently from classical regularizers that regularize towards zero (e.g., ridge or lasso). On digit recognition and semantic segmentation tasks, our method learns models improve performance across a range of a priori unknown target domains.

artificial intelligence, neural network, target distribution, (17 more...)

Country: North America > United States (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems

Carmon, Yair, Duchi, John C.

We provide convergence rates for Krylov subspace solutions to the trust-region and cubic-regularized (nonconvex) quadratic problems. Such solutions may be efficiently computed by the Lanczos method and have long been used in practice. We prove error bounds of the form $1/t^2$ and $e^{-4t/\sqrt{\kappa}}$, where $\kappa$ is a condition number for the problem, and $t$ is the Krylov subspace order (number of Lanczos iterations). We also provide lower bounds showing that our analysis is sharp.

artificial intelligence, machine learning, optimization, (16 more...)

Country:

Asia > Middle East (0.14)
North America > United States (0.14)
North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation

O', Kelly, Matthew, Sinha, Aman, Namkoong, Hongseok, Tedrake, Russ, Duchi, John C.

While recent developments in autonomous vehicle (AV) technology highlight substantial progress, we lack tools for rigorous and scalable testing. Real-world testing, the de facto evaluation environment, places the public in danger, and, due to the rare nature of accidents, will require billions of miles in order to statistically validate performance claims. We implement a simulation framework that can test an entire modern autonomous driving system, including, in particular, systems that employ deep-learning perception and control algorithms. Using adaptive importance-sampling methods to accelerate rare-event probability evaluation, we estimate the probability of an accident under a base distribution governing standard traffic behavior. We demonstrate our framework on a highway scenario, accelerating system evaluation by 2-20 times over naive Monte Carlo sampling methods and 10-300P times (where P is the number of processors) over real-world testing.

artificial intelligence, cross-entropy method, machine learning, (18 more...)

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.67)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Scalable End-to-End Autonomous Vehicle Testing via Rare-event Simulation

O', Kelly, Matthew, Sinha, Aman, Namkoong, Hongseok, Tedrake, Russ, Duchi, John C.

While recent developments in autonomous vehicle (AV) technology highlight substantial progress, we lack tools for rigorous and scalable testing. Real-world testing, the de facto evaluation environment, places the public in danger, and, due to the rare nature of accidents, will require billions of miles in order to statistically validate performance claims. We implement a simulation framework that can test an entire modern autonomous driving system, including, in particular, systems that employ deep-learning perception and control algorithms. Using adaptive importance-sampling methods to accelerate rare-event probability evaluation, we estimate the probability of an accident under a base distribution governing standard traffic behavior. We demonstrate our framework on a highway scenario, accelerating system evaluation by 2-20 times over naive Monte Carlo sampling methods and 10-300P times (where P is the number of processors) over real-world testing.

cross-entropy method, deep learning, neural network, (23 more...)

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.67)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems

Carmon, Yair, Duchi, John C.

We provide convergence rates for Krylov subspace solutions to the trust-region and cubic-regularized (nonconvex) quadratic problems. Such solutions may be efficiently computed by the Lanczos method and have long been used in practice. We prove error bounds of the form $1/t^2$ and $e^{-4t/\sqrt{\kappa}}$, where $\kappa$ is a condition number for the problem, and $t$ is the Krylov subspace order (number of Lanczos iterations). We also provide lower bounds showing that our analysis is sharp.

artificial intelligence, machine learning, optimization, (18 more...)