AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Towards Fast Computation of Certified Robustness for ReLU Networks

Weng, Tsui-Wei, Zhang, Huan, Chen, Hongge, Song, Zhao, Hsieh, Cho-Jui, Boning, Duane, Dhillon, Inderjit S., Daniel, Luca

arXiv.org Machine LearningApr-25-2018

Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17]. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Current available methods of computing such a bound are either time-consuming or delivering low quality bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms (Fast-Lin and Fast-Lip) that are able to certify non-trivial lower bounds of minimum distortions, by bounding the ReLU units with appropriate linear functions (Fast-Lin), or by bounding the local Lipschitz constant (Fast-Lip). Experiments show that (1) our proposed methods deliver bounds close to (the gap is 2-3X) exact minimum distortion found by Reluplex in small MNIST networks while our algorithms are more than 10,000 times faster; (2) our methods deliver similar quality of bounds (the gap is within 35% and usually around 10%; sometimes our bounds are even better) for larger networks compared to the methods based on solving linear programming problems but our algorithms are 33-14,000 times faster; (3) our method is capable of solving large MNIST and CIFAR networks up to 7 layers with more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that, in fact, there is no polynomial time algorithm that can approximately find the minimum $\ell_1$ adversarial distortion of a ReLU network with a $0.99\ln n$ approximation ratio unless $\mathsf{NP}$=$\mathsf{P}$, where $n$ is the number of neurons in the network.

artificial intelligence, distortion, machine learning, (19 more...)

arXiv.org Machine Learning

1804.09699

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

On The Complexity of Sparse Label Propagation

Jung, Alexander

arXiv.org Machine LearningApr-25-2018

This paper investigates the computational complexity of sparse label propagation which has been proposed recently for processing network structured data. Sparse label propagation amounts to a convex optimization problem and might be considered as an extension of basis pursuit from sparse vectors to network structured datasets. Using a standard first-order oracle model, we characterize the number of iterations for sparse label propagation to achieve a prescribed accuracy. In particular, we derive an upper bound on the number of iterations required to achieve a certain accuracy and show that this upper bound is sharp for datasets having a chain structure (e.g., time series).

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

1804.09597

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.36)

Add feedback

Multi-objective Architecture Search for CNNs

Elsken, Thomas, Metzen, Jan Hendrik, Hutter, Frank

arXiv.org Machine LearningApr-24-2018

Architecture search aims at automatically finding neural architectures that are competitive with architectures designed by human experts. While recent approaches have come close to matching the predictive performance of manually designed architectures for image recognition, these approaches are problematic under constrained resources for two reasons: first, the architecture search itself requires vast computational resources for most proposed methods. Secondly, the found neural architectures are solely optimized for high predictive performance without penalizing excessive resource consumption. We address the first shortcoming by proposing NASH, an architecture search which considerable reduces the computational resources required for training novel architectures by applying network morphisms and aggressive learning rate schedules. On CIFAR10, NASH finds architectures with errors below 4% in only 3 days. We address the second shortcoming by proposing Pareto-NASH, a method for multi-objective architecture search that allows approximating the Pareto-front of architectures under multiple objective, such as predictive performance and number of parameters, in a single run of the method. Within 56 GPU days of architecture search, Pareto-NASH finds a model with 4M parameters and test error of 3.5%, as well as a model with less than 1M parameters and test error of 4.6%.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

arXiv.org Machine Learning

1804.09081

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization

Mokhtari, Aryan, Hassani, Hamed, Karbasi, Amin

arXiv.org Machine LearningApr-24-2018

This paper considers stochastic optimization problems for a large class of objective functions, including convex and continuous submodular. Stochastic proximal gradient methods have been widely used to solve such problems; however, their applicability remains limited when the problem dimension is large and the projection onto a convex set is costly. Instead, stochastic conditional gradient methods are proposed as an alternative solution relying on (i) Approximating gradients via a simple averaging technique requiring a single stochastic gradient evaluation per iteration; (ii) Solving a linear program to compute the descent/ascent direction. The averaging technique reduces the noise of gradient approximations as time progresses, and replacing projection step in proximal methods by a linear program lowers the computational complexity of each iteration. We show that under convexity and smoothness assumptions, our proposed method converges to the optimal objective function value at a sublinear rate of $O(1/t^{1/3})$. Further, for a monotone and continuous DR-submodular function and subject to a general convex body constraint, we prove that our proposed method achieves a $((1-1/e)OPT-\eps)$ guarantee with $O(1/\eps^3)$ stochastic gradient computations. This guarantee matches the known hardness results and closes the gap between deterministic and stochastic continuous submodular maximization. Additionally, we obtain $((1/e)OPT -\eps)$ guarantee after using $O(1/\eps^3)$ stochastic gradients for the case that the objective function is continuous DR-submodular but non-monotone and the constraint set is down-closed. By using stochastic continuous optimization as an interface, we provide the first $(1-1/e)$ tight approximation guarantee for maximizing a monotone but stochastic submodular set function subject to a matroid constraint and $(1/e)$ approximation guarantee for the non-monotone case.

artificial intelligence, machine learning, optimization problem, (14 more...)

arXiv.org Machine Learning

1804.09554

Country:

Europe (1.00)
North America > United States > California (0.92)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Between hard and soft thresholding: optimal iterative thresholding algorithms

Liu, Haoyang, Barber, Rina Foygel

arXiv.org Machine LearningApr-24-2018

Iterative thresholding algorithms seek to optimize a differentiable objective function over a sparsity or rank constraint by alternating between gradient steps that reduce the objective, and thresholding steps that enforce the constraint. This work examines the choice of the thresholding operator, and asks whether it is possible to achieve stronger guarantees than what is possible with hard thresholding. We develop the notion of relative concavity of a thresholding operator, a quantity that characterizes the convergence performance of any thresholding operator on the target optimization problem. Surprisingly, we find that commonly used thresholding operators, such as hard thresholding and soft thresholding, are suboptimal in terms of convergence guarantees. Instead, a general class of thresholding operators, lying between hard thresholding and soft thresholding, is shown to be optimal with the strongest possible convergence guarantee among all thresholding operators. Examples of this general class includes $\ell_q$ thresholding with appropriate choices of $q$, and a newly defined {\em reciprocal thresholding} operator. As a byproduct of the improved convergence guarantee, these new thresholding operators improve on the best known upper bound for prediction error of both iterative hard thresholding and Lasso in terms of the dependence on condition number in the setting of sparse linear regression.

artificial intelligence, machine learning, relative concavity, (18 more...)

arXiv.org Machine Learning

1804.08841

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Simultaneous shot inversion for nonuniform geometries using fast data interpolation

Liu, Michelle, Kumar, Rajiv, Haber, Eldad, Aravkin, Aleksandr

arXiv.org Machine LearningApr-23-2018

Stochastic optimization is key to efficient inversion in PDE-constrained optimization. Using 'simultaneous shots', or random superposition of source terms, works very well in simple acquisition geometries where all sources see all receivers, but this rarely occurs in practice. We develop an approach that interpolates data to an ideal acquisition geometry while solving the inverse problem using simultaneous shots. The approach is formulated as a joint inverse problem, combining ideas from low-rank interpolation with full-waveform inversion. Results using synthetic experiments illustrate the flexibility and efficiency of the approach.

inversion, optimization problem, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

1804.08697

Country:

North America > United States (0.68)
North America > Canada > British Columbia (0.15)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

Add feedback

Optimality of Approximate Inference Algorithms on Stable Instances

Lang, Hunter, Sontag, David, Vijayaraghavan, Aravindan

arXiv.org Artificial IntelligenceApr-23-2018

Approximate algorithms for structured prediction problems---such as LP relaxations and the popular alpha-expansion algorithm (Boykov et al. 2001)---typically far exceed their theoretical performance guarantees on real-world instances. These algorithms often find solutions that are very close to optimal. The goal of this paper is to partially explain the performance of alpha-expansion and an LP relaxation algorithm on MAP inference in Ferromagnetic Potts models (FPMs). Our main results give stability conditions under which these two algorithms provably recover the optimal MAP solution. These theoretical results complement numerous empirical observations of good performance.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1711.02195

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.50)

Add feedback

Discovery of Driving Patterns by Trajectory Segmentation

Moosavi, Sobhan, Nandi, Arnab, Ramnath, Rajiv

arXiv.org Artificial IntelligenceApr-23-2018

Telematics data is becoming increasingly available due to the ubiquity of devices that collect data during drives, for different purposes, such as usage based insurance (UBI), fleet management, navigation of connected vehicles, etc. Consequently, a variety of data-analytic applications have become feasible that extract valuable insights from the data. In this paper, we address the especially challenging problem of discovering behavior-based driving patterns from only externally observable phenomena (e.g. vehicle's speed). We present a trajectory segmentation approach capable of discovering driving patterns as separate segments, based on the behavior of drivers. This segmentation approach includes a novel transformation of trajectories along with a dynamic programming approach for segmentation. We apply the segmentation approach on a real-word, rich dataset of personal car trajectories provided by a major insurance company based in Columbus, Ohio. Analysis and preliminary results show the applicability of approach for finding significant driving patterns.

artificial intelligence, machine learning, trajectory, (15 more...)

arXiv.org Artificial Intelligence

1804.08748

Country: North America > United States > Ohio > Franklin County > Columbus (0.24)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Add feedback

Social Algorithms

Yang, Xin-She

arXiv.org Artificial IntelligenceApr-22-2018

To find solutions to problems commonly used in science and engineering, algorithms are required. An algorithm is a step-by-step computational procedure or a set of rules to be followed by a computer. One of the oldest algorithms is the Euclidean algorithm for finding the greatest common divisor (gcd) of two integers such as 12345 and 125, and this algorithm was first given in detail in Euclid's Elements about 2300 years ago (Chabert 1999). Modern computing involves a large set of different algorithms from fast Fourier transform (FFT) to image processing techniques and from conjugate gradient methods to finite element methods. Optimization problems in particular require specialized optimization techniques, ranging from the simple Newton-Raphson's method to more sophisticated simplex methods for linear programming. Modern trends tend to use a combination of traditional techniques in combination with contemporary stochastic metaheuristic algorithms such as genetic algorithms, firefly algorithm and particle swarm optimization.

artificial intelligence, evolutionary algorithm, machine learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-642-27737-5_678-1

1805.05855

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Local White Matter Architecture Defines Functional Brain Dynamics

Choe, Yo Joong, Balakrishnan, Sivaraman, Singh, Aarti, Vettel, Jean M., Verstynen, Timothy

arXiv.org Machine LearningApr-22-2018

Large bundles of myelinated axons, called white matter, anatomically connect disparate brain regions together and compose the structural core of the human connectome. We recently proposed a method of measuring the local integrity along the length of each white matter fascicle, termed the local connectome. If communication efficiency is fundamentally constrained by the integrity along the entire length of a white matter bundle, then variability in the functional dynamics of brain networks should be associated with variability in the local connectome. We test this prediction using two statistical approaches that are capable of handling the high dimensionality of data. First, by performing statistical inference on distance-based correlations, we show that similarity in the local connectome between individuals is significantly correlated with similarity in their patterns of functional connectivity. Second, by employing variable selection using sparse canonical correlation analysis and cross-validation, we show that segments of the local connectome are predictive of certain patterns of functional brain dynamics. These results are consistent with the hypothesis that structural variability along axon bundles constrains communication between disparate brain regions.

artificial intelligence, correlation, machine learning, (19 more...)

arXiv.org Machine Learning

1804.08154

Country: North America > United States (0.94)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback