AITopics

2008.03707

Country:

North America > United States > New York (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Wind (0.88)
Energy > Energy Storage (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Huang, Feihu, Chen, Songcan, Huang, Heng

Faster Stochastic Alternating Direction Method of Multipliers for Nonconvex Optimization

arXiv.org Machine Learning, Aug-9-2020

In this paper, we propose a faster stochastic alternating direction method of multipliers (ADMM) for nonconvex optimization, called SPIDER-ADMM, which uses a new stochastic path-integrated differential estimator (SPIDER). Moreover, we prove that SPIDER-ADMM achieves a record-breaking incremental first-order oracle (IFO) complexity of $\mathcal{O}(n+n^{1/2}\epsilon^{-1})$ for finding an $\epsilon$-approximate stationary point, which improves on the deterministic ADMM by a factor of $\mathcal{O}(n^{1/2})$, where $n$ denotes the sample size. As one of the major contributions of this paper, we provide a new theoretical analysis framework for nonconvex stochastic ADMM methods that yields the optimal IFO complexity. Based on this new analysis framework, we study the previously unresolved optimal IFO complexity of the existing nonconvex SVRG-ADMM and SAGA-ADMM methods, and prove that they have the optimal IFO complexity of $\mathcal{O}(n+n^{2/3}\epsilon^{-1})$. Thus, SPIDER-ADMM improves on the existing stochastic ADMM methods by a factor of $\mathcal{O}(n^{1/6})$. Moreover, we extend SPIDER-ADMM to the online setting, and propose a faster online SPIDER-ADMM. Our theoretical analysis shows that the online SPIDER-ADMM has an IFO complexity of $\mathcal{O}(\epsilon^{-\frac{3}{2}})$, which improves on the existing best results by a factor of $\mathcal{O}(\epsilon^{-\frac{1}{2}})$. Finally, experimental results on benchmark datasets validate that the proposed algorithms converge faster than the existing ADMM algorithms for nonconvex optimization.
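To make the estimator named in this abstract concrete, here is a minimal sketch of a SPIDER-style recursive gradient estimator. It is an illustration only: the estimator drives a plain gradient step rather than the ADMM updates of the paper, and the toy scalar least-squares losses, the function names, and all parameter values are assumptions chosen for the example.

```python
import random

def grad_i(x, a, b):
    """Gradient of one toy loss 0.5 * (a*x - b)**2 with respect to the scalar x."""
    return a * (a * x - b)

def full_grad(x, data):
    """Average gradient over all samples (costs n IFO calls)."""
    return sum(grad_i(x, a, b) for a, b in data) / len(data)

def spider_gradient_descent(data, x0, steps, q, batch_size, lr, seed=0):
    """Gradient descent driven by a SPIDER-style estimator v (hypothetical helper)."""
    rng = random.Random(seed)
    x_prev, x, v = x0, x0, 0.0
    for t in range(steps):
        if t % q == 0:
            # Every q steps, reset the estimator with an exact full gradient.
            v = full_grad(x, data)
        else:
            # Otherwise, update it with mini-batch gradient *differences*
            # between the current and the previous iterate.
            batch = [rng.choice(data) for _ in range(batch_size)]
            v += sum(grad_i(x, a, b) - grad_i(x_prev, a, b) for a, b in batch) / batch_size
        x_prev, x = x, x - lr * v
    return x

# Toy data: the minimizer of the average loss is sum(a*b) / sum(a*a) = 1.0.
data = [(1.0, 2.0), (2.0, 2.0), (1.0, 0.0)]
x_final = spider_gradient_descent(data, x0=5.0, steps=200, q=4, batch_size=2, lr=0.1)
```

The periodic full-gradient reset keeps the accumulated error of the cheap mini-batch updates bounded. Choosing the period `q` and the batch size is what trades off per-step cost against estimator accuracy.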

artificial intelligence, faster stochastic admm, machine learning

2008.01296

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

#artificialintelligence, Aug-8-2020, 18:46:21 GMT

Bayesian Optimization for Selecting Efficient Machine Learning Models

The performance of many machine learning models depends on their hyper-parameter settings. Bayesian Optimization has become a successful tool for hyper-parameter optimization of machine learning algorithms, which aims to identify optimal hyper-parameters during an iterative sequential process. However, most of the Bayesian Optimization algorithms are designed to select models for effectiveness only and ignore the important issue of model training efficiency. Given that both model effectiveness and training time are important for real-world applications, models selected for effectiveness may not meet the strict training time requirements necessary to deploy in a production environment. In this work, we present a unified Bayesian Optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency. We propose an objective that captures the tradeoff between these two metrics and demonstrate how we can jointly optimize them in a principled Bayesian Optimization framework.

artificial intelligence, bayesian optimization, machine learning

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)

Bravo-Ferreira, Jose F. S., Cowburn, David, Khoo, Yuehaw, Singer, Amit

NMR Assignment through Linear Programming

arXiv.org Artificial Intelligence, Aug-8-2020

Nuclear Magnetic Resonance (NMR) Spectroscopy is the second most used technique (after X-ray crystallography) for structural determination of proteins. A computational challenge in this technique involves solving a discrete optimization problem that assigns the resonance frequency to each atom in the protein. We present a novel linear programming formulation of the problem which gives state-of-the-art results in simulated and experimental datasets.

artificial intelligence, assignment, optimization problem

2008.03641

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Martinez, Aritz D., Del Ser, Javier, Villar-Rodriguez, Esther, Osaba, Eneko, Poyatos, Javier, Tabik, Siham, Molina, Daniel, Herrera, Francisco

Lights and Shadows in Evolutionary Deep Learning: Taxonomy, Critical Methodological Analysis, Cases of Study, Learned Lessons, Recommendations and Challenges

arXiv.org Artificial Intelligence, Aug-8-2020

Much has been said about the fusion of bio-inspired optimization algorithms and Deep Learning models for several purposes: from the discovery of network topologies and hyper-parametric configurations with improved performance for a given task, to the optimization of the model's parameters as a replacement for gradient-based solvers. Indeed, the literature is rich in proposals showcasing the application of assorted nature-inspired approaches for these tasks. In this work we comprehensively review and critically examine contributions made so far based on three axes, each addressing a fundamental question in this research avenue: a) optimization and taxonomy (Why?), including a historical perspective, definitions of optimization problems in Deep Learning, and a taxonomy associated with an in-depth analysis of the literature, b) critical methodological analysis (How?), which, together with two case studies, allows us to address learned lessons and recommendations for good practices following the analysis of the literature, and c) challenges and new directions of research (What can be done, and what for?). In summary, these three axes (optimization and taxonomy, critical analysis, and challenges) outline a complete vision of a merger of two technologies that points to an exciting future for this area of fusion research.

artificial intelligence, machine learning, optimization

2008.0362

Country:

Asia > Singapore (0.04)
North America > United States > Michigan (0.04)
South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Leisure & Entertainment > Games > Computer Games (0.92)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Karimireddy, Sai Praneeth, Jaggi, Martin, Kale, Satyen, Mohri, Mehryar, Reddi, Sashank J., Stich, Sebastian U., Suresh, Ananda Theertha

Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning

arXiv.org Machine Learning, Aug-8-2020

Federated learning is a challenging optimization problem due to the heterogeneity of the data across different clients. Such heterogeneity has been observed to induce client drift and significantly degrade the performance of algorithms designed for this setting. In contrast, centralized learning with centrally collected data does not experience such drift, and has seen great empirical and theoretical progress with innovations such as momentum and adaptivity. In this work, we propose a general framework, Mime, which mitigates client drift and adapts arbitrary centralized optimization algorithms (e.g., SGD, Adam) to federated learning. Mime uses a combination of control variates and server-level statistics (e.g., momentum) at every client-update step to ensure that each local update mimics that of the centralized method. Our thorough theoretical and empirical analyses strongly establish Mime's superiority over other baselines.
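The control-variate idea in this abstract can be illustrated with a small sketch. This is not the Mime algorithm itself: it uses plain SGD only, with no momentum or other server-level statistics and no client sampling. The quadratic client losses and all names are assumptions made for the example. Each client corrects its local gradient with the difference between the server-level average gradient and its own gradient at the server model.

```python
def client_grad(y, b_i):
    """Gradient at y of the toy client loss 0.5 * (y - b_i)**2."""
    return y - b_i

def corrected_grad(y, x, b_i, c):
    """Local gradient at y plus the control variate (c - grad_i(x))."""
    return client_grad(y, b_i) - client_grad(x, b_i) + c

def local_sgd(x, b_i, c, steps, lr, use_correction):
    """Run local SGD steps on one client, starting from the server model x."""
    y = x
    for _ in range(steps):
        g = corrected_grad(y, x, b_i, c) if use_correction else client_grad(y, b_i)
        y -= lr * g
    return y

# Heterogeneous clients: each local optimum differs; the global optimum is their mean, 2.0.
client_optima = [-3.0, 0.0, 9.0]
x = 0.0  # current server model
# Server-level statistic: the average client gradient at x.
c = sum(client_grad(x, b) for b in client_optima) / len(client_optima)

plain = [local_sgd(x, b, c, steps=50, lr=0.1, use_correction=False) for b in client_optima]
corrected = [local_sgd(x, b, c, steps=50, lr=0.1, use_correction=True) for b in client_optima]
```

In this toy setting the corrected local gradient equals the global gradient exactly, so every client follows the centralized trajectory toward 2.0. The uncorrected clients drift toward their own optima instead.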

artificial intelligence, machine learning, momentum

2008.03606

Country: North America > United States > Virginia (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Zhang, Xingwen, Yang, Shuang

Learning (Re-)Starting Solutions for Vehicle Routing Problems

arXiv.org Artificial Intelligence, Aug-7-2020

A key challenge in solving a combinatorial optimization problem is how to guide the agent (i.e., solver) to efficiently explore the enormous search space. Conventional approaches often rely on enumeration (e.g., exhaustive, random, or tabu search) or have to restrict the exploration to rather limited regions (e.g., a single path as in iterative algorithms). In this paper, we show it is possible to use machine learning to speed up the exploration. In particular, a value network is trained to evaluate solution candidates, which provides a useful structure (i.e., an approximate value surface) over the search space; this value network is then used to screen solutions to help a black-box optimization agent initialize or restart so as to navigate through the search space towards desirable solutions. Experiments demonstrate that the proposed "Learn to Restart" algorithm achieves promising results in solving Capacitated Vehicle Routing Problems (CVRPs).

artificial intelligence, initial solution, machine learning

2008.03424

Country: North America > United States > California > Santa Clara County > Sunnyvale (0.05)

Genre: Research Report (0.83)

Industry: Transportation > Freight & Logistics Services (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Sando, Keishi, Hino, Hideitsu

Modal Principal Component Analysis

arXiv.org Machine Learning, Aug-7-2020

Principal component analysis (PCA; Jolliffe (2002)) is one of the most popular methods used to find a low-dimensional subspace in which a given dataset lies. Classical PCA (cPCA) can be formulated as the problem of finding a subspace that minimizes the sum of squared residuals, but squared residuals make PCA vulnerable to outliers. Many PCA algorithms have been proposed to robustify cPCA. The R1-PCA proposed by Ding et al. (2006) replaced the sum of squared residuals in cPCA with the sum of unsquared ones. The optimal solution of R1-PCA has properties similar to those of cPCA: it is given by the eigenvectors of the weighted covariance matrix, and it is rotationally invariant. The absolute residuals can reduce the negative impact of outliers, but an arbitrarily large outlier can still break down the estimate. More recently, Zhang and Lerman (2014) and Lerman et al. (2015) relaxed the optimization problem so that the set of projection matrices is extended to a convex set of matrices, and 1

artificial intelligence, machine learning, outlier

2008.034

Country:

North America > United States > Wisconsin (0.04)
North America > United States > New York (0.04)
Europe > Italy (0.04)
Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Yang, Li, Shami, Abdallah

On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice

arXiv.org Machine Learning, Aug-7-2020

Machine learning algorithms have been used widely in various applications and areas. To adapt a machine learning model to different problems, its hyper-parameters must be tuned. Selecting the best hyper-parameter configuration for machine learning models has a direct impact on the model's performance. It often requires deep knowledge of machine learning algorithms and appropriate hyper-parameter optimization techniques. Although several automatic optimization techniques exist, they have different strengths and drawbacks when applied to different types of problems. In this paper, optimizing the hyper-parameters of common machine learning models is studied. We introduce several state-of-the-art optimization techniques and discuss how to apply them to machine learning algorithms. Many available libraries and frameworks developed for hyper-parameter optimization problems are presented, and some open challenges of hyper-parameter optimization research are also discussed. Moreover, experiments are conducted on benchmark datasets to compare the performance of different optimization methods and provide practical examples of hyper-parameter optimization. This survey paper will help industrial users, data analysts, and researchers to better develop machine learning models by identifying the proper hyper-parameter configurations effectively.
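As a small illustration of the kind of technique such a survey covers, the sketch below contrasts grid search with random search. It is a generic example, not code from the paper: `validation_loss` is a hypothetical stand-in for training a model and scoring it on held-out data, and the hyper-parameter names and ranges are assumptions.

```python
import itertools
import random

def validation_loss(learning_rate, depth):
    """Hypothetical stand-in for 'train the model, then score it on held-out data'."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2

def grid_search(grid):
    """Evaluate every combination in the grid and return the best configuration."""
    names = list(grid)
    candidates = (dict(zip(names, values)) for values in itertools.product(*grid.values()))
    return min(candidates, key=lambda cfg: validation_loss(**cfg))

def random_search(space, n_trials, seed=0):
    """Sample configurations uniformly at random and keep the best one seen."""
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        cfg = {
            "learning_rate": rng.uniform(*space["learning_rate"]),
            "depth": rng.randint(*space["depth"]),
        }
        loss = validation_loss(**cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

best_grid = grid_search({"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]})
best_random, best_random_loss = random_search(
    {"learning_rate": (0.001, 1.0), "depth": (2, 8)}, n_trials=50
)
```

Grid search costs the product of the grid sizes, so it grows quickly with the number of hyper-parameters. Random search fixes the budget up front, which is one reason it is a common baseline.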

artificial intelligence, bayesian inference, machine learning

doi: 10.1016/j.neucom.2020.07.061

2007.15745

Country:

North America > Canada > Ontario > Middlesex County > London (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Education (0.92)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence, Aug-6-2020, 08:06:44 GMT

Hyperparameter Tuning

Probabilistic models are specified by unknown quantities, called parameters. These are adjusted using an optimization technique so that the model captures the pattern in the training sample as well as possible. Put simply, parameters are estimated by the algorithm, and the user has little or no control over them. In a simple linear regression, the model parameters are the betas (β). BAM!!! In statistics jargon, parameters are defined as population characteristics.
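To see what "estimated by the algorithm" means for the betas, here is a minimal sketch of ordinary least squares for simple linear regression. The function name and the toy data are assumptions for illustration. Note that the user supplies only data and sets neither beta directly.

```python
def fit_simple_linear_regression(xs, ys):
    """Closed-form least-squares estimates (beta0, beta1) for y = beta0 + beta1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # Intercept: forces the fitted line through the point of means.
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Toy data generated from y = 1 + 2x, so the estimates should recover those values.
beta0, beta1 = fit_simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
```

Anything the user does choose before fitting, such as which features to include or a regularization strength, is a hyperparameter rather than a parameter.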

artificial intelligence, machine learning, optimization problem

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.37)