AITopics

2104.11283

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Shimmura, Ryosuke, Suzuki, Joe

Converting ADMM to a Proximal Gradient for Convex Optimization Problems

arXiv.org Machine Learning Apr-22-2021

In machine learning and data science, computational efficiency is often a concern when solving problems. In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem. ADMM takes time because it involves matrix inversion, whereas efficient methods such as FISTA (fast iterative shrinkage-thresholding algorithm) have been developed for the proximal gradient method. This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that the constraints and objectives are strongly convex. Then, we apply it to sparse estimation problems, such as sparse convex clustering and trend filtering, and we show by numerical experiments that we can obtain a significant improvement in terms of efficiency.
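As a point of reference for the FISTA method named in this abstract, here is a minimal sketch of FISTA on a plain lasso problem. It is not the paper's ADMM-to-proximal-gradient conversion; the problem form, the function names `soft_threshold` and `fista_lasso`, and the fixed iteration count are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, b, lam, n_iter=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with FISTA.

    Illustrative only: fixed step 1/L and a fixed number of iterations.
    """
    # Lipschitz constant of the gradient of the smooth term (squared spectral norm).
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)                      # gradient at the extrapolated point
        x_new = soft_threshold(y - grad / L, lam / L)  # proximal gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum (extrapolation) step
        x, t = x_new, t_new
    return x
```

Each iteration needs only matrix-vector products and an elementwise shrinkage, with no linear system to solve, which is the efficiency contrast with ADMM that the abstract draws.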

calculation time, convex, gradient method, (13 more...)

2104.10911

Country: Europe > France (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Tsukurimichi, Toshiaki, Inatsu, Yu, Duy, Vo Nguyen Le, Takeuchi, Ichiro

Conditional Selective Inference for Robust Regression and Outlier Detection using Piecewise-Linear Homotopy Continuation

arXiv.org Machine Learning Apr-21-2021

In practical data analysis in noisy environments, it is common to first use robust methods to identify outliers, and then to conduct further analysis after removing the outliers. In this paper, we consider statistical inference on the model estimated after outliers are removed, which can be interpreted as a selective inference (SI) problem. To use the conditional SI framework, it is necessary to characterize the events by which the robust method identifies outliers. Unfortunately, existing methods cannot be directly used here because they apply only when the selection events can be represented by linear or quadratic constraints. In this paper, we propose a conditional SI method for popular robust regressions using the homotopy method. We show that the proposed conditional SI method is applicable to a wide class of robust regression and outlier detection methods and has good empirical performance in both synthetic and real data experiments.

inference, outlier, regression, (14 more...)

2104.10840

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Schellekens, Vincent, Jacques, Laurent

Asymmetric compressive learning guarantees with applications to quantized sketches

arXiv.org Machine Learning Apr-20-2021

The compressive learning framework reduces the computational cost of training on large-scale datasets. In a sketching phase, the data is first compressed to a lightweight sketch vector, obtained by mapping the data samples through a well-chosen feature map, and averaging those contributions. In a learning phase, the desired model parameters are then extracted from this sketch by solving an optimization problem, which also involves a feature map. When the feature map is identical during the sketching and learning phases, formal statistical guarantees (excess risk bounds) have been proven. However, the desirable properties of the feature map are different during sketching and learning (e.g. quantized outputs, and differentiability, respectively). We thus study the relaxation where this map is allowed to be different for each phase. First, we prove that the existing guarantees carry over to this asymmetric scheme, up to a controlled error term, provided some Limited Projected Distortion (LPD) property holds. We then instantiate this framework to the setting of quantized sketches, by proving that the LPD indeed holds for binary sketch contributions. Finally, we further validate the approach with numerical simulations, including a large-scale application in audio event classification.
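The sketching phase described in this abstract can be illustrated as follows. This is a hedged sketch, not the authors' implementation: the random cosine feature map, the Gaussian frequencies, the uniform dither, and the names `draw_feature_params`, `smooth_sketch`, and `quantized_sketch` are all assumptions made for the example.

```python
import numpy as np

def draw_feature_params(dim, m, scale=1.0, seed=0):
    """Draw m random frequencies and dithers (assumed Gaussian / uniform)."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(scale=scale, size=(dim, m))
    xi = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return omega, xi

def smooth_sketch(X, omega, xi):
    """Average a differentiable cosine feature map over the rows of X."""
    return np.cos(X @ omega + xi).mean(axis=0)

def quantized_sketch(X, omega, xi):
    """Average binary contributions: each sample adds +1 or -1 per feature."""
    contrib = np.where(np.cos(X @ omega + xi) >= 0.0, 1.0, -1.0)
    return contrib.mean(axis=0)
```

In the asymmetric setting the abstract studies, the dataset would be compressed with something like `quantized_sketch`, while the learning phase would optimize against a differentiable map such as the one in `smooth_sketch`. Both functions return a vector of length `m` regardless of how many samples are in `X`.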

excess risk, rff, sketch, (16 more...)

2104.10061

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Belgium > Wallonia > Walloon Brabant > Louvain-la-Neuve (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

arXiv.org Artificial Intelligence Apr-20-2021

Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020

Turner, Ryan, Eriksson, David, McCourt, Michael, Kiili, Juha, Laaksonen, Eero, Xu, Zhen, Guyon, Isabelle

This paper presents the results and insights from the black-box optimization (BBO) challenge at NeurIPS 2020, which ran from July to October 2020. The challenge emphasized the importance of evaluating derivative-free optimizers for tuning the hyperparameters of machine learning models. This was the first black-box optimization challenge with a machine learning emphasis. It was based on tuning (validation set) performance of standard machine learning models on real datasets. This competition has widespread impact, as black-box optimization (e.g., Bayesian optimization) is relevant for hyperparameter tuning in almost every machine learning project as well as many applications outside of machine learning. The final leaderboard was determined using the optimization performance on held-out (hidden) objective functions, where the optimizers ran without human intervention. Baselines were set using the default settings of several open-source black-box optimization packages as well as random search.

algorithm, optimization, participant, (16 more...)

2104.10201

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France (0.04)
Asia > China (0.04)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.68)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Artificial Intelligence Apr-20-2021

Demystify Optimization Challenges in Multilingual Transformers

Li, Xian, Gong, Hongyu

Multilingual Transformer improves parameter efficiency and crosslingual transfer. How to effectively train multilingual models has not been well studied. Using multilingual machine translation as a testbed, we study optimization challenges from loss landscape and parameter plasticity perspectives. We found that imbalanced training data causes task interference between high- and low-resource languages, characterized by nearly orthogonal gradients for major parameters and an optimization trajectory that is mostly dominated by high-resource languages. We show that the local curvature of the loss surface affects the degree of interference, and that existing heuristics of data subsampling implicitly reduce the sharpness, although they still face a trade-off between high- and low-resource languages. We propose a principled multi-objective optimization algorithm, Curvature Aware Task Scaling (CATS), which improves both optimization and generalization, especially for low-resource languages. Experiments on TED, WMT and OPUS-100 benchmarks demonstrate that CATS advances the Pareto front of accuracy while being efficient to apply to massive multilingual settings at the scale of 100 languages.

arxiv preprint arxiv, demystify optimization challenge, translation, (12 more...)

2104.07639

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence Apr-19-2021

Randomized Algorithms for Scientific Computing (RASC)

Buluc, Aydin, Kolda, Tamara G., Wild, Stefan M., Anitescu, Mihai, DeGennaro, Anthony, Jakeman, John, Kamath, Chandrika, Kannan, Ramakrishnan, Lopes, Miles E., Martinsson, Per-Gunnar, Myers, Kary, Nelson, Jelani, Restrepo, Juan M., Seshadhri, C., Vrabie, Draguna, Wohlberg, Brendt, Wright, Stephen J., Yang, Chao, Zwart, Peter

Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of a workshop, "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.

algorithm, game theory, renewable energy, (29 more...)

2104.11079

Country:

North America > United States > Colorado (0.28)
North America > Canada (0.27)
North America > United States > New Mexico (0.15)
(14 more...)

Genre:

Instructional Material > Course Syllabus & Notes (1.00)
Overview (0.92)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable (1.00)
(4 more...)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Sensing and Signal Processing (1.00)
(13 more...)

Neiswanger, Willie, Wang, Ke Alexander, Ermon, Stefano

Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information

arXiv.org Artificial Intelligence Apr-19-2021

In many real-world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations. One example is budget-constrained global optimization of f, for which Bayesian optimization is a popular method. Other properties of interest include local optima, level sets, integrals, or graph-structured information induced by f. Often, we can find an algorithm A to compute the desired property, but it may require far more than T queries to execute. Given such an A, and a prior distribution over f, we refer to the problem of inferring the output of A using T evaluations as Bayesian Algorithm Execution (BAX). To tackle this problem, we present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output. Applying this to Dijkstra's algorithm, for instance, we infer shortest paths in synthetic and real-world graphs with black-box edge costs. Using evolution strategies, we obtain variants of Bayesian optimization that target local, rather than global, optima. On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm. Our method is closely connected to other Bayesian optimal experimental design procedures such as entropy search methods and optimal sensor placement using Gaussian processes.

acquisition function, algorithm, optimization, (16 more...)

2104.09460

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.84)

Industry: Transportation > Air (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning Apr-18-2021

Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max Optimization

Li, Haochuan, Tian, Yi, Zhang, Jingzhao, Jadbabaie, Ali

We provide a first-order oracle complexity lower bound for finding stationary points of min-max optimization problems where the objective function is smooth, nonconvex in the minimization variable, and strongly concave in the maximization variable. We establish a lower bound of $\Omega\left(\sqrt{\kappa}\epsilon^{-2}\right)$ for deterministic oracles, where $\epsilon$ defines the level of approximate stationarity and $\kappa$ is the condition number. Our analysis shows that the upper bound achieved in (Lin et al., 2020b) is optimal in the $\epsilon$ and $\kappa$ dependence up to logarithmic factors. For stochastic oracles, we provide a lower bound of $\Omega\left(\sqrt{\kappa}\epsilon^{-2} + \kappa^{1/3}\epsilon^{-4}\right)$. It suggests that there is a significant gap between the upper bound $\mathcal{O}(\kappa^3 \epsilon^{-4})$ in (Lin et al., 2020a) and our lower bound in the condition number dependence.

algorithm, arxiv preprint arxiv, optimization, (14 more...)

2104.08708

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

arXiv.org Artificial Intelligence Apr-18-2021

A Rank based Adaptive Mutation in Genetic Algorithm

Basak, Avijit

Traditionally, genetic algorithms (GAs) have been used for optimization of unimodal and multimodal functions. Earlier researchers worked with constant probabilities for GA control operators such as crossover and mutation when tuning the optimization in specific domains. Recent advances in this field have introduced adaptive approaches to probability determination. In adaptive mutation, poor individuals are primarily used to explore the state space, so the mutation probability of a chromosome is usually generated in proportion to the difference between the fitness of the best chromosome and its own fitness (fMAX - f). However, this approach is sensitive to the nature of the fitness distribution during optimization. This paper presents an alternative approach that generates the mutation probability from chromosome rank to avoid any sensitivity to the fitness distribution. Experiments compare a simple genetic algorithm (SGA) with constant mutation probability against the adaptive approaches under a limited resource constraint for unimodal and multimodal functions and the Travelling Salesman Problem (TSP). Measurements are taken of average best fitness, number of generations evolved, and percentage of trials that reach the global optimum. The results demonstrate that the rank-based adaptive mutation approach is superior to the fitness-based adaptive approach as well as SGA in a multimodal problem space.
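The contrast between fitness-based and rank-based mutation probabilities can be sketched as below. The abstract does not give the paper's formulas, so the linear interpolation between `p_min` and `p_max`, the normalization by the fitness range, and both function names are assumptions made purely for illustration (higher fitness is taken to be better).

```python
def rank_based_mutation_probs(fitness, p_min=0.01, p_max=0.5):
    """Map each individual's fitness rank to a mutation probability.

    The best individual (rank 0) gets p_min and the worst gets p_max.
    The linear interpolation is an assumption, not the paper's formula.
    """
    n = len(fitness)
    if n == 1:
        return [p_min]
    # Indices sorted from best (highest fitness) to worst.
    order = sorted(range(n), key=lambda i: fitness[i], reverse=True)
    probs = [0.0] * n
    for rank, i in enumerate(order):
        probs[i] = p_min + (p_max - p_min) * rank / (n - 1)
    return probs

def fitness_based_mutation_probs(fitness, p_max=0.5):
    """Baseline: probability proportional to (f_max - f).

    Dividing by the fitness range is an assumed normalization.
    """
    f_max, f_min = max(fitness), min(fitness)
    span = f_max - f_min
    if span == 0:
        return [0.0] * len(fitness)
    return [p_max * (f_max - f) / span for f in fitness]
```

For a population with fitness values `[10.0, 9.9, 9.8, 0.0]`, the fitness-based version assigns the three good individuals almost identical, near-zero probabilities because one outlier stretches the range, whereas the rank-based version spreads probabilities evenly by position and is unchanged by any order-preserving rescaling of fitness.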

fitness, mutation probability, probability, (9 more...)

doi: 10.5120/ijca2020920572

2104.08842

Country:

North America > United States > Michigan (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Maryland (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)