AITopics | Optimization

Stochastic Optimization of Area Under Precision-Recall Curve for Deep Learning with Provable Convergence

Qi, Qi, Luo, Youzhi, Xu, Zhao, Ji, Shuiwang, Yang, Tianbao

arXiv.org Artificial Intelligence, Apr-18-2021

Areas under ROC (AUROC) and precision-recall curves (AUPRC) are common metrics for evaluating classification performance for imbalanced problems. Compared with AUROC, AUPRC is a more appropriate metric for highly imbalanced datasets. While direct optimization of AUROC has been studied extensively, optimization of AUPRC has been rarely explored. In this work, we propose a principled technical method to optimize AUPRC for deep learning. Our approach is based on maximizing the average precision (AP), which is an unbiased point estimator of AUPRC. We show that the surrogate loss function for AP is highly non-convex and more complicated than that of AUROC. We cast the objective into a sum of dependent compositional functions with inner functions dependent on random variables of the outer level. We propose efficient adaptive and non-adaptive stochastic algorithms with provable convergence guarantees under mild conditions by using recent advances in stochastic compositional optimization. Extensive experimental results on graph and image datasets demonstrate that our proposed method outperforms prior methods on imbalanced problems. To the best of our knowledge, our work represents the first attempt to optimize AUPRC with provable convergence.
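
To make the quantity in this abstract concrete, here is a minimal sketch of the standard (non-differentiable) average precision: the mean, over positive examples, of the precision at each positive's rank. It is not the paper's surrogate loss or its stochastic algorithm. The function name `average_precision` is hypothetical, and the sketch assumes binary labels in {0, 1} and no tied scores.

```python
def average_precision(scores, labels):
    """Average precision for binary labels (1 = positive, 0 = negative).

    Illustrative only: assumes no tied scores; this is the plain ranking
    metric, not the surrogate objective optimized in the paper.
    """
    # Rank examples by descending score and keep only their labels.
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda p: -p[0])]
    n_pos = sum(ranked)
    if n_pos == 0:
        raise ValueError("AP is undefined without positive examples")

    hits = 0      # positives seen so far
    total = 0.0   # running sum of precision at each positive's rank
    for rank, y in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


if __name__ == "__main__":
    # Positives sit at ranks 1 and 3, so AP = (1/1 + 2/3) / 2.
    print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Because the rank of each positive depends on every other score in the dataset, the metric does not decompose into a sum of independent per-example losses, which is consistent with the abstract's remark that the objective is harder to handle than that of AUROC.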

dataset, optimization, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2104.08736

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > Iowa > Johnson County > Iowa City (0.14)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries

Bhagoji, Arjun Nitin, Cullina, Daniel, Sehwag, Vikash, Mittal, Prateek

arXiv.org Artificial Intelligence, Apr-16-2021

Understanding the fundamental limits of robust supervised learning has emerged as a problem of immense interest, from both practical and theoretical standpoints. In particular, it is critical to determine classifier-agnostic bounds on the training loss to establish when learning is possible. In this paper, we determine optimal lower bounds on the cross-entropy loss in the presence of test-time adversaries, along with the corresponding optimal classification outputs. Our formulation of the bound as a solution to an optimization problem is general enough to encompass any loss function depending on soft classifier outputs. We also propose and provide a proof of correctness for a bespoke algorithm to compute this lower bound efficiently, allowing us to determine lower bounds for multiple practical datasets of interest. We use our lower bounds as a diagnostic tool to determine the effectiveness of current robust training methods and find a gap from optimality at larger budgets. Finally, we investigate the possibility of using optimal classification outputs as soft labels to empirically improve robust training.

adversary, cross-entropy loss, probability, (12 more...)

arXiv.org Artificial Intelligence

2104.08382

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Overfitting in Bayesian Optimization: an empirical study and early-stopping solution

Makarova, Anastasia, Shen, Huibin, Perrone, Valerio, Klein, Aaron, Faddoul, Jean Baptiste, Krause, Andreas, Seeger, Matthias, Archambeau, Cedric

arXiv.org Artificial Intelligence, Apr-16-2021

Bayesian Optimization (BO) is a successful methodology to tune the hyperparameters of machine learning algorithms. The user defines a metric of interest, such as the validation error, and BO finds the optimal hyperparameters that minimize it. However, the metric improvements on the validation set may not translate to the test set, especially on small datasets. In other words, BO can overfit. While cross-validation mitigates this, it comes with high computational cost. In this paper, we carry out the first systematic investigation of overfitting in BO and demonstrate that this is a serious yet often overlooked concern in practice. We propose the first problem-adaptive and interpretable criterion to early stop BO, reducing overfitting while mitigating the cost of cross-validation. Experimental results on real-world hyperparameter optimization tasks show that our approach can substantially reduce compute time with little to no loss of test accuracy, demonstrating a clear practical advantage over existing techniques.

experiment, hyperparameter, variance, (13 more...)

arXiv.org Artificial Intelligence

2104.08166

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Montana (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

A Novel Surrogate-assisted Evolutionary Algorithm Applied to Partition-based Ensemble Learning

Dushatskiy, Arkadiy, Alderliesten, Tanja, Bosman, Peter A. N.

arXiv.org Artificial Intelligence, Apr-16-2021

We propose a novel surrogate-assisted Evolutionary Algorithm for solving expensive combinatorial optimization problems. We integrate a surrogate model, which is used for fitness value estimation, into a state-of-the-art P3-like variant of the Gene-Pool Optimal Mixing Algorithm (GOMEA) and adapt the resulting algorithm for solving non-binary combinatorial problems. We test the proposed algorithm on an ensemble learning problem. Ensembling several models is a common Machine Learning technique to achieve better performance. We consider ensembles of several models trained on disjoint subsets of a dataset. Finding the best dataset partitioning is naturally a combinatorial non-binary optimization problem. Fitness function evaluations can be extremely expensive if complex models, such as Deep Neural Networks, are used as learners in an ensemble. Therefore, the number of fitness function evaluations is typically limited, necessitating expensive optimization techniques. In our experiments we use five classification datasets from the OpenML-CC18 benchmark and Support-vector Machines as learners in an ensemble. The proposed algorithm demonstrates better performance than alternative approaches, including Bayesian optimization algorithms. It manages to find better solutions using just several thousand fitness function evaluations for an ensemble learning problem with up to 500 variables.

algorithm, evaluation, surrogate model, (13 more...)

arXiv.org Artificial Intelligence

2104.08048

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (1.00)
Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
(2 more...)

Grouped Variable Selection with Discrete Optimization: Computational and Statistical Perspectives

Hazimeh, Hussein, Mazumder, Rahul, Radchenko, Peter

arXiv.org Machine Learning, Apr-14-2021

We present a new algorithmic framework for grouped variable selection that is based on discrete mathematical optimization. While there exist several appealing approaches based on convex relaxations and nonconvex heuristics, we focus on optimal solutions for the $\ell_0$-regularized formulation, a problem that is relatively unexplored due to computational challenges. Our methodology covers both high-dimensional linear regression and nonparametric sparse additive modeling with smooth components. Our algorithmic framework consists of approximate and exact algorithms. The approximate algorithms are based on coordinate descent and local search, with runtimes comparable to popular sparse learning algorithms. Our exact algorithm is based on a standalone branch-and-bound (BnB) framework, which can solve the associated mixed integer programming (MIP) problem to certified optimality. By exploiting the problem structure, our custom BnB algorithm can solve to optimality problem instances with $5 \times 10^6$ features in minutes to hours -- over $1000$ times larger than what is currently possible using state-of-the-art commercial MIP solvers. We also explore statistical properties of the $\ell_0$-based estimators. We demonstrate, theoretically and empirically, that our proposed estimators have an edge over popular group-sparse estimators in terms of statistical performance in various regimes.

algorithm, formulation, pen gr, (16 more...)

arXiv.org Machine Learning

2104.07084

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre:

Overview (0.92)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Mean-Squared Accuracy of Good-Turing Estimator

Skorski, Maciej

arXiv.org Machine Learning, Apr-14-2021

The brilliant method due to Good and Turing allows for estimating objects not occurring in a sample. The problem, known under the names "sample coverage" or "missing mass", goes back to their cryptographic work during WWII, but over the years has found many applications, including language modeling, inference in ecology, and estimation of distribution properties. This work characterizes the maximal mean-squared error of the Good-Turing estimator, for any sample and alphabet size.
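
As a small illustration of the estimator discussed above, the classical Good-Turing estimate of the missing mass (the total probability of symbols that never appear in the sample) is the fraction of the sample made up of symbols seen exactly once. The sketch below computes only that estimate. It does not reproduce the paper's mean-squared error analysis, and the function name `good_turing_missing_mass` is hypothetical.

```python
from collections import Counter


def good_turing_missing_mass(sample):
    """Good-Turing estimate of the missing mass: (# symbols seen once) / n."""
    n = len(sample)
    if n == 0:
        raise ValueError("the estimate needs a non-empty sample")
    counts = Counter(sample)
    # Count the distinct symbols that occur exactly once in the sample.
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n


if __name__ == "__main__":
    # In "abracadabra" only 'c' and 'd' occur once, out of 11 letters.
    print(good_turing_missing_mass(list("abracadabra")))
```

The estimate needs neither the alphabet size nor the true distribution, which is what makes it usable when the sample covers only part of a large vocabulary.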

application, constraint, estimation, (14 more...)

arXiv.org Machine Learning

doi: 10.13140/RG.2.2.31351.44960/1

2104.07029

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

When Non-Elitism Meets Time-Linkage Problems

Zheng, Weijie, Zhang, Qiaozhi, Chen, Huanhuan, Yao, Xin

arXiv.org Artificial Intelligence, Apr-14-2021

Many real-world applications have the time-linkage property, and the only theoretical analysis was recently given by Zheng et al. (TEVC 2021) on their proposed time-linkage OneMax problem, OneMax$_{(0,1^n)}$. However, only two elitist algorithms, (1+1)EA and ($\mu$+1)EA, are analyzed, and it is unknown whether the non-elitism mechanism could help to escape the local optima existing in OneMax$_{(0,1^n)}$. In general, there are few theoretical results on the benefits of non-elitism in evolutionary algorithms. In this work, we analyze the influence of non-elitism by comparing the performance of the elitist (1+$\lambda$)EA and its non-elitist counterpart (1,$\lambda$)EA. We prove that with probability $1-o(1)$, (1+$\lambda$)EA will get stuck in the local optima and cannot find the global optimum, but with probability $1$, (1,$\lambda$)EA can reach the global optimum and its expected runtime is $O(n^{3+c}\log n)$ with $\lambda=c \log_{\frac{e}{e-1}} n$ for the constant $c\ge 1$. Noting that a smaller offspring size is helpful for escaping from the local optima, we further resort to the compact genetic algorithm where only two individuals are sampled to update the probabilistic model, and prove its expected runtime of $O(n^3\log n)$. Our computational experiments also verify the efficiency of the two non-elitist algorithms.
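
The difference between the two selection schemes compared in this abstract can be sketched in a few lines. The example below is an assumption-laden illustration, not the paper's setup: it uses the plain OneMax fitness (number of one-bits) rather than the time-linkage OneMax$_{(0,1^n)}$, whose definition is not given here, and standard bit mutation with rate 1/n. The names `run_ea`, `mutate`, and `onemax` are hypothetical.

```python
import random


def onemax(bits):
    """Plain OneMax fitness: the number of one-bits (NOT the time-linkage variant)."""
    return sum(bits)


def mutate(bits, rng):
    """Standard bit mutation: flip each bit independently with probability 1/n."""
    n = len(bits)
    return [b ^ 1 if rng.random() < 1.0 / n else b for b in bits]


def run_ea(n, lam, elitist, generations, seed=0):
    """Run a (1+lam) EA if elitist is True, otherwise a (1,lam) EA.

    Returns the parent's fitness after each generation.
    """
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    history = [onemax(parent)]
    for _ in range(generations):
        offspring = [mutate(parent, rng) for _ in range(lam)]
        best = max(offspring, key=onemax)
        if elitist:
            # Plus selection: the parent survives unless an offspring is at least as fit.
            if onemax(best) >= onemax(parent):
                parent = best
        else:
            # Comma selection: the best offspring always replaces the parent,
            # even when it is worse, so fitness can decrease.
            parent = best
        history.append(onemax(parent))
    return history


if __name__ == "__main__":
    print("(1+lam):", run_ea(20, 8, True, 50)[-1])
    print("(1,lam):", run_ea(20, 8, False, 50)[-1])
```

The only difference between the two variants is the replacement rule. Under plus selection the fitness trace never decreases, which is the property that can trap an elitist algorithm in a local optimum, while comma selection can accept a worse parent and move on.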

algorithm, onemax, probability, (14 more...)

arXiv.org Artificial Intelligence

2104.06831

Country:

North America > Costa Rica > Heredia Province > Heredia (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Muesli: Combining Improvements in Policy Optimization

Hessel, Matteo, Danihelka, Ivo, Viola, Fabio, Guez, Arthur, Schmitt, Simon, Sifre, Laurent, Weber, Theophane, Silver, David, van Hasselt, Hado

arXiv.org Artificial Intelligence, Apr-13-2021

We propose a novel policy update that combines regularized policy optimization with model learning as an auxiliary loss. The update (henceforth Muesli) matches MuZero's state-of-the-art performance on Atari. Notably, Muesli does so without using deep search: it acts directly with a policy network and has computation speed comparable to model-free baselines. The Atari results are complemented by extensive ablations, and by additional results on continuous control and 9x9 Go.

combining improvement, muesli, optimization, (13 more...)

arXiv.org Artificial Intelligence

2104.06159

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(4 more...)

AutoOED: Automated Optimal Experiment Design Platform

Tian, Yunsheng, Luković, Mina Konaković, Erps, Timothy, Foshey, Michael, Matusik, Wojciech

arXiv.org Artificial Intelligence, Apr-13-2021

We present AutoOED, an Optimal Experiment Design platform powered by automated machine learning to accelerate the discovery of optimal solutions. The platform solves multi-objective optimization problems in a time- and data-efficient manner by automatically guiding the design of experiments to be evaluated. To automate the optimization process, we implement several multi-objective Bayesian optimization algorithms with state-of-the-art performance. AutoOED is open-source and written in Python. The codebase is modular, which makes it easy to extend and tailor, and serves as a testbed for machine learning researchers to develop and evaluate their own multi-objective Bayesian optimization algorithms. An intuitive graphical user interface (GUI) is provided to visualize and guide the experiments for users with little or no experience with coding, machine learning, or optimization. Furthermore, a distributed system is integrated to enable parallelized experimental evaluations by independent workers in remote locations. The platform is available at https://autooed.org.

autooed, experiment, optimization, (12 more...)

arXiv.org Artificial Intelligence

2104.05959

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.16)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.47)
Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Data-driven Design of Context-aware Monitors for Hazard Prediction in Artificial Pancreas Systems

Zhou, Xugui, Ahmed, Bulbul, Aylor, James H., Asare, Philip, Alemzadeh, Homa

arXiv.org Artificial Intelligence, Apr-13-2021

Medical Cyber-physical Systems (MCPS) are vulnerable to accidental or malicious faults that can target their controllers and cause safety hazards and harm to patients. This paper proposes a combined model and data-driven approach for designing context-aware monitors that can detect early signs of hazards and mitigate them in MCPS. We present a framework for formal specification of unsafe system context using Signal Temporal Logic (STL), combined with an optimization method for patient-specific refinement of STL formulas based on real or simulated faulty data from the closed-loop system, for the generation of monitor logic. We evaluate our approach in simulation using two state-of-the-art closed-loop Artificial Pancreas Systems (APS). The results show that the context-aware monitor achieves up to a 1.4 times increase in average hazard prediction accuracy (F1-score) over several baseline monitors, reduces false-positive and false-negative rates, and enables hazard mitigation with a 54% success rate while decreasing the average risk for patients.

cawt monitor, controller, hazard, (16 more...)

arXiv.org Artificial Intelligence

2104.02545

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > United States > Florida > Alachua County > Gainesville (0.14)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
