AITopics | Optimization

Collaborating Authors

Optimization

News Overviews Instructional Materials AI-Alerts Classics

Optimizing Ensemble Weights and Hyperparameters of Machine Learning Models for Regression Problems

Shahhosseini, Mohsen, Hu, Guiping, Pham, Hieu

arXiv.org Machine LearningAug-20-2019

Aggregating multiple learners through an ensemble of models aims to make better predictions by capturing the underlying distribution more accurately. Different ensembling methods, such as bagging, boosting and stacking/blending, have been studied and adopted extensively in research and practice. While bagging and boosting intend to reduce variance and bias, respectively, blending approaches target both by finding the optimal way to combine base learners to find the best trade-off between bias and variance. In blending, ensembles are created from weighted averages of multiple base learners. In this study, a systematic approach is proposed to find the optimal weights to create these ensembles for bias-variance tradeoff using cross-validation for regression problems (Cross-validated Optimal Weighted Ensemble (COWE)). Furthermore, it is known that tuning hyperparameters of each base learner inside the ensemble weight optimization process can produce better performing ensembles. To this end, a nested algorithm based on bi-level optimization that considers tuning hyperparameters as well as finding the optimal weights to combine ensembles (Cross-validated Optimal Weighted Ensemble with Internally Tuned Hyperparameters (COWE-ITH)) was proposed. The algorithm is shown to be generalizable to real data sets though analyses with ten publicly available data sets. The prediction accuracies of COWE-ITH and COWE have been compared to base learners and the state-of-art ensemble methods. The results show that COWE-ITH outperforms other benchmarks as well as base learners in 9 out of 10 data sets.

artificial intelligence, ensemble, machine learning, (16 more...)

arXiv.org Machine Learning

1908.05287

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.67)

Add feedback

Inside the Mind and Methodology of a Data Scientist - Birst

#artificialintelligenceAug-19-2019, 12:35:58 GMT

When you hear about Data Science, Big Data, Analytics, Artificial Intelligence, Machine Learning, or Deep Learning, you may end up feeling a bit confused about what these terms mean. And it doesn't help reduce the confusion when every tech vendor rebrands their products as AI. So, what do these terms really mean? What are overlaps and differences? And most importantly, what can this do for your business?

algorithm, learning, machine learning, (15 more...)

#artificialintelligence

Country:

Europe > United Kingdom (0.05)
Europe > Netherlands > North Brabant > Eindhoven (0.05)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.73)

Add feedback

Towards Assessing the Impact of Bayesian Optimization's Own Hyperparameters

Lindauer, Marius, Feurer, Matthias, Eggensperger, Katharina, Biedenkapp, André, Hutter, Frank

arXiv.org Artificial IntelligenceAug-19-2019

Treating the validation loss of trained machine learning models as a black box function f, we can formulate the hyperparameter optimization problem as: x arg min x X f (x) (1) where X is space of possible configurations x . Although the community is aware of the necessity of hy-perparameter optimization (HPO) for machine learning algorithms, the impact of BO's own hyperparameters is not reported in most BO papers. On top of this, new BO approaches (and implicitly their hyperparameters) are often developed on cheap-to-evaluate artificial functions and then evaluated on real benchmarks. Although we acknowledge that this is a reasonable protocol to prevent over-engineering on the target Contact Author function family (here for example HPO benchmarks of machine learning algorithms), we believe that it is important to study whether this practice is indeed well-founded. We emphasize that this paper considers HPO on two levels as shown in Figure 1: (i) HPO of machine learning algorithms, which we consider as our target function (Target BO) and (ii) optimization of the target-BO's own choices using a meta-optimizer. In particular, we study several research questions related to the meta-optimization problem of BO's hyperparameters: 1. How large is the impact of tuning BO's own hyperpa-rameters if one was allowed to tune these on each function independently? 2. How well does the performance of an optimized configuration of the target-BO generalize to similar new functions from the same family?

artificial intelligence, machine learning, optimization, (18 more...)

arXiv.org Artificial Intelligence

1908.06674

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Computing Estimators of Dantzig Selector type via Column and Constraint Generation

Mazumder, Rahul, Wright, Stephen, Zheng, Andrew

arXiv.org Machine LearningAug-18-2019

We consider a class of linear-programming based estimators in reconstructing a sparse signal from linear measurements. Specific formulations of the reconstruction problem considered here include Dantzig selector, basis pursuit (for the case in which the measurements contain no errors), and the fused Dantzig selector (for the case in which the underlying signal is piecewise constant). In spite of being estimators central to sparse signal processing and machine learning, solving these linear programming problems for large scale instances remains a challenging task, thereby limiting their usage in practice. We show that classic constraint- and column-generation techniques from large scale linear programming, when used in conjunction with a commercial implementation of the simplex method, and initialized with the solution from a closely-related Lasso formulation, yields solutions with high efficiency in many settings.

artificial intelligence, machine learning, optimization problem, (19 more...)

arXiv.org Machine Learning

1908.06515

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Evolutionary Computation, Optimization and Learning Algorithms for Data Science

Mohammadi, Farid Ghareh, Amini, M. Hadi, Arabnia, Hamid R.

arXiv.org Machine LearningAug-16-2019

A large number of engineering, science and computational problems have yet to be solved in a computationally efficient way. One of the emerging challenges is how evolving technologies grow towards autonomy and intelligent decision making. This leads to collection of large amounts of data from various sensing and measurement technologies, e.g., cameras, smart phones, health sensors, smart electricity meters, and environment sensors. Hence, it is imperative to develop efficient algorithms for generation, analysis, classification, and illustration of data. Meanwhile, data is structured purposefully through different representations, such as large-scale networks and graphs. We focus on data science as a crucial area, specifically focusing on a curse of dimensionality (CoD) which is due to the large amount of generated/sensed/collected data. This motivates researchers to think about optimization and to apply nature-inspired algorithms, such as evolutionary algorithms (EAs) to solve optimization problems. Although these algorithms look un-deterministic, they are robust enough to reach an optimal solution. Researchers do not adopt evolutionary algorithms unless they face a problem which is suffering from placement in local optimal solution, rather than global optimal solution. In this chapter, we first develop a clear and formal definition of the CoD problem, next we focus on feature extraction techniques and categories, then we provide a general overview of meta-heuristic algorithms, its terminology, and desirable properties of evolutionary algorithms.

artificial intelligence, evolutionary algorithm, machine learning, (17 more...)

arXiv.org Machine Learning

1908.08006

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.68)
Transportation > Electric Vehicle (0.68)
Automobiles & Trucks (0.68)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Federated Learning with Additional Mechanisms on Clients to Reduce Communication Costs

Yao, Xin, Huang, Tianchi, Wu, Chenglei, Zhang, Rui-xiao, Sun, Lifeng

arXiv.org Machine LearningAug-16-2019

Federated learning (FL) enables on-device training over distributed networks consisting of a massive amount of modern smart devices, such as smartphones and IoT devices. However, the leading optimization algorithm in such settings, i.e., federated averaging (FedAvg), suffers from heavy communication cost and inevitable performance drop, especially when the local data is distributed in a non-IID way. To alleviate this problem, we propose two potential solutions by introducing additional mechanisms to the on-device training. The first (FedMMD) is adopting a two-stream model with the MMD (Maximum Mean Discrepancy) constraint instead of a single model in vanilla FedAvg to be trained on devices. Experiments show that the proposed method outperforms baselines, especially in non-IID FL settings, with a reduction of more than 20% in required communication rounds. The second is FL with feature fusion (FedFusion). By aggregating the features from both the local and global models, we achieve higher accuracy at less communication cost. Furthermore, the feature fusion modules offer better initialization for newly incoming clients and thus speed up the process of convergence. Experiments in popular FL scenarios show that our FedFusion outperforms baselines in both accuracy and generalization ability while reducing the number of required communication rounds by more than 60%.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Machine Learning

1908.05891

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

BOAH: A Tool Suite for Multi-Fidelity Bayesian Optimization & Analysis of Hyperparameters

Lindauer, Marius, Eggensperger, Katharina, Feurer, Matthias, Biedenkapp, André, Marben, Joshua, Müller, Philipp, Hutter, Frank

arXiv.org Artificial IntelligenceAug-16-2019

Hyperparameter optimization and neural architecture search can become prohibitively expensive for regular black-box Bayesian optimization because the training and evaluation of a single model can easily take several hours. To overcome this, we introduce a comprehensive tool suite for effective multi-fidelity Bayesian optimization and the analysis of its runs. The suite, written in Python, provides a simple way to specify complex design spaces, a robust and efficient combination of Bayesian optimization and HyperBand, and a comprehensive analysis of the optimization process and its outcomes.

artificial intelligence, machine learning, optimization, (15 more...)

arXiv.org Artificial Intelligence

1908.06756

Country: Europe > Germany (0.18)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Add feedback

Linear Stochastic Bandits Under Safety Constraints

Amani, Sanae, Alizadeh, Mahnoosh, Thrampoulidis, Christos

arXiv.org Machine LearningAug-15-2019

Bandit algorithms have various application in safety-critical systems, where it is important to respect the system constraints that rely on the bandit's unknown parameters at every round. In this paper, we formulate a linear stochastic multi-armed bandit problem with safety constraints that depend (linearly) on an unknown parameter vector. As such, the learner is unable to identify all safe actions and must act conservatively in ensuring that her actions satisfy the safety constraint at all rounds (at least with high probability). For these bandits, we propose a new UCB-based algorithm called Safe-LUCB, which includes necessary modifications to respect safety constraints. The algorithm has two phases. During the pure exploration phase the learner chooses her actions at random from a restricted set of safe actions with the goal of learning a good approximation of the entire unknown safe set. Once this goal is achieved, the algorithm begins a safe exploration-exploitation phase where the learner gradually expands their estimate of the set of safe actions while controlling the growth of regret. We provide a general regret bound for the algorithm, as well as a problem dependent bound that is connected to the location of the optimal action within the safe set. We then propose a modified heuristic that exploits our problem dependent analysis to improve the regret.

null, optimization problem, upstream oil & gas, (21 more...)

arXiv.org Machine Learning

1908.05814

Country:

North America > United States > California (0.14)
Europe > Spain (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.35)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

Safe global optimization of expensive noisy black-box functions in the $\delta$-Lipschitz framework

Sergeyev, Yaroslav D., Candelieri, Antonio, Kvasov, Dmitri E., Perego, Riccardo

arXiv.org Machine LearningAug-15-2019

In this paper, the problem of safe global maximization (it should not be confused with robust optimization) of expensive noisy black-box functions satisfying the Lipschitz condition is considered. The notion "safe" means that the objective function $f(x)$ during optimization should not violate a "safety" threshold, for instance, a certain a priori given value $h$ in a maximization problem. Thus, any new function evaluation must be performed at "safe points" only, namely, at points $y$ for which it is known that the objective function $f(y) > h$. The main difficulty here consists in the fact that the used optimization algorithm should ensure that the safety constraint will be satisfied at a point $y$ before evaluation of $f(y)$ will be executed. Thus, it is required both to determine the safe region $\Omega$ within the search domain~$D$ and to find the global maximum within $\Omega$. An additional difficulty consists in the fact that these problems should be solved in the presence of the noise. This paper starts with a theoretical study of the problem and it is shown that even though the objective function $f(x)$ satisfies the Lipschitz condition, traditional Lipschitz minorants and majorants cannot be used due to the presence of the noise. Then, a $\delta$-Lipschitz framework and two algorithms using it are proposed to solve the safe global maximization problem. The first method determines the safe area within the search domain and the second one executes the global maximization over the found safe region. For both methods a number of theoretical results related to their functioning and convergence is established. Finally, numerical experiments confirming the reliability of the proposed procedures are performed.

artificial intelligence, optimization problem, safe region, (17 more...)

arXiv.org Machine Learning

1908.0601

Country: Europe > Italy (0.28)

Genre: Research Report (0.40)

Industry: Transportation > Air (0.61)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Multitask and Transfer Learning for Autotuning Exascale Applications

Sid-Lakhdar, Wissam M., Aznaveh, Mohsen Mahmoudi, Li, Xiaoye S., Demmel, James W.

arXiv.org Machine LearningAug-15-2019

Multitask learning and transfer learning have proven to be useful in the field of machine learning when additional knowledge is available to help a prediction task. We aim at deriving methods following these paradigms for use in autotuning, where the goal is to find the optimal performance parameters of an application treated as a black-box function. We show comparative results with state-of-the-art autotuning techniques. For instance, we observe an average $1.5x$ improvement of the application runtime compared to the OpenTuner and HpBandSter autotuners. We explain how our approaches can be more suitable than some state-of-the-art autotuners for the tuning of any application in general and of expensive exascale applications in particular.

application, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1908.05792

Country:

North America > United States > Texas (0.28)
North America > United States > California (0.28)
North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Energy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback