AITopics | Optimization

Optimization

A Modified Bayesian Optimization based Hyper-Parameter Tuning Approach for Extreme Gradient Boosting

arXiv.org Machine Learning, Apr-10-2020

The literature has established that proper hyperparameter optimization greatly affects the performance of a machine learning algorithm. Manual search is one way to do it, but it is time consuming. Common automated approaches are grid search, random search, and Bayesian optimization using Hyperopt. In this paper, we propose a new approach for hyperparameter optimization, Randomized-Hyperopt, and then tune the hyperparameters of XGBoost (Extreme Gradient Boosting) on ten datasets by applying random search, Randomized-Hyperopt, Hyperopt, and grid search. The performance of these four techniques is compared in terms of both prediction accuracy and execution time. We find that Randomized-Hyperopt performs better than the other three conventional methods for hyperparameter optimization of XGBoost.
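As a minimal sketch of the two baseline strategies named in the abstract, the following compares grid search and random search on a stand-in objective. The `validation_loss` function, the parameter names `learning_rate` and `max_depth`, and the value ranges are assumptions made for illustration. This is not the paper's Randomized-Hyperopt method, and it does not call XGBoost or Hyperopt.

```python
import itertools
import random


def validation_loss(learning_rate, max_depth):
    """Hypothetical stand-in for a cross-validated model loss (lower is better)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6) ** 2


def grid_search(lr_values, depth_values):
    """Evaluate every combination on the grid and return the best one."""
    candidates = itertools.product(lr_values, depth_values)
    return min(candidates, key=lambda params: validation_loss(*params))


def random_search(n_trials, seed=0):
    """Sample configurations at random and return the best one seen."""
    rng = random.Random(seed)
    best, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = (rng.uniform(0.01, 0.3), rng.randint(2, 10))
        loss = validation_loss(*params)
        if loss < best_loss:
            best, best_loss = params, loss
    return best


grid_best = grid_search([0.01, 0.1, 0.3], [2, 6, 10])
random_best = random_search(n_trials=50)
print("grid:", grid_best, "random:", random_best)
```

Grid search costs one evaluation per grid cell, so its budget grows multiplicatively with each added parameter, while random search keeps a fixed budget of `n_trials` regardless of dimension.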

dataset, hyperparameter, optimization, (13 more...)

arXiv.org Machine Learning

doi: 10.1109/ICInPro47689.2019.9092025

2004.05041

Country:

Asia > India > Karnataka > Bengaluru (0.06)
Europe > France (0.04)
Asia > Taiwan (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
(2 more...)

An Introduction to Hill Climbing Algorithm - GreatLearning

#artificialintelligence, Apr-8-2020, 12:15:37 GMT

Artificial intelligence and machine learning cover diverse topics, and research in this field often comes down to finding optimal solutions. In deep learning, various neural networks are used, and optimization is a very important step in arriving at a good model. Many complex algorithms have been used in AI, and for each of them it matters to find an optimal solution.
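The article's subject, hill climbing, can be illustrated with a short sketch. This is the steepest-ascent variant on a toy one-dimensional problem; the `objective` and `neighbors` functions and the starting state are assumptions chosen for illustration, not taken from the article.

```python
def hill_climb(objective, start, neighbors, max_steps=1000):
    """Move to the best neighbor until no neighbor improves the objective."""
    current = start
    for _ in range(max_steps):
        best = max(neighbors(current), key=objective)
        if objective(best) <= objective(current):
            return current  # no neighbor is better: a local maximum
        current = best
    return current


def objective(x):
    """Toy objective with a single peak at x = 3."""
    return -(x - 3) ** 2


def neighbors(x):
    """Neighboring states of an integer state: one step left or right."""
    return [x - 1, x + 1]


result = hill_climb(objective, start=-5, neighbors=neighbors)
print(result)
```

On an objective with several peaks the same loop stops at whichever local maximum is reachable from the start state, which is why restarts from different states are a common remedy.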

algorithm, current state, optimal solution, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

A fast and effective MIP-based heuristic for a selective and periodic inventory routing problem in reverse logistics

Cárdenas-Barrón, Leopoldo E., Melo, Rafael A.

arXiv.org Artificial Intelligence, Apr-8-2020

We consider an NP-hard selective and periodic inventory routing problem (SPIRP) in a waste vegetable oil collection environment. This SPIRP arises in the context of reverse logistics where a biodiesel company has daily requirements of oil to be used as raw material in its production process. These requirements can be fulfilled by using the available inventory, collecting waste vegetable oil or purchasing virgin oil. The problem consists in determining a periodic (cyclic) plan for the collection and purchasing of oil such that the total collection, inventory and purchasing costs are minimized, while meeting the company's oil requirements and all the operational constraints. We propose a MIP-based heuristic which solves a relaxed model without routing, constructs routes taking into account the relaxation's solution and then improves these routes by solving the capacitated vehicle routing problem associated with each period. Following this approach, an a posteriori performance guarantee is ensured, as the approach provides both a lower bound and a feasible solution. The computational experiments show that the MIP-based heuristic is very fast and effective: it finds near-optimal solutions with low gaps within seconds, improving several of the best known results using just a fraction of the time spent by a state-of-the-art heuristic. Notably, the proposed MIP-based heuristic improves over the best known results for all the large instances available in the literature.
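The a posteriori guarantee mentioned in the abstract comes from pairing a lower bound (from the relaxation) with the cost of a feasible solution. A minimal sketch of how such a gap is commonly computed for a minimization problem follows; the function name, the choice to divide by the upper bound, and the numbers are assumptions for illustration, and the paper may define its gap differently.

```python
def relative_gap(lower_bound, upper_bound):
    """Relative gap between a feasible cost (upper bound) and a lower bound."""
    if upper_bound <= 0 or lower_bound > upper_bound:
        raise ValueError("expected upper_bound > 0 and lower_bound <= upper_bound")
    return (upper_bound - lower_bound) / upper_bound


# Hypothetical costs: the relaxation proves no plan costs less than 95,
# and the heuristic found a feasible plan costing 100.
gap = relative_gap(lower_bound=95.0, upper_bound=100.0)
print(f"{gap:.2%}")
```

A gap of zero certifies optimality, and any positive gap bounds how far the feasible solution can be from the optimum.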

artificial intelligence, constraint, optimization problem, (18 more...)

arXiv.org Artificial Intelligence

2004.04188

Country:

South America > Brazil > Bahia (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.14)
North America > United States > Massachusetts (0.14)
North America > Mexico > Nuevo León (0.14)

Genre: Research Report (1.00)

Industry:

Transportation (1.00)
Energy > Renewable > Biofuel > Biodiesel (0.54)
Energy > Oil & Gas > Downstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

An Improved Cutting Plane Method for Convex Optimization, Convex-Concave Games and its Applications

Jiang, Haotian, Lee, Yin Tat, Song, Zhao, Wong, Sam Chiu-wai

arXiv.org Machine Learning, Apr-8-2020

Given a separation oracle for a convex set $K \subset \mathbb{R}^n$ that is contained in a box of radius $R$, the goal is to either compute a point in $K$ or prove that $K$ does not contain a ball of radius $\epsilon$. We propose a new cutting plane algorithm that uses an optimal $O(n \log (\kappa))$ evaluations of the oracle and an additional $O(n^2)$ time per evaluation, where $\kappa = nR/\epsilon$.

- This improves upon Vaidya's $O( \text{SO} \cdot n \log (\kappa) + n^{\omega+1} \log (\kappa))$ time algorithm [Vaidya, FOCS 1989a] in terms of polynomial dependence on $n$, where $\omega < 2.373$ is the exponent of matrix multiplication and $\text{SO}$ is the time for oracle evaluation.
- This improves upon Lee-Sidford-Wong's $O( \text{SO} \cdot n \log (\kappa) + n^3 \log^{O(1)} (\kappa))$ time algorithm [Lee, Sidford and Wong, FOCS 2015] in terms of dependence on $\kappa$.

For many important applications in economics, $\kappa = \Omega(\exp(n))$ and this leads to a significant difference between $\log(\kappa)$ and $\mathrm{poly}(\log (\kappa))$. We also provide evidence that the $n^2$ time per evaluation cannot be improved and thus our running time is optimal. A bottleneck of previous cutting plane methods is to compute leverage scores, a measure of the relative importance of past constraints. Our result is achieved by a novel multi-layered data structure for leverage score maintenance, which is a sophisticated combination of diverse techniques such as random projection, batched low-rank update, inverse maintenance, polynomial interpolation, and fast rectangular matrix multiplication. Interestingly, our method requires a combination of different fast rectangular matrix multiplication algorithms.
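To make the problem statement concrete, here is a sketch of the feasibility task in one dimension, where a cutting plane method reduces to bisection: each oracle answer cuts away the half of the interval that cannot contain $K$. This is only an illustration of the separation-oracle interface under the assumption $n = 1$; it is not the paper's algorithm, and `find_point` and `make_interval_oracle` are hypothetical names.

```python
def find_point(oracle, radius, eps):
    """Search [-radius, radius] for a point of K using a separation oracle.

    Returns (point, calls). The point is None when the remaining interval is
    too short for K to contain an interval of half-width eps.
    """
    lo, hi = -radius, radius
    calls = 0
    while hi - lo >= 2 * eps:
        mid = (lo + hi) / 2
        calls += 1
        side = oracle(mid)
        if side == 0:
            return mid, calls  # mid lies in K
        if side > 0:
            lo = mid  # K lies entirely to the right of mid
        else:
            hi = mid  # K lies entirely to the left of mid
    return None, calls


def make_interval_oracle(a, b):
    """Separation oracle for K = [a, b]: 0 if x is in K, else the side K is on."""
    def oracle(x):
        if x < a:
            return 1
        if x > b:
            return -1
        return 0
    return oracle


point, calls = find_point(make_interval_oracle(0.3, 0.4), radius=1.0, eps=0.01)
print(point, calls)
```

Each call halves the interval, so the number of oracle calls is logarithmic in `radius / eps`, the one-dimensional analogue of the $\log(\kappa)$ factor in the abstract.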

algorithm, data structure, plane method, (15 more...)

arXiv.org Machine Learning

2004.0425

Country:

Asia > Russia (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > Virginia (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

GeneCAI: Genetic Evolution for Acquiring Compact AI

Javaheripi, Mojan, Samragh, Mohammad, Javidi, Tara, Koushanfar, Farinaz

arXiv.org Machine Learning, Apr-8-2020

In the contemporary big data realm, Deep Neural Networks (DNNs) are evolving towards more complex architectures to achieve higher inference accuracy. Model compression techniques can be leveraged to efficiently deploy such compute-intensive architectures on resource-limited mobile devices. Such methods comprise various hyper-parameters that require per-layer customization to ensure high accuracy. Choosing such hyper-parameters is cumbersome as the pertinent search space grows exponentially with the number of model layers. This paper introduces GeneCAI, a novel optimization method that automatically learns how to tune per-layer compression hyper-parameters. We devise a bijective translation scheme that encodes compressed DNNs to the genotype space. The optimality of each genotype is measured using a multi-objective score based on accuracy and number of floating point operations. We develop customized genetic operations to iteratively evolve the non-dominated solutions towards the optimal Pareto front, thus capturing the optimal trade-off between model accuracy and complexity. The GeneCAI optimization method is highly scalable and can achieve a near-linear performance boost on distributed multi-GPU platforms. Our extensive evaluations demonstrate that GeneCAI outperforms existing rule-based and reinforcement learning methods in DNN compression by finding models that lie on a better accuracy-complexity Pareto curve.
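The notion of non-dominated solutions used in the abstract can be sketched as a simple filter over candidate models scored on two objectives. The `(accuracy, flops)` tuples below are made-up values and the function names are hypothetical; this shows only the Pareto-dominance test, not GeneCAI's encoding or genetic operations.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on one.

    Each candidate is (accuracy, flops): higher accuracy and fewer FLOPs are better.
    """
    acc_a, flops_a = a
    acc_b, flops_b = b
    no_worse = acc_a >= acc_b and flops_a <= flops_b
    strictly_better = acc_a > acc_b or flops_a < flops_b
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical compressed models as (accuracy, FLOPs in millions).
models = [(0.92, 100), (0.90, 60), (0.89, 80), (0.85, 30), (0.92, 120)]
front = pareto_front(models)
print(front)
```

In an evolutionary loop, a filter like this would typically select which genotypes survive into the next generation.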

accuracy, genecai, pruning, (16 more...)

arXiv.org Machine Learning

doi: 10.1145/3377930.3390226

2004.04249

Country:

North America > Mexico > Quintana Roo > Cancún (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Global Expanding, Local Shrinking: Discriminant Multi-label Learning with Missing Labels

Ma, Zhongchen, Chen, Songcan

arXiv.org Machine Learning, Apr-8-2020

In multi-label learning, the issue of missing labels brings a major challenge. Many methods attempt to recover missing labels by exploiting the low-rank structure of the label matrix. However, these methods only utilize the global low-rank label structure and, to some extent, ignore both local low-rank label structures and label discriminant information, leaving room for further performance improvement. In this paper, we develop a simple yet effective discriminant multi-label learning (DM2L) method for multi-label learning with missing labels. Specifically, we impose low-rank structures on all the predictions of instances from the same labels (local shrinking of rank), and a maximally separated structure (high-rank structure) on the predictions of instances from different labels (global expanding of rank). In this way, the imposed low-rank structures help model both local and global low-rank label structures, while the imposed high-rank structure helps provide more underlying discriminability. Our subsequent theoretical analysis also supports these intuitions. In addition, we provide a nonlinear extension using the kernel trick to enhance DM2L and establish a concave-convex objective to learn these models. Compared to the other methods, our method involves the fewest assumptions and only one hyper-parameter. Even so, extensive experiments show that our method still outperforms the state-of-the-art methods.

label matrix, label structure, multi-label learning, (15 more...)

arXiv.org Machine Learning

2004.03951

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > Massachusetts > Norfolk County > Wellesley (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Online Hyperparameter Search Interleaved with Proximal Parameter Updates

Lopez-Ramos, Luis Miguel, Beferull-Lozano, Baltasar

arXiv.org Machine Learning, Apr-6-2020

There is a clear need for efficient algorithms to tune hyperparameters for statistical learning schemes, since the commonly applied search methods (such as grid search with N-fold cross-validation) are inefficient and/or approximate. Previously existing algorithms that efficiently search for hyperparameters relying on the smoothness of the cost function cannot be applied in problems such as Lasso regression. In this contribution, we develop a hyperparameter optimization method that relies on the structure of proximal gradient methods and does not require a smooth cost function. Such a method is applied to Leave-one-out (LOO)-validated Lasso and Group Lasso to yield efficient, data-driven, hyperparameter optimization algorithms. Numerical experiments corroborate the convergence of the proposed method to a local optimum of the LOO validation error curve, and the efficiency of its approximations.
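For readers unfamiliar with the proximal gradient methods the abstract builds on, here is a minimal sketch of the standard ISTA iteration for Lasso with a fixed regularization weight. It shows why no smooth cost is needed: the non-smooth $\ell_1$ term is handled by soft thresholding. The function names, the fixed `lam`, and the tiny data are assumptions for illustration; the paper's contribution, searching over the hyperparameter while these updates run, is not shown.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_ista(A, y, lam, step, n_iters=500):
    """Proximal gradient (ISTA) for 0.5 * ||A w - y||^2 + lam * ||w||_1."""
    n_rows, n_cols = len(A), len(A[0])
    w = [0.0] * n_cols
    for _ in range(n_iters):
        # Gradient of the smooth least-squares term: A^T (A w - y).
        residual = [sum(A[i][j] * w[j] for j in range(n_cols)) - y[i]
                    for i in range(n_rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by the proximal step for the l1 penalty.
        w = [soft_threshold(w[j] - step * grad[j], step * lam)
             for j in range(n_cols)]
    return w


# With an identity design matrix the solution is soft_threshold(y, lam).
A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
w = lasso_ista(A, y, lam=1.0, step=1.0)
print(w)
```

The step size must not exceed the reciprocal of the largest eigenvalue of $A^\top A$ for the iteration to converge; with the identity matrix used here that limit is 1.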

algorithm, hyperparameter, optimization, (15 more...)

arXiv.org Machine Learning

2004.02769

Country:

North America > United States > Montana (0.04)
Europe > Norway (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

The Bethe and Sinkhorn Permanents of Low Rank Matrices and Implications for Profile Maximum Likelihood

Anari, Nima, Charikar, Moses, Shiragur, Kirankumar, Sidford, Aaron

arXiv.org Machine Learning, Apr-6-2020

In this paper we consider the problem of computing the likelihood of the profile of a discrete distribution, i.e., the probability of observing the multiset of element frequencies, and computing a profile maximum likelihood (PML) distribution, i.e., a distribution with the maximum profile likelihood. For each problem we provide polynomial time algorithms that, given $n$ i.i.d.\ samples from a discrete distribution, achieve an approximation factor of $\exp\left(-O(\sqrt{n} \log n) \right)$, improving upon the previous best-known bound achievable in polynomial time of $\exp(-O(n^{2/3} \log n))$ (Charikar, Shiragur and Sidford, 2019). Through the work of Acharya, Das, Orlitsky and Suresh (2016), this implies a polynomial time universal estimator for symmetric properties of discrete distributions for a broader range of the error parameter. We achieve these results by providing new bounds on the quality of approximation of the Bethe and Sinkhorn permanents (Vontobel, 2012 and 2014). We show that each of these is an $\exp(O(k \log(N/k)))$ approximation to the permanent of $N \times N$ matrices with non-negative rank at most $k$, improving upon the previous known bounds of $\exp(O(N))$. To obtain our results on PML, we exploit the fact that the PML objective is proportional to the permanent of a certain Vandermonde matrix with $\sqrt{n}$ distinct columns, i.e., with non-negative rank at most $\sqrt{n}$. As a by-product of our work we establish a surprising connection between the convex relaxation in prior work (CSS19) and the well-studied Bethe and Sinkhorn approximations.
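The profile defined in the abstract, the multiset of element frequencies, is easy to compute from a sample. The sketch below represents the multiset as a sorted list; the function name and the example strings are assumptions for illustration. It also shows why the profile suits symmetric properties: relabeling the elements leaves it unchanged.

```python
from collections import Counter


def profile(samples):
    """Return the profile: the multiset of element frequencies, sorted descending."""
    frequencies = Counter(samples).values()
    return sorted(frequencies, reverse=True)


# "abracadabra" has counts a:5, b:2, r:2, c:1, d:1.
p1 = profile("abracadabra")
# The same sample with every symbol renamed has the same profile.
p2 = profile("xyzxwxvxyzx")
print(p1, p2)
```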

algorithm, approximation, matrix, (16 more...)

arXiv.org Machine Learning

2004.02425

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.24)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(3 more...)

Genre:

Research Report (0.70)
Workflow (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

SINDy-PI: A Robust Algorithm for Parallel Implicit Sparse Identification of Nonlinear Dynamics

Kaheman, Kadierdan, Kutz, J. Nathan, Brunton, Steven L.

arXiv.org Machine Learning, Apr-5-2020

Accurately modeling the nonlinear dynamics of a system from measurement data is a challenging yet vital topic. The sparse identification of nonlinear dynamics (SINDy) algorithm is one approach to discover dynamical systems models from data. Although extensions have been developed to identify implicit dynamics, or dynamics described by rational functions, these extensions are extremely sensitive to noise. In this work, we develop SINDy-PI (parallel, implicit), a robust variant of the SINDy algorithm to identify implicit dynamics and rational nonlinearities. The SINDy-PI framework includes multiple optimization algorithms and a principled approach to model selection. We demonstrate the ability of this algorithm to learn implicit ordinary and partial differential equations and conservation laws from limited and noisy data. In particular, we show that the proposed approach is several orders of magnitude more noise robust than previous approaches, and may be used to identify a class of complex ODE and PDE dynamics that were previously unattainable with SINDy, including the double pendulum dynamics and the Belousov-Zhabotinsky (BZ) reaction.

equation, identification, sindy-pi, (15 more...)

arXiv.org Machine Learning

2004.02322

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

The two-echelon routing problem with truck and drones

Hà, Minh Hoàng, Vu, Lam, Vu, Duy Manh

arXiv.org Artificial Intelligence, Apr-5-2020

In this paper, we study novel variants of the well-known two-echelon vehicle routing problem in which a truck works on the first echelon to transport parcels and a fleet of drones to intermediate depots, while in the second echelon the drones are used to deliver parcels from intermediate depots to customers. The objective is to minimize the completion time instead of the transportation cost as in classical two-echelon vehicle routing problems. Depending on the context, a drone can be launched from the truck at an intermediate depot once (single trip drone) or several times (multiple trip drone). Mixed Integer Linear Programming (MILP) models are first proposed to formulate the problems mathematically and to solve small instances to optimality. To handle larger instances, a metaheuristic based on the idea of Greedy Randomized Adaptive Search Procedure (GRASP) is introduced. Experimental results obtained on instances of different contexts are reported and analyzed.
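The GRASP idea named in the abstract alternates a greedy randomized construction with a local search and keeps the best result over several iterations. The sketch below applies it to a toy closed-tour problem as a stand-in; the truck-and-drone model, the paper's construction rule, and its neighborhoods are not reproduced, and all function names, the `rcl_size` parameter, and the points are assumptions for illustration.

```python
import math
import random


def tour_length(tour, points):
    """Length of the closed tour visiting points in the given order."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def greedy_randomized_tour(points, rcl_size, rng):
    """Build a tour, choosing each next stop at random among the rcl_size nearest."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        ranked = sorted(unvisited, key=lambda j: math.dist(last, points[j]))
        choice = rng.choice(ranked[:rcl_size])  # restricted candidate list
        tour.append(choice)
        unvisited.remove(choice)
    return tour


def two_opt(tour, points):
    """Local search: reverse segments for as long as that shortens the tour."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate, points) < tour_length(tour, points) - 1e-12:
                    tour, improved = candidate, True
    return tour


def grasp(points, iterations=20, rcl_size=2, seed=0):
    """Repeat construction plus local search and keep the shortest tour found."""
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        tour = two_opt(greedy_randomized_tour(points, rcl_size, rng), points)
        if best is None or tour_length(tour, points) < tour_length(best, points):
            best = tour
    return best


# Toy instance: the four corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
best = grasp(points)
print(best, tour_length(best, points))
```

The randomness in the construction step is what lets repeated iterations reach different local optima; with `rcl_size=1` the construction becomes purely greedy and every iteration starts from the same tour.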

customer, drone, truck node, (15 more...)

arXiv.org Artificial Intelligence

2004.02275

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Freight & Logistics Services (1.00)
Transportation > Ground > Road (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
