
IMLI: An Incremental Framework for MaxSAT-Based Learning of Interpretable Classification Rules

arXiv.org Artificial Intelligence

The wide adoption of machine learning in critical domains such as medical diagnosis, law, and education has propelled the need for interpretable techniques, since end users need to understand the reasoning behind decisions made by learning systems. The computational intractability of interpretable learning led practitioners to design heuristic techniques, which fail to provide sound handles to trade off accuracy and interpretability. Motivated by the success of MaxSAT solvers over the past decade, a MaxSAT-based approach called MLIC was recently proposed that seeks to reduce the problem of learning interpretable rules expressed in Conjunctive Normal Form (CNF) to a MaxSAT query. While MLIC was shown to achieve accuracy similar to that of other state-of-the-art black-box classifiers while generating small interpretable CNF formulas, its runtime performance lags significantly and renders the approach unusable in practice. In this context, the authors raised the question: Is it possible to achieve the best of both worlds, i.e., a sound framework for interpretable learning that can take advantage of MaxSAT solvers while scaling to real-world instances? In this paper, we take a step towards answering the above question in the affirmative. We propose IMLI, an incremental approach to the MaxSAT-based framework that achieves scalable runtime performance via a partition-based training methodology. Extensive experiments on benchmarks from the UCI repository demonstrate that IMLI achieves up to three orders of magnitude runtime improvement without loss of accuracy or interpretability.
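
To make the notion of a CNF classification rule concrete, the sketch below shows how such a rule labels a binary feature vector. It does not reproduce the MaxSAT encoding used by MLIC or IMLI; the function name `cnf_predict`, the rule layout (a list of clauses, each a list of feature indices), and the example rule are assumptions made purely for illustration.

```python
def cnf_predict(rule, x):
    """Return True if the binary feature vector x satisfies the CNF rule.

    rule: list of clauses; each clause is a list of feature indices (assumed layout).
    A clause holds if at least one of its features is 1 (OR);
    the rule holds if every clause holds (AND).
    """
    return all(any(x[i] for i in clause) for clause in rule)


# Hypothetical rule: (feature0 OR feature1) AND (feature2)
example_rule = [[0, 1], [2]]
```

A rule with few clauses and few literals per clause is short enough to be read directly, which is the sense in which small CNF formulas are interpretable.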


Clustering Binary Data by Application of Combinatorial Optimization Heuristics

arXiv.org Machine Learning

We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters. Five new and original methods are introduced, using neighborhood-based and population-based combinatorial optimization metaheuristics: the first ones are simulated annealing, threshold accepting and tabu search, and the others are a genetic algorithm and ant colony optimization. The methods are implemented, with proper calibration of the heuristics' parameters to ensure good results. From a set of 16 data tables generated by a quasi-Monte Carlo experiment, a comparison is performed for one of the aggregations using L1 dissimilarity, with hierarchical clustering, and a version of k-means: partitioning around medoids or PAM. Simulated annealing performs very well, especially compared to classical methods.


Learning fine-grained search space pruning and heuristics for combinatorial optimization

arXiv.org Artificial Intelligence

Combinatorial optimization problems arise in a wide range of applications from diverse domains. Many of these problems are NP-hard, and designing efficient heuristics for them requires considerable time and experimentation. On the other hand, the number of optimization problems in the industry continues to grow. In recent years, machine learning techniques have been explored to address this gap. We propose a framework for leveraging machine learning techniques to scale up exact combinatorial optimization algorithms. In contrast to the existing approaches based on deep learning, reinforcement learning and restricted Boltzmann machines that attempt to directly learn the output of the optimization problem from its input (with limited success), our framework learns the relatively simpler task of pruning the elements in order to reduce the size of the problem instances. In addition, our framework uses only interpretable learning models based on intuitive features, and thus the learning process provides deeper insights into the optimization problem and the instance class that can be used for designing better heuristics. For the classical maximum clique enumeration problem, we show that our framework can prune a large fraction of the input graph (around 99% of nodes in the case of sparse graphs) and still detect almost all of the maximum cliques. This results in several-fold speedups of state-of-the-art algorithms. Furthermore, the model used in our framework highlights that the chi-squared value of neighborhood degree has a statistically significant correlation with the presence of a node in a maximum clique, particularly in dense graphs, which constitute a significant challenge for modern solvers. We leverage this insight to design a novel heuristic for this problem that outperforms the state of the art. Our heuristic is also of independent interest for maximum clique detection and enumeration.


An adaptive data-driven approach to solve real-world vehicle routing problems in logistics

arXiv.org Artificial Intelligence

Transportation accounts for one-third of logistics costs, and accordingly transportation systems largely influence the performance of the logistics system. This work presents an adaptive, data-driven, modular approach for solving real-world Vehicle Routing Problems (VRP) in the field of logistics. The work consists of two basic units: (i) a multi-step algorithm that produces entirely feasible solutions to VRPs in logistics, (ii) an adaptive approach for adjusting and setting up the parameters and constants of the proposed algorithm. The proposed algorithm combines several data transformation approaches, heuristics and Tabu search. Moreover, as the performance of the algorithm depends on the set of control parameters and constants, a predictive model that adaptively adjusts these parameters and constants according to historical data is proposed. The acquired results are compared using the Decision Support System with predictive models: Generalized Linear Models (GLM) and Support Vector Machine (SVM). The algorithm, along with the control parameters acquired using the prediction method, was incorporated into a web-based enterprise system, which is in use in several big distribution companies in Bosnia and Herzegovina. The results of the proposed algorithm were compared on a set of benchmark instances and validated on real benchmark instances as well. The feasibility of the resulting routes in a real environment is also presented.


Decomposable Probability-of-Success Metrics in Algorithmic Search

arXiv.org Artificial Intelligence

There are three components to a search problem. The first is the finite discrete search space, Ω, which is the set of elements to be examined. Next is the target set, T, which is a nonempty subset of the search space that we are trying to find. Finally, we have an external information resource, F, which provides an evaluation of elements of the search space. Typically, there is a tight relationship between the target set and the external information resource, as the resource is expected to lead to or describe the target set in some way, such as the target set being elements which meet a certain threshold under the external information resource. Within the framework, we have an iterative algorithm which seeks to find elements of the target set, shown in Figure 1. The algorithm is a black-box that has access to a search history and produces a probability distribution over the search space. At each step, the algorithm samples over the search space using the probability distribution, evaluates that element using the information resource, adds the result to the search history, and determines the next probability distribution. The abstraction of finding the next probability distribution as a black-box algorithm allows the search framework to work with all types of search problems.
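
The iterative loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formal definition: the function names `run_search` and `no_repeat_update`, the uniform starting distribution, and the choice to count a run as successful if any queried element lies in the target set are all assumptions made for the example.

```python
import random


def run_search(space, target, info, update, steps, seed=0):
    """Run the black-box search loop for a fixed number of steps.

    space:  list of elements (the finite search space, Omega)
    target: set of elements we hope to find (T)
    info:   function giving the external information resource's evaluation (F)
    update: black-box function (space, history) -> next probability distribution
    """
    rng = random.Random(seed)
    history = []
    # Assumption: start from a uniform distribution over the search space.
    dist = [1.0 / len(space)] * len(space)
    for _ in range(steps):
        x = rng.choices(space, weights=dist, k=1)[0]  # sample an element
        history.append((x, info(x)))                  # evaluate and record it
        dist = update(space, history)                 # next distribution
    found = any(x in target for x, _ in history)
    return found, history


def no_repeat_update(space, history):
    """Hypothetical update rule: spread probability over unqueried elements."""
    seen = {x for x, _ in history}
    remaining = len(space) - len(seen)
    if remaining == 0:
        return [1.0 / len(space)] * len(space)
    return [0.0 if x in seen else 1.0 / remaining for x in space]
```

For example, `run_search(list(range(10)), {3}, lambda x: -abs(x - 3), no_repeat_update, steps=10)` queries every element once, so the target is necessarily hit. Any other `update` function with the same signature can be swapped in, which is the point of treating the algorithm as a black box.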


Minimax Semiparametric Learning With Approximate Sparsity

#artificialintelligence

There is a close correspondence between the minimax rate and the behavior of remainder terms in an asymptotic expansion of a doubly robust estimator around the average of the efficient influence function. A dominating remainder term is the product of the mean square norms of estimation errors for the regression and Riesz representer. Other remainder terms will be of smaller order than this term. By virtue of the sum of the absolute values of the regression and Riesz representer coefficients being bounded, the estimation errors for both the regression and Riesz representer converge nearly at root-mean-square rate $\{\ln(p)/n\}^{1/4}$, as known for Lasso regression from Chatterjee and Jafarov (2015) and for the Riesz representer by Chernozhukov et al. (2018) and Chernozhukov, Newey, and Singh (2018). The minimax rate for the object of interest is $\ln(p)/n$ when $\max\{\xi_1,\xi_2\}$ 1/2, which is nearly the product of the two rates, i.e. the size of the dominating remainder.


Scalable NAS with Factorizable Architectural Parameters

arXiv.org Machine Learning

Neural architecture search (NAS) is an emerging topic in machine learning and computer vision. The fundamental idea of NAS is to use an automatic mechanism to replace manual design when exploring powerful network architectures. One of the key factors of NAS is scaling up the search space, e.g., increasing the number of operators, so that more possibilities are covered, but existing search algorithms often get lost in a large number of operators. This paper presents a scalable NAS algorithm by designing a factorizable set of architectural parameters, so that the size of the search space goes up quadratically while the burden of optimization increases linearly. As a practical example, we add a set of activation functions to the original set containing convolution, pooling and skip-connect, etc. With a marginal increase in search costs and no extra costs in retraining, we are able to find interesting architectures that were not explored before, and achieve state-of-the-art performance on CIFAR10 and ImageNet, two standard image classification benchmarks.
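
The quadratic-versus-linear claim can be illustrated with a small sketch. It shows one plausible reading of "factorizable" parameters, not necessarily the paper's formulation: each combined operator's weight is assumed to be the product of a softmax weight over base operators and a softmax weight over activation functions. The function names and the activation list are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def factorized_weights(alpha, beta):
    """Weight of each (operator, activation) pair as a product of two softmaxes.

    Assumption: the two parameter groups are combined by an outer product.
    """
    pa, pb = softmax(alpha), softmax(beta)
    return [[a * b for b in pb] for a in pa]


ops = ["conv", "pool", "skip"]                   # base operators named in the abstract
acts = ["relu", "sigmoid", "tanh", "identity"]   # hypothetical activation set
alpha = [0.0] * len(ops)                         # one parameter per base operator
beta = [0.0] * len(acts)                         # one parameter per activation
weights = factorized_weights(alpha, beta)

num_parameters = len(alpha) + len(beta)          # grows linearly: 3 + 4
num_combinations = len(ops) * len(acts)          # grows quadratically: 3 * 4
```

Under this assumption, adding one more activation adds a single parameter but one new combination per base operator.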


C. H. Robinson Uses Heuristics to Solve Rich Vehicle Routing Problems

arXiv.org Artificial Intelligence

We consider a wide family of vehicle routing problem variants with many complex and practical constraints, known as rich vehicle routing problems, which are faced on a daily basis by C.H. Robinson (CHR). Since CHR has many customers, each with distinct requirements, various routing problems with different objectives and constraints must be solved. We propose a set partitioning framework with a number of route generation algorithms, which have been shown to be effective in solving a variety of different problems. The proposed algorithms have outperformed the existing technologies at CHR on 10 benchmark instances and have since been embedded into the company's transportation planning and execution technology platform.


NAS evaluation is frustratingly hard

arXiv.org Machine Learning

Neural Architecture Search (NAS) is an exciting new field which promises to be as much of a game-changer as Convolutional Neural Networks were in 2012. Despite many great works leading to substantial improvements on a variety of tasks, comparison between different methods is still very much an open issue. While most algorithms are tested on the same datasets, there is no shared experimental protocol followed by all. As such, and due to the under-use of ablation studies, there is a lack of clarity regarding why certain methods are more effective than others. Our first contribution is a benchmark of $8$ NAS methods on $5$ datasets. To overcome the hurdle of comparing methods with different search spaces, we propose using a method's relative improvement over the randomly sampled average architecture, which effectively removes advantages arising from expertly engineered search spaces or training protocols. Surprisingly, we find that many NAS techniques struggle to significantly beat the average architecture baseline. We perform further experiments with the commonly used DARTS search space in order to understand the contribution of each component in the NAS pipeline. These experiments highlight that: (i) the use of tricks in the evaluation protocol has a predominant impact on the reported performance of architectures; (ii) the cell-based search space has a very narrow accuracy range, such that the seed has a considerable impact on architecture rankings; (iii) the hand-designed macro-structure (cells) is more important than the searched micro-structure (operations); and (iv) the depth-gap is a real phenomenon, evidenced by the change in rankings between $8$-cell and $20$-cell architectures. To conclude, we suggest best practices that we hope will prove useful for the community and help mitigate current NAS pitfalls. The code used is available at https://github.com/antoyang/NAS-Benchmark.
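
The relative-improvement idea can be sketched in a few lines. The exact formula is not stated in the abstract, so the percentage form below (improvement of the method's mean accuracy over the mean accuracy of randomly sampled architectures, divided by the latter) is an assumption, as is the function name `relative_improvement`.

```python
from statistics import mean


def relative_improvement(method_accs, random_accs):
    """Percentage improvement of a NAS method over randomly sampled architectures.

    Assumed form: 100 * (acc_method - acc_random) / acc_random, where both
    accuracies are means over several runs in the same search space.
    """
    acc_method = mean(method_accs)
    acc_random = mean(random_accs)
    return 100.0 * (acc_method - acc_random) / acc_random
```

Because both means come from the same search space and training protocol, a value near zero suggests the search itself adds little beyond what the search space already provides.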


Search for the best path between multiple origins and multiple destinations in complex networks that represent cities

#artificialintelligence

This paper investigates the use of a search strategy for the problem of finding the best path among multiple origins and multiple destinations. In this kind of problem, one must decide, among many combinations, which is the best origin and the best destination, and also the best path between these two regions. One notable difficulty in answering this sort of problem is performing the search in a short time. This monograph is an extension of previous research in which the problem described here was studied only in a bus network in the city of Fortaleza. The extension consists of an exploration of the search strategy in graphs that represent public ways in cities such as Fortaleza, Mumbai and Tokyo.
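
The abstract does not say which search strategy is used. As a point of reference only, the sketch below uses a standard technique, Dijkstra's algorithm started from all origins at once, which returns the cheapest origin-destination pair and the path between them in a single pass. The function name `best_path`, the adjacency-list layout, and the assumption of non-negative edge weights are all choices made for this illustration.

```python
import heapq


def best_path(graph, origins, destinations):
    """Cheapest path from any origin to any destination (multi-source Dijkstra).

    graph: dict mapping node -> list of (neighbor, non-negative weight) pairs.
    Returns (cost, path) or None if no destination is reachable.
    """
    dests = set(destinations)
    dist = {o: 0.0 for o in origins}
    prev = {}                                # predecessor on the best known path
    heap = [(0.0, o) for o in dist]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                         # stale queue entry
        if u in dests:
            path = [u]
            while path[-1] in prev:          # walk back until an origin is reached
                path.append(prev[path[-1]])
            path.reverse()
            return d, path
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None
```

Seeding the queue with every origin avoids running one search per origin-destination combination, which matters when the number of combinations is large.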