Search
A quantitative assessment of the effect of different algorithmic schemes to the task of learning the structure of Bayesian Networks
Beretta, Stefano, Castelli, Mauro, Goncalves, Ivo, Ramazzotti, Daniele
The task of learning a BN can be divided into two subtasks: (1) structural learning, i.e., identification of the topology of the BN, and (2) parametric learning, i.e., estimation of the numerical parameters (conditional probabilities) for a given network topology. In particular, the most challenging task of the two is the one of learning the structure of a BN. Different methods have been proposed to face this problem, and they can be classified into two categories [4, 5]: (1) methods based on detecting conditional independencies, also known as constraint-based methods, and (2) score search methods, also known as score-based approaches. As discussed in [6], the input of the former algorithms is a set of conditional independence relations between subsets of variables, which are used to build a BN that represents a large percentage (and, whenever possible, all) of these relations [7]. However, the number of conditional independence tests that such methods should perform is exponential and, thus, approximation techniques are required.
DeepArchitect: Automatically Designing and Training Deep Architectures
Negrinho, Renato, Gordon, Geoff
In deep learning, performance is strongly affected by the choice of architecture and hyperparameters. While there has been extensive work on automatic hyperparameter optimization for simple spaces, complex spaces such as the space of deep architectures remain largely unexplored. As a result, the choice of architecture is done manually by the human expert through a slow trial and error process guided mainly by intuition. In this paper we describe a framework for automatically designing and training deep models. We propose an extensible and modular language that allows the human expert to compactly represent complex search spaces over architectures and their hyperparameters. The resulting search spaces are tree-structured and therefore easy to traverse. Models can be automatically compiled to computational graphs once values for all hyperparameters have been chosen. We can leverage the structure of the search space to introduce different model search algorithms, such as random search, Monte Carlo tree search (MCTS), and sequential model-based optimization (SMBO). We present experiments comparing the different algorithms on CIFAR-10 and show that MCTS and SMBO outperform random search. In addition, these experiments show that our framework can be used effectively for model discovery, as it is possible to describe expressive search spaces and discover competitive models without much effort from the human expert. Code for our framework and experiments has been made publicly available.
Pruning variable selection ensembles
Zhang, Chunxia, Wu, Yilei, Zhu, Mu
In the context of variable selection, ensemble learning has gained increasing interest due to its great potential to improve selection accuracy and to reduce false discovery rate. A novel ordering-based selective ensemble learning strategy is designed in this paper to obtain smaller but more accurate ensembles. In particular, a greedy sorting strategy is proposed to rearrange the order by which the members are included into the integration process. Through stopping the fusion process early, a smaller subensemble with higher selection accuracy can be obtained. More importantly, the sequential inclusion criterion reveals the fundamental strength-diversity trade-off among ensemble members. By taking stability selection (abbreviated as StabSel) as an example, some experiments are conducted with both simulated and real-world data to examine the performance of the novel algorithm. Experimental results demonstrate that pruned StabSel generally achieves higher selection accuracy and lower false discovery rates than StabSel and several other benchmark methods.
Improving the Efficiency of Dynamic Programming on Tree Decompositions via Machine Learning
Abseher, Michael, Musliu, Nysret, Woltran, Stefan
Dynamic Programming (DP) over tree decompositions is a well-established method to solve problems - that are in general NP-hard - efficiently for instances of small treewidth. Experience shows that (i) heuristically computing a tree decomposition has negligible runtime compared to the DP step; and (ii) DP algorithms exhibit a high variance in runtime when using different tree decompositions; in fact, given an instance of the problem at hand, even decompositions of the same width might yield extremely diverging runtimes. We thus propose here a novel and general method that is based on selection of the best decomposition from an available pool of heuristically generated ones. For this purpose, we require machine learning techniques that provide automated selection based on features of the decomposition rather than on the actual problem instance. Thus, one main contribution of this work is to propose novel features for tree decompositions. Moreover, we report on extensive experiments in different problem domains which show a significant speedup when choosing the tree decomposition according to this concept over simply using an arbitrary one of the same width.
Identifying Consistent Statements about Numerical Data with Dispersion-Corrected Subgroup Discovery
Boley, Mario, Goldsmith, Bryan R., Ghiringhelli, Luca M., Vreeken, Jilles
Existing algorithms for subgroup discovery with numerical targets do not optimize the error or target variable dispersion of the groups they find. This often leads to unreliable or inconsistent statements about the data, rendering practical applications, especially in scientific domains, futile. Therefore, we here extend the optimistic estimator framework for optimal subgroup discovery to a new class of objective functions: we show how tight estimators can be computed efficiently for all functions that are determined by subgroup size (non-decreasing dependence), the subgroup median value, and a dispersion measure around the median (non-increasing dependence). In the important special case when dispersion is measured using the average absolute deviation from the median, this novel approach yields a linear time algorithm. Empirical evaluation on a wide range of datasets shows that, when used within branch-and-bound search, this approach is highly efficient and indeed discovers subgroups with much smaller errors.
A Distributed Approach for Networked Flying Platform Association with Small Cells in 5G+ Networks
Shah, Syed Awais Wahab, Khattab, Tamer, Shakir, Muhammad Zeeshan, Hasna, Mazen Omar
The densification of small-cell base stations in a 5G architecture is a promising approach to enhance the coverage area and facilitate the ever increasing capacity demand of end users. However, the bottleneck is an intelligent management of a backhaul/fronthaul network for these small-cell base stations. This involves efficient association and placement of the backhaul hubs that connects these small-cells with the core network. Terrestrial hubs suffer from an inefficient non line of sight link limitations and unavailability of a proper infrastructure in an urban area. Seeing the popularity of flying platforms, we employ here an idea of using networked flying platform (NFP) such as unmanned aerial vehicles (UAVs), drones, unmanned balloons flying at different altitudes, as aerial backhaul hubs. The association problem of these NFP-hubs and small-cell base stations is formulated considering backhaul link and NFP related limitations such as maximum number of supported links and bandwidth. Then, this paper presents an efficient and distributed solution of the designed problem, which performs a greedy search in order to maximize the sum rate of the overall network. A favorable performance is observed via a numerical comparison of our proposed method with optimal exhaustive search algorithm in terms of sum rate and run-time speed.
The Famine of Forte: Few Search Problems Greatly Favor Your Algorithm
Casting machine learning as a type of search, we demonstrate that the proportion of problems that are favorable for a fixed algorithm is strictly bounded, such that no single algorithm can perform well over a large fraction of them. Our results explain why we must either continue to develop new learning methods year after year or move towards highly parameterized models that are both flexible and sensitive to their hyperparameters. We further give an upper bound on the expected performance for a search algorithm as a function of the mutual information between the target and the information resource (e.g., training dataset), proving the importance of certain types of dependence for machine learning. Lastly, we show that the expected per-query probability of success for an algorithm is mathematically equivalent to a single-query probability of success under a distribution (called a search strategy), and prove that the proportion of favorable strategies is also strictly bounded. Thus, whether one holds fixed the search algorithm and considers all possible problems or one fixes the search problem and looks at all possible search strategies, favorable matches are exceedingly rare. The forte (strength) of any algorithm is quantifiably restricted.
China Pushes Breadth-First Search Across Ten Million Cores
There is increasing interplay between the worlds of machine learning and high performance computing (HPC). This began with a shared hardware and software story since many supercomputing tricks of the trade play well into deep learning, but as we look to next generation machines, the bond keeps tightening. Many supercomputing sites are figuring out how to work deep learning into their existing workflows, either as a pre- or post-processing step, while some research areas might do away with traditional supercomputing simulations altogether eventually. While these massive machines were designed with simulations in mind, the strongest supers have architectures that parallel the unique requirements of training and inference workloads. One such system in the U.S. is the future Summit supercomputer coming to Oak Ridge National Lab later this year, but many of the other architectures that are especially sporting for machine learning are in China and Japan--and feature non-standard processing elements.
Check out the world's biggest freestanding Rubik's cube
Steve Miller Band's The Joker was on the FM dial and Mohammed Ali had trounced George Forman to regain his heavyweight title in the much hyped (if questionably named) Rumble in the Jungle. Inflation had reached double digit highs, gasoline was in short supply, President Richard Nixon stepped down in the wake of the Watergate Scandal, and a 29-year-old Hungarian architect by the name of Erno Rubik had finally figured out how he could take a block made of smaller blocks and get the smaller cubes to move without causing the whole structure to fall apart. And thus, the Rubik's cube, which would eventually go on to delight and torment millions, was born. Ostensible a toy, Rubik's cube has since been the subject of everything from high school math class experiments to serious research in computer science. This week, students at the University of Michigan did Rubik one better, building the world's largest freestanding cube and overcoming a design challenge not too dissimilar from Rubik's original quandry--how to get the cubes to spin.