 hill climbing


Learning complex dependency structure of gene regulatory networks from high dimensional micro-array data with Gaussian Bayesian networks

arXiv.org Artificial Intelligence

Gene expression datasets consist of thousands of genes with relatively small sample sizes (i.e. are large-$p$-small-$n$). Moreover, dependencies of various orders co-exist in the datasets. In the Undirected probabilistic Graphical Model (UGM) framework the Glasso algorithm has been proposed to deal with high dimensional micro-array datasets by forcing sparsity. Also, modifications of the default Glasso algorithm have been developed to overcome the problem of complex interaction structure. In this work we advocate the use of a simple score-based Hill Climbing algorithm (HC) that learns Gaussian Bayesian Networks (BNs) leaning on Directed Acyclic Graphs (DAGs). We compare HC with Glasso and its modifications in the UGM framework on their capability to reconstruct GRNs from micro-array data belonging to the Escherichia coli genome. We benefit from the analytical properties of the Joint Probability Density (JPD) function on which both directed and undirected PGMs build to convert DAGs to UGMs. We conclude that dependencies in complex data are learned best by the HC algorithm, presenting them most accurately and efficiently, simultaneously modelling strong local and weaker but significant global connections coexisting in the gene expression dataset. The HC algorithm adapts intrinsically to the complex dependency structure of the dataset, without forcing a specific structure in advance. On the contrary, Glasso and its modifications model unnecessary dependencies at the expense of the probabilistic information in the network and of a structural bias in the JPD function that can only be relieved by including many parameters.
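One standard way to see how a Gaussian DAG maps onto an undirected model is sketched below. It assumes a zero-mean linear Gaussian BN; the abstract does not spell out the authors' exact derivation, and the symbols $B$, $D$, $\Sigma$ and $\Theta$ are introduced here only for illustration. Write each node as $x_i = \sum_{j \in \mathrm{pa}(i)} \beta_{ij} x_j + \varepsilon_i$ with independent noise $\varepsilon_i \sim N(0, \sigma_i^2)$, collect the coefficients in the matrix $B = (\beta_{ij})$, and set $D = \mathrm{diag}(\sigma_1^2, \dots, \sigma_p^2)$. Then

$$
(I - B)\,x = \varepsilon
\quad\Longrightarrow\quad
\Sigma = (I - B)^{-1} D\, (I - B)^{-\top},
\qquad
\Theta = \Sigma^{-1} = (I - B)^{\top} D^{-1} (I - B).
$$

The nonzero off-diagonal entries of the precision matrix $\Theta$ are the edges of the corresponding undirected Gaussian model. Generically they coincide with the moralised DAG: every parent-child pair is linked, and so are parents that share a child.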


Hill Climbing and Simulated Annealing AI Algorithms

#artificialintelligence

Search Algorithms and Optimization techniques are the engines of most Artificial Intelligence techniques and Data Science. There is no doubt that Hill Climbing and Simulated Annealing are the most well-regarded and widely used AI search techniques. A lot of scientists and practitioners use search and optimization algorithms without understanding their internal structure. However, understanding the internal structure and mechanism of such AI problem-solving techniques will allow them to solve problems more efficiently. This also allows them to tune, tweak, and even design new algorithms for different projects.


Diversified Late Acceptance Search

arXiv.org Artificial Intelligence

The well-known Late Acceptance Hill Climbing (LAHC) search aims to overcome the main downside of traditional Hill Climbing (HC) search, which is often quickly trapped in a local optimum due to strictly accepting only non-worsening moves within each iteration. In contrast, LAHC also accepts worsening moves, by keeping a circular array of fitness values of previously visited solutions and comparing the fitness values of candidate solutions against the least recent element in the array. While the straightforward strategy followed by LAHC has proven effective, there are nevertheless situations where LAHC can unfortunately behave in a similar manner to HC, even when using a large fitness array. This happens, for example, when the same fitness value is stored many times in the array, particularly when a new local optimum is found. To address this shortcoming, we propose to improve both the diversity of the accepted solutions and the diversity of values in the array through new acceptance and replacement strategies. The proposed Diversified Late Acceptance Search approach is shown to outperform the current state-of-the-art LAHC method on benchmark sets of Travelling Salesman Problem and Quadratic Assignment Problem instances.
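The late-acceptance rule described above can be sketched roughly as follows. This is a minimal illustration of plain LAHC, not the Diversified Late Acceptance Search proposed in the paper. It assumes a minimisation setting (cost instead of fitness), and the names `lahc_minimize`, `toy_cost` and `toy_neighbor`, as well as the toy problem itself, are hypothetical. Published LAHC variants also differ in when the array slot is overwritten; this sketch overwrites it on every iteration.

```python
import random


def lahc_minimize(cost, neighbor, start, history_length=5, max_iters=20000, seed=0):
    """Late Acceptance Hill Climbing sketch for a minimisation problem."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    # Circular array holding the cost of the current solution at past iterations.
    history = [current_cost] * history_length
    for iteration in range(max_iters):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        slot = iteration % history_length  # index of the least recent entry
        # Accept non-worsening moves (as plain HC does) and also moves that
        # beat the cost recorded history_length iterations ago.
        if candidate_cost <= current_cost or candidate_cost < history[slot]:
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        # Assumption: the slot is overwritten unconditionally.
        history[slot] = current_cost
    return best, best_cost


def toy_cost(x):
    """Hypothetical objective: squared distance from 7 on the integers."""
    return (x - 7) ** 2


def toy_neighbor(x, rng):
    """Hypothetical move: step one unit left or right."""
    return x + rng.choice((-1, 1))


best, best_cost = lahc_minimize(toy_cost, toy_neighbor, start=20)
```

With `history_length=1` the array holds only the current cost, so the rule reduces to accepting non-worsening moves, which is the HC behaviour the abstract contrasts against.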


On learning the structure of Bayesian Networks and submodular function maximization

arXiv.org Machine Learning

Learning the structure of dependencies among multiple random variables is a problem of considerable theoretical and practical interest. In practice, score optimisation with multiple restarts provides a practical and surprisingly successful solution, yet the conditions under which this may be a well founded strategy are poorly understood. In this paper, we prove that the problem of identifying the structure of a Bayesian Network via regularised score optimisation can be recast, in expectation, as a submodular optimisation problem, thus guaranteeing optimality with high probability. This result both explains the practical success of optimisation heuristics, and suggests a way to improve on such algorithms by artificially simulating multiple data sets via a bootstrap procedure. We show on several synthetic data sets that the resulting algorithm yields better recovery performance than the state of the art, and illustrate in a real cancer genomic study how such an approach can lead to valuable practical insights.


A quantitative assessment of the effect of different algorithmic schemes to the task of learning the structure of Bayesian Networks

arXiv.org Machine Learning

The task of learning a BN can be divided into two subtasks: (1) structural learning, i.e., identification of the topology of the BN, and (2) parametric learning, i.e., estimation of the numerical parameters (conditional probabilities) for a given network topology. In particular, the most challenging task of the two is the one of learning the structure of a BN. Different methods have been proposed to face this problem, and they can be classified into two categories [4, 5]: (1) methods based on detecting conditional independencies, also known as constraint-based methods, and (2) score search methods, also known as score-based approaches. As discussed in [6], the input of the former algorithms is a set of conditional independence relations between subsets of variables, which are used to build a BN that represents a large percentage (and, whenever possible, all) of these relations [7]. However, the number of conditional independence tests that such methods should perform is exponential and, thus, approximation techniques are required.


A new approach in machine learning

arXiv.org Machine Learning

In this technical report we present a novel approach to machine learning. Once the new framework is presented, we will provide a simple and yet very powerful learning algorithm which will be benchmarked on various datasets. The framework we propose is based on Boolean circuits; more specifically, the classifiers produced by our algorithm have that form. Using bits and Boolean gates instead of real numbers and multiplication enables the learning algorithm and classifier to use very efficient Boolean vector operations. This enables both the learning algorithm and classifier to be extremely efficient. The classifier we obtain with our framework compares very favorably with those produced by conventional techniques, both in terms of efficiency and accuracy.


Steepest Ascent Hill Climbing For A Mathematical Problem

arXiv.org Artificial Intelligence

The paper proposes an artificial intelligence technique called hill climbing to find numerical solutions of Diophantine equations. Such equations are important as they have many applications in fields like public key cryptography, integer factorization, algebraic curves, projective curves and data dependency in supercomputers. Importantly, it has been proved that there is no general method to find solutions of such equations. This paper is an attempt to find numerical solutions of Diophantine equations using the steepest ascent version of Hill Climbing. The method, which uses a tree representation to depict possible solutions of Diophantine equations, adopts a novel methodology to generate successors. The heuristic function used helps to cast the search for a solution as a minimization process. The work illustrates the effectiveness of the proposed methodology using a class of Diophantine equations given by $a_1 x_1^{p_1} + a_2 x_2^{p_2} + \dots + a_n x_n^{p_n} = N$, where the $a_i$ and $N$ are integers. The experimental results validate that the proposed procedure is successful in finding solutions of Diophantine equations with sufficiently large powers and a large number of variables.
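A rough sketch of the idea, under stated assumptions: the heuristic is taken to be the absolute residual $|a_1 x_1^{p_1} + \dots + a_n x_n^{p_n} - N|$, and successors are generated by changing a single variable by $\pm 1$. The paper's own tree representation and successor scheme are not given in the abstract, so both choices, and the names `residual` and `steepest_hill_climb`, are hypothetical.

```python
def residual(coeffs, powers, target, xs):
    """Heuristic to minimise: |a1*x1^p1 + ... + an*xn^pn - N|."""
    return abs(sum(a * x ** p for a, x, p in zip(coeffs, xs, powers)) - target)


def steepest_hill_climb(coeffs, powers, target, start, max_steps=10000):
    """Steepest-ascent hill climbing, written as descent on the residual."""
    current = tuple(start)
    current_err = residual(coeffs, powers, target, current)
    for _ in range(max_steps):
        if current_err == 0:
            break  # exact integer solution found
        # Assumed successor scheme: change one variable by +1 or -1.
        neighbors = [
            current[:i] + (current[i] + delta,) + current[i + 1:]
            for i in range(len(current))
            for delta in (1, -1)
        ]
        # Steepest step: evaluate every successor and keep the best one.
        best = min(neighbors, key=lambda xs: residual(coeffs, powers, target, xs))
        best_err = residual(coeffs, powers, target, best)
        if best_err >= current_err:
            break  # local optimum: no successor improves the residual
        current, current_err = best, best_err
    return current, current_err


# Toy instance: x1^2 + x2^2 = 25, starting from the origin.
solution, error = steepest_hill_climb(
    coeffs=(1, 1), powers=(2, 2), target=25, start=(0, 0)
)
```

On this toy instance the climb walks along the first axis and stops at `(5, 0)` with residual 0. In general, plain hill climbing may halt at a local optimum with a nonzero residual, which the returned `error` makes visible.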


Heuristic Search and Information Visualization Methods for School Redistricting

AI Magazine

We describe an application of AI search and information visualization techniques to the problem of school redistricting, in which students are assigned to home schools within a county or school district. This is a multicriteria optimization problem in which competing objectives, such as school capacity, busing costs, and socioeconomic distribution, must be considered. Because of the complexity of the decision-making problem, tools are needed to help end users generate, evaluate, and compare alternative school assignment plans. A key goal of our research is to aid users in finding multiple qualitatively different redistricting plans that represent different trade-offs in the decision space. We present heuristic search methods that can be used to find a set of qualitatively different plans, and give empirical results of these search methods on population data from the school district of Howard County, Maryland. We show the resulting plans using novel visualization methods that we have developed for summarizing and comparing alternative plans.