We present an algorithm for creating a neural network that produces accurate probability estimates as outputs. The network implements a Gibbs probability distribution model of the training database. This model is created by a new transformation relating the joint probabilities of attributes in the database to the weights (Gibbs potentials) of the distributed network model. The theory of this transformation is presented together with experimental results. One advantage of this approach is that the network weights are prescribed without iterative gradient descent. Used as a classifier, the network tied or outperformed published results on a variety of databases.
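For reference, the general Gibbs form the abstract alludes to (this is the standard textbook expression, not the paper's specific transformation) writes the joint distribution over an attribute vector $x$ as
$$P(x) \;=\; \frac{1}{Z}\,\exp\Big(\sum_{c} \phi_c(x_c)\Big), \qquad Z \;=\; \sum_{x} \exp\Big(\sum_{c} \phi_c(x_c)\Big),$$
where the potentials $\phi_c$ play the role of the network weights; the paper's contribution is a prescription of these potentials directly from the joint probabilities of attributes in the database, which is why no gradient descent is required.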
We define and study statistical models in exponential family form whose sufficient statistics are the degree distributions and bi-degree distributions of undirected labelled simple graphs. Graphs constrained by their joint degree distributions are called $dK$-graphs in the computer science literature, and this paper attempts to provide the first statistically grounded analysis of this type of model. In addition to formalizing these models, we provide some preliminary results on parameter estimation and the asymptotic behaviour of the model for the degree distribution, and discuss parameter estimation for the model for the bi-degree distribution.
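As a point of reference (this is the generic exponential-family form with the sufficient statistic chosen to match the abstract, not an equation quoted from the paper), the degree-distribution model can be written as
$$P_\theta(G) \;=\; \exp\big(\theta^\top t(G) - \psi(\theta)\big),$$
where $t(G)$ counts the number of nodes of each degree (or, for the bi-degree model, the number of edges whose endpoints have each pair of degrees) and $\psi(\theta)$ is the log normalizing constant over all undirected labelled simple graphs on the given vertex set.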
Many sequential prediction tasks involve locating instances of patterns in sequences. Generative probabilistic language models, such as hidden Markov models (HMMs), have been successfully applied to many of these tasks. A limitation of these models, however, is that they cannot naturally handle cases in which pattern instances overlap in arbitrary ways. We present an alternative approach, based on conditional Markov networks, that can naturally represent arbitrarily overlapping elements. We show how to efficiently train and perform inference with these models. Experimental results from a genomics domain show that our models are more accurate at locating instances of overlapping patterns than are baseline models based on HMMs.
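For context, a conditional Markov network (conditional random field) models the label sequence $y$ given the observed sequence $x$ as
$$P(y \mid x) \;=\; \frac{1}{Z(x)} \exp\Big(\sum_{c} \sum_{k} \lambda_k f_k(y_c, x)\Big),$$
the standard conditional form; because the cliques $c$ need not form a single chain, overlapping pattern instances can be represented jointly in a way an HMM's single left-to-right state path cannot. The specific features $f_k$ used in the genomics experiments are the paper's own and are not reproduced here.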
We describe computationally efficient methods for learning mixtures in which each component is a directed acyclic graphical model (mixtures of DAGs or MDAGs). We argue that simple search-and-score algorithms are infeasible for a variety of problems, and introduce a feasible approach in which parameter and structure search is interleaved and expected data is treated as real data. Our approach can be viewed as a combination of (1) the Cheeseman--Stutz asymptotic approximation for model posterior probability and (2) the Expectation--Maximization algorithm. We evaluate our procedure for selecting among MDAGs on synthetic and real examples.
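For orientation (these are generic forms, not equations quoted from the paper): an MDAG models each observation $x$ as
$$p(x \mid \theta) \;=\; \sum_{c=1}^{C} \pi_c \, p(x \mid \theta_c, S_c),$$
where $S_c$ is the DAG structure of component $c$, and the Cheeseman--Stutz approximation scores a candidate model $m$ by
$$\log p(D \mid m) \;\approx\; \log p(D' \mid m) + \log p(D \mid \hat{\theta}, m) - \log p(D' \mid \hat{\theta}, m),$$
with $D'$ a completed data set whose sufficient statistics equal the expected statistics at the EM estimate $\hat{\theta}$; treating this expected data as real data is what makes interleaved parameter and structure search tractable.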
The compact Genetic Algorithm (cGA) is an Estimation of Distribution Algorithm that generates the offspring population according to an estimated probabilistic model of the parent population instead of using traditional recombination and mutation operators. The cGA only needs a small amount of memory; therefore, it may be quite useful in memory-constrained applications. This paper introduces a theoretical framework for studying the cGA from the convergence point of view, in which we model the cGA by a Markov process and approximate its behavior using an Ordinary Differential Equation (ODE). Then, we prove that the corresponding ODE converges to local optima and stays there. Consequently, we conclude that the cGA will converge to the local optima of the function to be optimized.
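The update that the abstract models as a Markov process is the standard cGA probability-vector update. A minimal Python sketch, assuming the usual Harik-Lobo-Goldberg rule with step size $1/N$ (function names, defaults, and the stopping test here are illustrative, not taken from the paper):

```python
import random

def compact_ga(fitness, n_bits, pop_size=100, max_iters=10_000):
    """Minimal compact GA sketch: `fitness` maps a list of 0/1 bits to a
    number to be maximized; `pop_size` is the simulated population size N,
    so each win/loss shifts a probability by 1/N."""
    p = [0.5] * n_bits                          # probability vector modeling the population

    def sample():
        return [1 if random.random() < pi else 0 for pi in p]

    for _ in range(max_iters):
        a, b = sample(), sample()               # two competing individuals
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(n_bits):
            if winner[i] != loser[i]:           # update only where they disagree
                step = 1.0 / pop_size
                p[i] += step if winner[i] == 1 else -step
                p[i] = min(1.0, max(0.0, p[i]))
        if all(pi in (0.0, 1.0) for pi in p):   # vector has fixed every bit
            break
    return p

# Example: maximize the number of ones (OneMax)
# p = compact_ga(sum, 20)
```

In the paper's framework it is the trajectory of this probability vector $p$ that is approximated by an ODE, and the convergence result concerns the fixed points of that ODE.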