Using CODEQ to Train Feed-forward Neural Networks
Omran, Mahamed G. H., al-Adwani, Faisal
CODEQ is a new, population-based meta-heuristic algorithm that is a hybrid of concepts from chaotic search, opposition-based learning, differential evolution and quantum mechanics. CODEQ has successfully been used to solve different types of problems (e.g. constrained, integer-programming, engineering) with excellent results. In this paper, CODEQ is used to train feed-forward neural networks. The proposed method is compared with particle swarm optimization and differential evolution algorithms on three data sets with encouraging results.
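The abstract does not spell out the CODEQ update rules, so the following is only a minimal sketch of the general idea of training a feed-forward network with a population-based optimizer, using classic differential evolution (DE/rand/1/bin), one of the baselines named above, rather than CODEQ itself. The network shape, the XOR data set, and all names (`forward`, `mse`, `de_train`, `XOR`) and parameter values are assumptions made for illustration.

```python
import math
import random

# Toy data set (an assumption for illustration): the XOR problem.
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def forward(weights, x, n_hidden=2):
    """One hidden tanh layer and a sigmoid output; weights is a flat list."""
    n_in = len(x)
    idx = 0
    hidden = []
    for _ in range(n_hidden):
        s = weights[idx + n_in]  # bias of this hidden unit
        for j in range(n_in):
            s += weights[idx + j] * x[j]
        hidden.append(math.tanh(s))
        idx += n_in + 1
    out = weights[idx + n_hidden]  # output bias
    for h in range(n_hidden):
        out += weights[idx + h] * hidden[h]
    out = max(-60.0, min(60.0, out))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-out))


def mse(weights, data):
    """Mean squared error of the network over the data set."""
    return sum((forward(weights, x) - y) ** 2 for x, y in data) / len(data)


def de_train(data, dim, pop_size=20, f=0.5, cr=0.9, generations=200, seed=0):
    """DE/rand/1/bin over flat weight vectors; returns (best weights, loss history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    fit = [mse(p, data) for p in pop]
    history = [min(fit)]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = [
                pop[a][j] + f * (pop[b][j] - pop[c][j])
                if (rng.random() < cr or j == j_rand)
                else pop[i][j]
                for j in range(dim)
            ]
            trial_fit = mse(trial, data)
            # Greedy selection: the trial replaces its parent only if not worse.
            if trial_fit <= fit[i]:
                pop[i], fit[i] = trial, trial_fit
        history.append(min(fit))
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], history
```

Calling `de_train(XOR, 9)` trains the 2-2-1 network (9 weights including biases). Because selection is greedy, the best loss in the population can never increase from one generation to the next, which makes the loss history a convenient sanity check.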
Classifying Network Data with Deep Kernel Machines
Inspired by a growing interest in analyzing network data, we study the problem of node classification on graphs, focusing on approaches based on kernel machines. Conventionally, kernel machines are linear classifiers in the implicit feature space. We argue that linear classification in the feature space of kernels commonly used for graphs is often not enough to produce good results. When this is the case, one naturally considers nonlinear classifiers in the feature space. We show that repeating this process produces something we call "deep kernel machines." We provide some examples where deep kernel machines can make a big difference in classification performance, and point out some connections to various recent literature on deep architectures in artificial intelligence and machine learning.
Scalable Bayesian reduced-order models for high-dimensional multiscale dynamical systems
Koutsourelakis, P. S., Bilionis, Elias
While existing mathematical descriptions can accurately account for phenomena at microscopic scales (e.g. molecular dynamics), these are often high-dimensional, stochastic and their applicability over macroscopic time scales of physical interest is computationally infeasible or impractical. In complex systems, with limited physical insight on the coherent behavior of their constituents, the only available information is data obtained from simulations of the trajectories of huge numbers of degrees of freedom over microscopic time scales. This paper discusses a Bayesian approach to deriving probabilistic coarse-grained models that simultaneously address the problems of identifying appropriate reduced coordinates and the effective dynamics in this lower-dimensional representation. At the core of the models proposed lie simple, low-dimensional dynamical systems which serve as the building blocks of the global model. These approximate the latent, generating sources and parameterize the reduced-order dynamics. We discuss parallelizable, online inference and learning algorithms that employ Sequential Monte Carlo samplers and scale linearly with the dimensionality of the observed dynamics. We propose a Bayesian adaptive time-integration scheme that utilizes probabilistic predictive estimates and enables rigorous concurrent simulation over macroscopic time scales. The data-driven perspective advocated assimilates computational and experimental data and thus can materialize data-model fusion. It can deal with applications that lack a mathematical description and where only observational data is available. Furthermore, it makes non-intrusive use of existing computational models.
The ILIUM forward modelling algorithm for multivariate parameter estimation and its application to derive stellar parameters from Gaia spectrophotometry
I introduce an algorithm for estimating parameters from multidimensional data based on forward modelling. In contrast to many machine learning approaches it avoids fitting an inverse model and the problems associated with this. The algorithm makes explicit use of the sensitivities of the data to the parameters, with the goal of better treating parameters which only have a weak impact on the data. The forward modelling approach provides uncertainty (full covariance) estimates in the predicted parameters as well as a goodness-of-fit for observations. I demonstrate the algorithm, ILIUM, with the estimation of stellar astrophysical parameters (APs) from simulations of the low resolution spectrophotometry to be obtained by Gaia. The AP accuracy is competitive with that obtained by a support vector machine. For example, for zero extinction stars covering a wide range of metallicity, surface gravity and temperature, ILIUM can estimate Teff to an accuracy of 0.3% at G=15 and to 4% for (lower signal-to-noise ratio) spectra at G=20. [Fe/H] and logg can be estimated to accuracies of 0.1-0.4dex for stars with G<=18.5. If extinction varies a priori over a wide range (Av=0-10mag), then Teff and Av can be estimated quite accurately (3-4% and 0.1-0.2mag respectively at G=15), but there is a strong and ubiquitous degeneracy in these parameters which limits our ability to estimate either accurately at faint magnitudes. Using the forward model we can map these degeneracies (in advance), and thus provide a complete probability distribution over solutions. (Abridged)
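To illustrate what estimating parameters from a forward model and its sensitivities can look like, here is a minimal sketch using a generic Gauss-Newton iteration. It is not the ILIUM algorithm, whose details are not given in the abstract. The two-parameter toy forward model, the sample points `X`, the noise level `sigma`, and the function names are all assumptions chosen for illustration.

```python
import math

# Hypothetical sample points at which the toy forward model is evaluated.
X = [0.0, 1.0, 2.0]


def forward_model(params):
    """Toy forward model (an assumption): f_i = a * exp(-b * x_i)."""
    a, b = params
    return [a * math.exp(-b * x) for x in X]


def sensitivities(f, params, h=1e-6):
    """Jacobian J[i][k] = d f_i / d params_k, by central differences."""
    cols = []
    for k in range(len(params)):
        up, dn = list(params), list(params)
        up[k] += h
        dn[k] -= h
        cols.append([(u - d) / (2.0 * h) for u, d in zip(f(up), f(dn))])
    return [[cols[k][i] for k in range(len(params))] for i in range(len(cols[0]))]


def normal_matrix(J):
    """Entries of the symmetric 2x2 matrix J^T J and its determinant."""
    s00 = sum(row[0] * row[0] for row in J)
    s01 = sum(row[0] * row[1] for row in J)
    s11 = sum(row[1] * row[1] for row in J)
    return s00, s01, s11, s00 * s11 - s01 * s01


def estimate(data, start, sigma=0.01, iterations=20):
    """Gauss-Newton updates p <- p + (J^T J)^{-1} J^T (data - f(p))."""
    p = list(start)
    for _ in range(iterations):
        J = sensitivities(forward_model, p)
        r = [d - m for d, m in zip(data, forward_model(p))]
        s00, s01, s11, det = normal_matrix(J)
        g0 = sum(row[0] * ri for row, ri in zip(J, r))
        g1 = sum(row[1] * ri for row, ri in zip(J, r))
        p[0] += (s11 * g0 - s01 * g1) / det
        p[1] += (s00 * g1 - s01 * g0) / det
    # Linearized covariance sigma^2 (J^T J)^{-1}, assuming i.i.d. Gaussian noise.
    s00, s01, s11, det = normal_matrix(sensitivities(forward_model, p))
    scale = sigma ** 2 / det
    cov = [[scale * s11, -scale * s01], [-scale * s01, scale * s00]]
    return p, cov
```

With noise-free data generated by `forward_model([2.0, 0.5])` and a start of `[1.5, 0.3]`, the iteration should recover the generating parameters. The off-diagonal covariance term gives a first-order view of how strongly the two parameters trade off against each other, which is the kind of degeneracy the abstract describes for Teff and Av.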
Learning to Explore and Exploit in POMDPs
Cai, Chenghui, Liao, Xuejun, Carin, Lawrence
A fundamental objective in reinforcement learning is the maintenance of a proper balance between exploration and exploitation. This problem becomes more challenging when the agent can only partially observe the states of its environment. In this paper we propose a dual-policy method for jointly learning the agent behavior and the balance between exploration and exploitation in partially observable environments. The method subsumes traditional exploration, in which the agent takes actions to gather information about the environment, and active learning, in which the agent queries an oracle for optimal actions (with an associated cost for employing the oracle). The form of the employed exploration is dictated by the specific problem. Theoretical guarantees are provided concerning the optimality of the balancing of exploration and exploitation. The effectiveness of the method is demonstrated by experimental results on benchmark problems.
Sharing Features among Dynamical Systems with Beta Processes
Fox, Emily, Jordan, Michael I., Sudderth, Erik B., Willsky, Alan S.
We propose a Bayesian nonparametric approach to relating multiple time series via a set of latent, dynamical behaviors. Using a beta process prior, we allow data-driven selection of the size of this set, as well as the pattern with which behaviors are shared among time series. Via the Indian buffet process representation of the beta process predictive distributions, we develop an exact Markov chain Monte Carlo inference method. In particular, our approach uses the sum-product algorithm to efficiently compute Metropolis-Hastings acceptance probabilities, and explores new dynamical behaviors via birth/death proposals. We validate our sampling algorithm using several synthetic datasets, and also demonstrate promising unsupervised segmentation of visual motion capture data.
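The Indian buffet process mentioned above can be sketched as a simple generative procedure: each new time series reuses an existing behavior with probability proportional to how many earlier series already use it, and then adds a Poisson-distributed number of brand-new behaviors. The code below samples only from this prior; it is not the authors' MCMC inference method, and the function names, `alpha`, and the seed handling are assumptions made for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_ibp(n_series, alpha, seed=0):
    """Sample behavior assignments for n_series time series from an IBP prior.

    Returns one sorted list of behavior indices per time series.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of earlier series that use behavior k
    rows = []
    for i in range(1, n_series + 1):
        row = set()
        # Reuse existing behavior k with probability counts[k] / i.
        for k, m in enumerate(counts):
            if rng.random() < m / i:
                row.add(k)
        # Introduce Poisson(alpha / i) behaviors not used by any earlier series.
        n_new = poisson(rng, alpha / i)
        row.update(range(len(counts), len(counts) + n_new))
        counts.extend([0] * n_new)
        for k in row:
            counts[k] += 1
        rows.append(sorted(row))
    return rows
```

For example, `sample_ibp(5, 2.0)` returns five lists of behavior indices. The total number of distinct behaviors is not fixed in advance and tends to grow slowly with the number of series, which is the property that allows a data-driven choice of the size of the behavior set.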
Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations
Zhou, Mingyuan, Chen, Haojun, Ren, Lu, Sapiro, Guillermo, Carin, Lawrence, Paisley, John W.
Non-parametric Bayesian techniques are considered for learning dictionaries for sparse image representations, with applications in denoising, inpainting and compressive sensing (CS). The beta process is employed as a prior for learning the dictionary, and this non-parametric method naturally infers an appropriate dictionary size. The Dirichlet process and a probit stick-breaking process are also considered to exploit structure within an image. The proposed method can learn a sparse dictionary in situ; training images may be exploited if available, but they are not required. Further, the noise variance need not be known, and can be non-stationary. Another virtue of the proposed method is that sequential inference can be readily employed, thereby allowing scaling to large images. Several example results are presented, using both Gibbs and variational Bayesian inference, with comparisons to other state-of-the-art approaches.
Continuously-adaptive discretization for message-passing algorithms
Isard, Michael, MacCormick, John, Achan, Kannan
Continuously-Adaptive Discretization for Message-Passing (CAD-MP) is a new message-passing algorithm employing adaptive discretization. Most previous message-passing algorithms approximated arbitrary continuous probability distributions using either: a family of continuous distributions such as the exponential family; a particle-set of discrete samples; or a fixed, uniform discretization. In contrast, CAD-MP uses a discretization that is (i) non-uniform, and (ii) adaptive. The non-uniformity allows CAD-MP to localize interesting features (such as sharp peaks) in the marginal belief distributions with time complexity that scales logarithmically with precision, as opposed to uniform discretization which scales at best linearly. We give a principled method for altering the non-uniform discretization according to information-based measures. CAD-MP is shown in experiments on simulated data to estimate marginal beliefs much more precisely than competing approaches for the same computational expense.
Counting Solution Clusters in Graph Coloring Problems Using Belief Propagation
Kroc, Lukas, Sabharwal, Ashish, Selman, Bart
We show that an important and computationally challenging solution space feature of the graph coloring problem (COL), namely the number of clusters of solutions, can be accurately estimated by a technique very similar to one for counting the number of solutions. This cluster counting approach can be naturally written in terms of a new factor graph derived from the factor graph representing the COL instance. Using a variant of the Belief Propagation inference framework, we can efficiently approximate cluster counts in random COL problems over a large range of graph densities. We illustrate the algorithm on instances with up to 100,000 vertices. Moreover, we supply a methodology for computing the number of clusters exactly using advanced techniques from the knowledge compilation literature.
Bounds on marginal probability distributions
Mooij, Joris M., Kappen, Hilbert J.
We propose a novel bound on single-variable marginal probability distributions in factor graphs with discrete variables. The bound is obtained by propagating local bounds (convex sets of probability distributions) over a subtree of the factor graph, rooted in the variable of interest. By construction, the method not only bounds the exact marginal probability distribution of a variable, but also its approximate Belief Propagation marginal ("belief"). Thus, apart from providing a practical means to calculate bounds on marginals, our contribution also lies in providing a better understanding of the error made by Belief Propagation. We show that our bound outperforms the state-of-the-art on some inference problems arising in medical diagnosis.