If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
In X-ray binary star systems consisting of a compact object that accretes material from an orbiting secondary star, there is no straightforward means to decide if the compact object is a black hole or a neutron star. To assist this classification, we develop a Bayesian statistical model that makes use of the fact that X-ray binary systems appear to cluster based on their compact object type when viewed from a 3-dimensional coordinate system derived from X-ray spectral data. The first coordinate of this data is the ratio of counts in mid to low energy band (color 1), the second coordinate is the ratio of counts in high to low energy band (color 2), and the third coordinate is the sum of counts in all three bands. We use this model to estimate the probabilities that an X-ray binary system contains a black hole, non-pulsing neutron star, or pulsing neutron star. In particular, we utilize a latent variable model in which the latent variables follow a Gaussian process prior distribution, and hence we are able to induce the spatial correlation we believe exists between systems of the same type. The utility of this approach is evidenced by the accurate prediction of system types using Rossi X-ray Timing Explorer All Sky Monitor data, but it is not flawless. In particular, non-pulsing neutron systems containing "bursters" that are close to the boundary demarcating systems containing black holes tend to be classified as black hole systems. As a byproduct of our analyses, we provide the astronomer with public R code that can be used to predict the compact object type of X-ray binaries given training data.
The spectral energy distribution (SED) is a relatively easy way for astronomers to distinguish between different astronomical objects such as galaxies, black holes, and stellar objects. By comparing the observations from a source at different frequencies with template models, astronomers are able to infer the type of this observed object. In this paper, we take a Bayesian model averaging perspective to learn astronomical objects, employing a Bayesian nonparametric approach to accommodate the deviation from convex combinations of known log-SEDs. To effectively use telescope time for observations, we then study Bayesian nonparametric sequential experimental design without conjugacy, in which we use sequential Monte Carlo as an efficient tool to maximize the volume of information stored in the posterior distribution of the parameters of interest. A new technique for performing inferences in log-Gaussian Cox processes called the Poisson log-normal approximation is also proposed. Simulations show the speed, accuracy, and usefulness of our method. While the strategy we propose in this paper is brand new in the astronomy literature, the inferential techniques developed apply to more general nonparametric sequential experimental design problems.
We propose a novel diverse feature selection method based on determinantal point processes (DPPs). Our model enables one to flexibly define diversity based on the covariance of features (similar to orthogonal matching pursuit) or alternatively based on side information. We introduce our approach in the context of Bayesian sparse regression, employing a DPP as a variational approximation to the true spike and slab posterior distribution. We subsequently show how this variational DPP approximation generalizes and extends mean-field approximation, and can be learned efficiently by exploiting the fast sampling properties of DPPs. Our motivating application comes from bioinformatics, where we aim to identify a diverse set of genes whose expression profiles predict a tumor type where the diversity is defined with respect to a gene-gene interaction network. We also explore an application in spatial statistics. In both cases, we demonstrate that the proposed method yields significantly more diverse feature sets than classic sparse methods, without compromising accuracy.
We develop a machine learning approach to represent and analyze the underlying spatial structure that governs shot selection among professional basketball players in the NBA. Typically, NBA players are discussed and compared in an heuristic, imprecise manner that relies on unmeasured intuitions about player behavior. This makes it difficult to draw comparisons between players and make accurate player specific predictions. Modeling shot attempt data as a point process, we create a low dimensional representation of offensive player types in the NBA. Using non-negative matrix factorization (NMF), an unsupervised dimensionality reduction technique, we show that a low-rank spatial decomposition summarizes the shooting habits of NBA players. The spatial representations discovered by the algorithm correspond to intuitive descriptions of NBA player types, and can be used to model other spatial effects, such as shooting accuracy.
Cherkassky, Michael, Bornn, Luke
In this paper we propose a flexible and efficient framework for handling multi-armed bandits, combining sequential Monte Carlo algorithms with hierarchical Bayesian modeling techniques. The framework naturally encompasses restless bandits, contextual bandits, and other bandit variants under a single inferential model. Despite the model's generality, we propose efficient Monte Carlo algorithms to make inference scalable, based on recent developments in sequential Monte Carlo methods. Through two simulation studies, the framework is shown to outperform other empirical methods, while also naturally scaling to more complex problems for which existing approaches can not cope. Additionally, we successfully apply our framework to online video-based advertising recommendation, and show its increased efficacy as compared to current state of the art bandit algorithms.
In this short note, we show how the parallel adaptive Wang-Landau (PAWL) algorithm of Bornn et al. (2013) can be used to automate and improve simulated tempering algorithms. While Wang-Landau and other stochastic approximation methods have frequently been applied within the simulated tempering framework, this note demonstrates through a simple example the additional improvements brought about by parallelization, adaptive proposals and automated bin splitting.
The Gibbs sampler is one of the most popular algorithms for inference in statistical models. In this paper, we introduce a herding variant of this algorithm, called herded Gibbs, that is entirely deterministic. We prove that herded Gibbs has an $O(1/T)$ convergence rate for models with independent variables and for fully connected probabilistic graphical models. Herded Gibbs is shown to outperform Gibbs in the tasks of image denoising with MRFs and named entity recognition with CRFs. However, the convergence for herded Gibbs for sparsely connected probabilistic graphical models is still an open problem.
Bornn, Luke, Caron, François
In this paper we propose a class of prior distributions on decomposable graphs, allowing for improved modeling flexibility. While existing methods solely penalize the number of edges, the proposed work empowers practitioners to control clustering, level of separation, and other features of the graph. Emphasis is placed on a particular prior distribution which derives its motivation from the class of product partition models; the properties of this prior relative to existing priors is examined through theory and simulation. We then demonstrate the use of graphical models in the field of agriculture, showing how the proposed prior distribution alleviates the inflexibility of previous approaches in properly modeling the interactions between the yield of different crop varieties.
Sparsity-promoting priors have become increasingly popular over recent years due to an increased number of regression and classification applications involving a large number of predictors. In time series applications where observations are collected over time, it is often unrealistic to assume that the underlying sparsity pattern is fixed. We propose here an original class of flexible Bayesian linear models for dynamic sparsity modelling. The proposed class of models expands upon the existing Bayesian literature on sparse regression using generalized multivariate hyperbolic distributions. The properties of the models are explored through both analytic results and simulation studies. We demonstrate the model on a financial application where it is shown that it accurately represents the patterns seen in the analysis of stock and derivative data, and is able to detect major events by filtering an artificial portfolio of assets.
This technical report is the union of two contributions to the discussion of the Read Paper "Riemann manifold Langevin and Hamiltonian Monte Carlo methods" by B. Calderhead and M. Girolami, presented in front of the Royal Statistical Society on October 13th 2010 and to appear in the Journal of the Royal Statistical Society Series B. The first comment establishes a parallel and possible interactions with Adaptive Monte Carlo methods. The second comment exposes a detailed study of Riemannian Manifold Hamiltonian Monte Carlo (RMHMC) for a weakly identifiable model presenting a strong ridge in its geometry.