Directed Networks
Constrained Mixture Models for Asset Returns Modelling
The estimation of asset return distributions is crucial for determining optimal trading strategies. One convenient estimation approach selects a distribution model and estimates its parameters. The advantage of this approach is the ease with which probability distributions can be calibrated and applied in post-processing. The disadvantage of assuming a particular parametric distribution is that inferences and decisions depend critically on the choice of distribution. For example, asset returns frequently feature large "outlying" values, making distributions with light tails inapplicable. Semi-parametric methods attempt to capture the advantages but not the disadvantages of a parametric specification of a returns distribution by using a more flexible functional form. Most prominent among the semi-parametric distributions are mixtures of distributions. They provide a flexible specification and, under certain conditions, can approximate distributions of any form.
GRASP and path-relinking for Coalition Structure Generation
Di Mauro, Nicola, Basile, Teresa M. A., Ferilli, Stefano, Esposito, Floriana
In Artificial Intelligence with Coalition Structure Generation (CSG) one refers to those cooperative complex problems that require to find an optimal partition, maximising a social welfare, of a set of entities involved in a system into exhaustive and disjoint coalitions. The solution of the CSG problem finds applications in many fields such as Machine Learning (covering machines, clustering), Data Mining (decision tree, discretization), Graph Theory, Natural Language Processing (aggregation), Semantic Web (service composition), and Bioinformatics. The problem of finding the optimal coalition structure is NP-complete. In this paper we present a greedy adaptive search procedure (GRASP) with path-relinking to efficiently search the space of coalition structures. Experiments and comparisons to other algorithms prove the validity of the proposed method in solving this hard combinatorial problem.
Regularization Strategies and Empirical Bayesian Learning for MKL
Multiple kernel learning (MKL), structured sparsity, and multi-task learning have recently received considerable attention. In this paper, we show how different MKL algorithms can be understood as applications of either regularization on the kernel weights or block-norm-based regularization, which is more common in structured sparsity and multi-task learning. We show that these two regularization strategies can be systematically mapped to each other through a concave conjugate operation. When the kernel-weight-based regularizer is separable into components, we can naturally consider a generative probabilistic model behind MKL. Based on this model, we propose learning algorithms for the kernel weights through the maximization of marginal likelihood. We show through numerical experiments that $\ell_2$-norm MKL and Elastic-net MKL achieve comparable accuracy to uniform kernel combination. Although uniform kernel combination might be preferable from its simplicity, $\ell_2$-norm MKL and Elastic-net MKL can learn the usefulness of the information sources represented as kernels. In particular, Elastic-net MKL achieves sparsity in the kernel weights.
Variational approximation for heteroscedastic linear models and matching pursuit algorithms
Nott, David J., Tran, Minh-Ngoc, Leng, Chenlei
Modern statistical applications involving large data sets have focused attention on statistical methodologies which are both efficient computationally and able to deal with the screening of large numbers of different candidate models. Here we consider computationally efficient variational Bayes approaches to inference in high-dimensional heteroscedastic linear regression, where both the mean and variance are described in terms of linear functions of the predictors and where the number of predictors can be larger than the sample size. We derive a closed form variational lower bound on the log marginal likelihood useful for model selection, and propose a novel fast greedy search algorithm on the model space which makes use of one step optimization updates to the variational lower bound in the current model for screening large numbers of candidate predictor variables for inclusion/exclusion in a computationally thrifty way. We show that the model search strategy we suggest is related to widely used orthogonal matching pursuit algorithms for model search but yields a framework for potentially extending these algorithms to more complex models. The methodology is applied in simulations and in two real examples involving prediction for food constituents using NIR technology and prediction of disease progression in diabetes.
Bayesian Nonparametric Covariance Regression
Although there is a rich literature on methods for allowing the variance in a univariate regression model to vary with predictors, time and other factors, relatively little has been done in the multivariate case. Our focus is on developing a class of nonparametric covariance regression models, which allow an unknown p x p covariance matrix to change flexibly with predictors. The proposed modeling framework induces a prior on a collection of covariance matrices indexed by predictors through priors for predictor-dependent loadings matrices in a factor model. In particular, the predictor-dependent loadings are characterized as a sparse combination of a collection of unknown dictionary functions (e.g, Gaussian process random functions). The induced covariance is then a regularized quadratic function of these dictionary elements. Our proposed framework leads to a highly-flexible, but computationally tractable formulation with simple conjugate posterior updates that can readily handle missing data. Theoretical properties are discussed and the methods are illustrated through simulations studies and an application to the Google Flu Trends data.
A Monte-Carlo AIXI Approximation
Veness, J., Ng, K.S., Hutter, M., Uther, W., Silver, D.
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.
Inference of global clusters from locally distributed data
We consider the problem of analyzing the heterogeneity of clustering distributions for multiple groups of observed data, each of which is indexed by a covariate value, and inferring global clusters arising from observations aggregated over the covariate domain. We propose a novel Bayesian nonparametric method reposing on the formalism of spatial modeling and a nested hierarchy of Dirichlet processes. We provide an analysis of the model properties, relating and contrasting the notions of local and global clusters. We also provide an efficient inference algorithm, and demonstrate the utility of our method in several data examples, including the problem of object tracking and a global clustering analysis of functional data where the functional identity information is not available.
Minimum mean square distance estimation of a subspace
Besson, Olivier, Dobigeon, Nicolas, Tourneret, Jean-Yves
We consider the problem of subspace estimation in a Bayesian setting. Since we are operating in the Grassmann manifold, the usual approach which consists of minimizing the mean square error (MSE) between the true subspace $U$ and its estimate $\hat{U}$ may not be adequate as the MSE is not the natural metric in the Grassmann manifold. As an alternative, we propose to carry out subspace estimation by minimizing the mean square distance (MSD) between $U$ and its estimate, where the considered distance is a natural metric in the Grassmann manifold, viz. the distance between the projection matrices. We show that the resulting estimator is no longer the posterior mean of $U$ but entails computing the principal eigenvectors of the posterior mean of $U U^{T}$. Derivation of the MMSD estimator is carried out in a few illustrative examples including a linear Gaussian model for the data and a Bingham or von Mises Fisher prior distribution for $U$. In all scenarios, posterior distributions are derived and the MMSD estimator is obtained either analytically or implemented via a Markov chain Monte Carlo simulation method. The method is shown to provide accurate estimates even when the number of samples is lower than the dimension of $U$. An application to hyperspectral imagery is finally investigated.
Global seismic monitoring as probabilistic inference
Arora, Nimar, Russell, Stuart J., Kidwell, Paul, Sudderth, Erik B.
The International Monitoring System (IMS) is a global network of sensors whose purpose is to identify potential violations of the Comprehensive Nuclear-Test-Ban Treaty (CTBT), primarily through detection and localization of seismic events. We report on the first stage of a project to improve on the current automated software system with a Bayesian inference system that computes the most likely global event history given the record of local sensor data. The new system, VISA (Vertically Integrated Seismological Analysis), is based on empirically calibrated, generative models of event occurrence, signal propagation, and signal detection. VISA exhibits significantly improved precision and recall compared to the current operational system and is able to detect events that are missed even by the human analysts who post-process the IMS output.
Construction of Dependent Dirichlet Processes based on Poisson Processes
Lin, Dahua, Grimson, Eric, Fisher, John W.
We present a novel method for constructing dependent Dirichlet processes. The approach exploits the intrinsic relationship between Dirichlet and Poisson processes in order to create a Markov chain of Dirichlet processes suitable for use as a prior over evolving mixture models. The method allows for the creation, removal, and location variation of component models over time while maintaining the property that the random measures are marginally DP distributed. Additionally, we derive a Gibbs sampling algorithm for model inference and test it on both synthetic and real data. Empirical results demonstrate that the approach is effective in estimating dynamically varying mixture models.