Plotting

 Country


Learning the Structure of Deep Sparse Graphical Models

arXiv.org Machine Learning

Deep belief networks are a powerful way to model complex probability distributions. However, learning the structure of a belief network, particularly one with hidden units, is difficult. The Indian buffet process has been used as a nonparametric Bayesian prior on the directed structure of a belief network with a single infinitely wide hidden layer. In this paper, we introduce the cascading Indian buffet process (CIBP), which provides a nonparametric prior on the structure of a layered, directed belief network that is unbounded in both depth and width, yet allows tractable inference. We use the CIBP prior with the nonlinear Gaussian belief network so each unit can additionally vary its behavior between discrete and continuous representations. We provide Markov chain Monte Carlo algorithms for inference in these belief networks and explore the structures learned on several image data sets.


A unifying view for performance measures in multi-class prediction

arXiv.org Machine Learning

In the last few years, many different performance measures have been introduced to overcome the weakness of the most natural metric, the Accuracy. Among them, Matthews Correlation Coefficient has recently gained popularity among researchers not only in machine learning but also in several application fields such as bioinformatics. Nonetheless, further novel functions are being proposed in literature. We show that Confusion Entropy, a recently introduced classifier performance measure for multi-class problems, has a strong (monotone) relation with the multi-class generalization of a classical metric, the Matthews Correlation Coefficient. Computational evidence in support of the claim is provided, together with an outline of the theoretical explanation.


PMOG: The projected mixture of Gaussians model with application to blind source separation

arXiv.org Artificial Intelligence

We extend the mixtures of Gaussians (MOG) model to the projected mixture of Gaussians (PMOG) model. In the PMOG model, we assume that q dimensional input data points z_i are projected by a q dimensional vector w into 1-D variables u_i. The projected variables u_i are assumed to follow a 1-D MOG model. In the PMOG model, we maximize the likelihood of observing u_i to find both the model parameters for the 1-D MOG as well as the projection vector w. First, we derive an EM algorithm for estimating the PMOG model. Next, we show how the PMOG model can be applied to the problem of blind source separation (BSS). In contrast to conventional BSS where an objective function based on an approximation to differential entropy is minimized, PMOG based BSS simply minimizes the differential entropy of projected sources by fitting a flexible MOG model in the projected 1-D space while simultaneously optimizing the projection vector w. The advantage of PMOG over conventional BSS algorithms is the more flexible fitting of non-Gaussian source densities without assuming near-Gaussianity (as in conventional BSS) and still retaining computational feasibility.


Mining tree-query associations in graphs

arXiv.org Artificial Intelligence

New applications of data mining, such as in biology, bioinformatics, or sociology, are faced with large datasetsstructured as graphs. We introduce a novel class of tree-shapedpatterns called tree queries, and present algorithms for miningtree queries and tree-query associations in a large data graph. Novel about our class of patterns is that they can containconstants, and can contain existential nodes which are not counted when determining the number of occurrences of the patternin the data graph. Our algorithms have a number of provableoptimality properties, which are based on the theory of conjunctive database queries. We propose a practical, database-oriented implementation in SQL, and show that the approach works in practice through experiments on data about food webs, protein interactions, and citation analysis.


Epistemic irrelevance in credal nets: the case of imprecise Markov trees

arXiv.org Artificial Intelligence

We focus on credal nets, which are graphical models that generalise Bayesian nets to imprecise probability. We replace the notion of strong independence commonly used in credal nets with the weaker notion of epistemic irrelevance, which is arguably more suited for a behavioural theory of probability. Focusing on directed trees, we show how to combine the given local uncertainty models in the nodes of the graph into a global model, and we use this to construct and justify an exact message-passing algorithm that computes updated beliefs for a variable in the tree. The algorithm, which is linear in the number of nodes, is formulated entirely in terms of coherent lower previsions, and is shown to satisfy a number of rationality requirements. We supply examples of the algorithm's operation, and report an application to on-line character recognition that illustrates the advantages of our approach for prediction. We comment on the perspectives, opened by the availability, for the first time, of a truly efficient algorithm based on epistemic irrelevance.


Faithfulness in Chain Graphs: The Gaussian Case

arXiv.org Artificial Intelligence

This paper deals with chain graphs under the classic Lauritzen-Wermuth-Frydenberg interpretation. We prove that the regular Gaussian distributions that factorize with respect to a chain graph $G$ with $d$ parameters have positive Lebesgue measure with respect to $\mathbb{R}^d$, whereas those that factorize with respect to $G$ but are not faithful to it have zero Lebesgue measure with respect to $\mathbb{R}^d$. This means that, in the measure-theoretic sense described, almost all the regular Gaussian distributions that factorize with respect to $G$ are faithful to it.


RDFViewS: A Storage Tuning Wizard for RDF Applications

arXiv.org Artificial Intelligence

In recent years, the significant growth of RDF data used in numerous applications has made its efficient and scalable manipulation an important issue. In this paper, we present RDFViewS, a system capable of choosing the most suitable views to materialize, in order to minimize the query response time for a specific SPARQL query workload, while taking into account the view maintenance cost and storage space constraints. Our system employs practical algorithms and heuristics to navigate through the search space of potential view configurations, and exploits the possibly available semantic information - expressed via an RDF Schema - to ensure the completeness of the query evaluation.


The effect of discrete vs. continuous-valued ratings on reputation and ranking systems

arXiv.org Artificial Intelligence

When users rate objects, a sophisticated algorithm that takes into account ability or reputation may produce a fairer or more accurate aggregation of ratings than the straightforward arithmetic average. Recently a number of authors have proposed different co-determination algorithms where estimates of user and object reputation are refined iteratively together, permitting accurate measures of both to be derived directly from the rating data. However, simulations demonstrating these methods' efficacy assumed a continuum of rating values, consistent with typical physical modelling practice, whereas in most actual rating systems only a limited range of discrete values (such as a 5-star system) is employed. We perform a comparative test of several co-determination algorithms with different scales of discrete ratings and show that this seemingly minor modification in fact has a significant impact on algorithms' performance. Paradoxically, where rating resolution is low, increased noise in users' ratings may even improve the overall performance of the system.


Large Scale Variational Inference and Experimental Design for Sparse Generalized Linear Models

arXiv.org Machine Learning

Many problems of low-level computer vision and image processing, such as denoising, deconvolution, tomographic reconstruction or super-resolution, can be addressed by maximizing the posterior distribution of a sparse linear model (SLM). We show how higher-order Bayesian decision-making problems, such as optimizing image acquisition in magnetic resonance scanners, can be addressed by querying the SLM posterior covariance, unrelated to the density's mode. We propose a scalable algorithmic framework, with which SLM posteriors over full, high-resolution images can be approximated for the first time, solving a variational optimization problem which is convex iff posterior mode finding is convex. These methods successfully drive the optimization of sampling trajectories for real-world magnetic resonance imaging through Bayesian experimental design, which has not been attempted before. Our methodology provides new insight into similarities and differences between sparse reconstruction and approximate Bayesian inference, and has important implications for compressive sensing of real-world images.


Discovering shared and individual latent structure in multiple time series

arXiv.org Artificial Intelligence

This paper proposes a nonparametric Bayesian method for exploratory data analysis and feature construction in continuous time series. Our method focuses on understanding shared features in a set of time series that exhibit significant individual variability. Our method builds on the framework of latent Diricihlet allocation (LDA) and its extension to hierarchical Dirichlet processes, which allows us to characterize each series as switching between latent ``topics'', where each topic is characterized as a distribution over ``words'' that specify the series dynamics. However, unlike standard applications of LDA, we discover the words as we learn the model. We apply this model to the task of tracking the physiological signals of premature infants; our model obtains clinically significant insights as well as useful features for supervised learning tasks.