Tracking Creative Musical Structure: The Hunt for the Intrinsically Motivated Generative Agent

AAAI Conferences

Neural networks have been employed to learn, generalize, and generate musical pieces with a constrained notion of creativity. Yet these computational models typically cannot characterize and reproduce the long-term dependencies indicative of musical structure. Hierarchical and deep learning models have been proposed to remedy this deficiency, but their effectiveness remains to be adequately demonstrated. We describe and examine a novel dynamic Bayesian network model with the goal of learning and reproducing longer-term formal musical structures. Incorporating a computational model of intrinsic motivation and novelty, this hierarchical probabilistic model is able to generate pastiches based on exemplars.


Learning Gaussian Graphical Models with Observed or Latent FVSs

arXiv.org Machine Learning

Gaussian Graphical Models (GGMs) or Gauss Markov random fields are widely used in many applications, and the trade-off between the modeling capacity and the efficiency of learning and inference has been an important research problem. In this paper, we study the family of GGMs with small feedback vertex sets (FVSs), where an FVS is a set of nodes whose removal breaks all the cycles. Exact inference, such as computing the marginal distributions and the partition function, has complexity $O(k^{2}n)$ using message-passing algorithms, where $k$ is the size of the FVS and $n$ is the total number of nodes. We propose efficient structure learning algorithms for two cases: 1) All nodes are observed, which is useful in modeling social or flight networks where the FVS nodes often correspond to a small number of high-degree nodes, or hubs, while the rest of the network is modeled by a tree. Regardless of the maximum degree, and without knowing the full graph structure, we can exactly compute the maximum likelihood estimate in $O(kn^{2}+n^{2}\log n)$ time if the FVS is known, or in polynomial time if the FVS is unknown but has bounded size. 2) The FVS nodes are latent variables, where structure learning is equivalent to decomposing an inverse covariance matrix (exactly or approximately) into the sum of a tree-structured matrix and a low-rank matrix. By incorporating efficient inference into the learning steps, we can obtain a learning algorithm using alternating low-rank correction with complexity $O(kn^{2}+n^{2}\log n)$ per iteration. We also perform experiments using both synthetic data and real flight-delay data to demonstrate the modeling capacity with FVSs of various sizes.
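The definition of an FVS used in this abstract can be checked mechanically. The following is a minimal sketch, not the paper's learning algorithm: it tests whether removing a candidate node set leaves a cycle-free graph, using a union-find structure. The function name, the edge-list representation, and the assumption of a simple undirected graph (no duplicate edges) are illustrative choices.

```python
def is_feedback_vertex_set(n, edges, fvs):
    """Return True if deleting the nodes in `fvs` leaves no cycle.

    Assumes a simple undirected graph on nodes 0..n-1 given as an edge list
    with no duplicate edges (a hypothetical representation for illustration).
    """
    removed = set(fvs)
    parent = list(range(n))

    def find(x):
        # Find the component representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if u in removed or v in removed:
            continue  # edges touching a removed node disappear with it
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge closes a cycle among the remaining nodes
        parent[root_u] = root_v
    return True


# A hub (node 0) attached to every node of the path 1-2-3-4.
hub_edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (2, 3), (3, 4)]
print(is_feedback_vertex_set(5, hub_edges, {0}))    # hub removed: a path remains
print(is_feedback_vertex_set(5, hub_edges, set()))  # nothing removed: cycles remain
```

In the hub example, removing the single high-degree node leaves a tree, which matches the abstract's description of networks whose hubs form the FVS.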


Pattern-Coupled Sparse Bayesian Learning for Recovery of Block-Sparse Signals

arXiv.org Machine Learning

We consider the problem of recovering block-sparse signals whose structures are unknown a priori. Block-sparse signals with nonzero coefficients occurring in clusters arise naturally in many practical scenarios. However, the knowledge of the block structure is usually unavailable in practice. In this paper, we develop a new sparse Bayesian learning method for recovery of block-sparse signals with unknown cluster patterns. Specifically, a pattern-coupled hierarchical Gaussian prior model is introduced to characterize the statistical dependencies among coefficients, in which a set of hyperparameters is employed to control the sparsity of signal coefficients. Unlike the conventional sparse Bayesian learning framework, in which each hyperparameter is associated independently with a single coefficient, here the prior for each coefficient involves not only its own hyperparameter but also the hyperparameters of its immediate neighbors. In this way, the sparsity patterns of neighboring coefficients are related to each other, and the hierarchical model has the potential to encourage structured-sparse solutions. The hyperparameters, along with the sparse signal, are learned by maximizing their posterior probability via an expectation-maximization (EM) algorithm. Numerical results show that the proposed algorithm uniformly outperforms existing methods in a series of experiments.
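To make the coupling idea concrete, here is a minimal sketch of how a coefficient's prior precision might depend on its neighbors' hyperparameters. The abstract does not give the formula; the additive form, the coupling weight `beta`, and the zero padding at the two ends are assumptions made for illustration.

```python
def coupled_precisions(alpha, beta):
    """Prior precision of each coefficient under an assumed additive coupling.

    Coefficient i receives precision alpha[i] + beta * (alpha[i-1] + alpha[i+1]),
    with missing neighbors at the boundaries treated as zero. Setting beta = 0
    recovers the conventional setup with one independent hyperparameter each.
    """
    n = len(alpha)
    precisions = []
    for i in range(n):
        left = alpha[i - 1] if i > 0 else 0.0
        right = alpha[i + 1] if i < n - 1 else 0.0
        precisions.append(alpha[i] + beta * (left + right))
    return precisions


print(coupled_precisions([1.0, 2.0, 3.0], 0.5))  # [2.0, 4.0, 4.0]
print(coupled_precisions([1.0, 2.0, 3.0], 0.0))  # [1.0, 2.0, 3.0]
```

Under this form, a large hyperparameter on one coefficient raises the precision of its neighbors as well, so neighboring coefficients tend to be shrunk toward zero together, which is the clustering behavior the abstract describes.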


Adaptive Measurement-Based Policy-Driven QoS Management with Fuzzy-Rule-based Resource Allocation

arXiv.org Artificial Intelligence

Fixed and wireless networks are increasingly converging towards common connectivity with IP-based core networks. Providing effective end-to-end resource and QoS management in such complex heterogeneous converged network scenarios requires unified, adaptive and scalable solutions to integrate and co-ordinate diverse QoS mechanisms of different access technologies with IP-based QoS. Policy-Based Network Management (PBNM) is one approach that could be employed to address this challenge. Hence, a policy-based framework for end-to-end QoS management in converged networks, CNQF (Converged Networks QoS Management Framework) has been proposed within our project. In this paper, the CNQF architecture, a Java implementation of its prototype and experimental validation of key elements are discussed. We then present a fuzzy-based CNQF resource management approach and study the performance of our implementation with real traffic flows on an experimental testbed. The results demonstrate the efficacy of our resource-adaptive approach for practical PBNM systems.


Pseudo-likelihood methods for community detection in large sparse networks

arXiv.org Machine Learning

Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the example of a network of political blogs. We also propose spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We prove that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.
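The abstract does not spell out the perturbation used in spectral clustering with perturbations. The sketch below assumes one plausible form: a small constant, proportional to the average degree divided by the number of nodes, is added to every entry of the adjacency matrix before the spectral step. The function name, the default `tau`, and the sign-based split for two communities are all illustrative assumptions rather than the authors' procedure.

```python
import numpy as np


def perturbed_spectral_split(adjacency, tau=0.25):
    """Split nodes into two groups from a perturbed, degree-normalized adjacency.

    Assumed perturbation: add tau * (average degree) / n to every entry, which
    weakly connects all node pairs so that sparse or fragmented graphs still
    yield an informative second eigenvector.
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    mean_degree = A.sum() / n
    A_tau = A + tau * mean_degree / n

    # Symmetric normalization D^{-1/2} A_tau D^{-1/2} using perturbed degrees.
    scale = 1.0 / np.sqrt(A_tau.sum(axis=1))
    normalized = A_tau * scale[:, None] * scale[None, :]

    # eigh returns eigenvalues in ascending order; take the second-largest.
    _, eigenvectors = np.linalg.eigh(normalized)
    second = eigenvectors[:, -2]
    return (second > 0).astype(int)


# Two triangles joined by a single edge between nodes 2 and 3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
print(perturbed_spectral_split(A))
```

The resulting labels could serve as the starting value that the pseudo-likelihood procedure requires; which group is labeled 0 or 1 depends on the arbitrary sign of the eigenvector.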


Thompson Sampling for Complex Bandit Problems

arXiv.org Machine Learning

We consider stochastic multi-armed bandit problems with complex actions over a set of basic arms, where the decision maker plays a complex action rather than a basic arm in each round. The reward of the complex action is some function of the basic arms' rewards, and the feedback observed may not necessarily be the reward per-arm. For instance, when the complex actions are subsets of the arms, we may only observe the maximum reward over the chosen subset. Thus, feedback across complex actions may be coupled due to the nature of the reward function. We prove a frequentist regret bound for Thompson sampling in a very general setting involving parameter, action and observation spaces and a likelihood function over them. The bound holds for discretely-supported priors over the parameter space and without additional structural properties such as closed-form posteriors, conjugate prior structure or independence across arms. The regret bound scales logarithmically with time but, more importantly, with an improved constant that non-trivially captures the coupling across complex actions due to the structure of the rewards. As applications, we derive improved regret bounds for classes of complex bandit problems involving selecting subsets of arms, including the first nontrivial regret bounds for nonlinear MAX reward feedback from subsets.
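The setting described here, a discretely supported prior with MAX feedback from subsets, can be sketched in a few lines. This is a minimal illustration rather than the paper's analysis: it assumes independent Bernoulli basic arms, a finite list of candidate mean vectors with entries strictly between 0 and 1, and a uniform prior. All function and parameter names are hypothetical.

```python
import itertools
import math
import random


def max_reward_prob(theta, subset):
    """Probability that the maximum Bernoulli reward over `subset` equals 1."""
    return 1.0 - math.prod(1.0 - theta[i] for i in subset)


def thompson_max_feedback(candidates, true_theta, subset_size, rounds, seed=0):
    """Thompson sampling over a finite parameter set, observing only the MAX."""
    rng = random.Random(seed)
    n_arms = len(true_theta)
    actions = list(itertools.combinations(range(n_arms), subset_size))
    posterior = [1.0 / len(candidates)] * len(candidates)  # uniform prior

    for _ in range(rounds):
        # Sample a parameter from the posterior and play its best subset.
        sampled = rng.choices(candidates, weights=posterior)[0]
        action = max(actions, key=lambda s: max_reward_prob(sampled, s))

        # The environment draws every basic reward but reveals only the maximum.
        rewards = [rng.random() < p for p in true_theta]
        observed = any(rewards[i] for i in action)

        # Bayes update: weight each candidate by the likelihood of the observation.
        updated = []
        for theta, weight in zip(candidates, posterior):
            q = max_reward_prob(theta, action)
            updated.append(weight * (q if observed else 1.0 - q))
        total = sum(updated)
        posterior = [w / total for w in updated]
    return posterior


candidates = [(0.9, 0.1, 0.1), (0.1, 0.9, 0.1), (0.1, 0.1, 0.9)]
print(thompson_max_feedback(candidates, candidates[0], subset_size=2, rounds=200))
```

No conjugacy or closed-form posterior is needed: the update only multiplies weights by a likelihood, and a single observation from one subset informs the posterior on every candidate, which is the coupling across complex actions that the abstract refers to.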


Multivariate Generalized Gaussian Process Models

arXiv.org Machine Learning

We propose a family of multivariate Gaussian process models for correlated outputs, based on assuming that the likelihood function takes the generic form of the multivariate exponential family distribution (EFD). We denote this model as a multivariate generalized Gaussian process model, and derive Taylor and Laplace algorithms for approximate inference on the generic model. By instantiating the EFD with specific parameter functions, we obtain two novel GP models (and corresponding inference algorithms) for correlated outputs: 1) a von Mises GP for angle regression; and 2) a Dirichlet GP for regressing on the multinomial simplex.


Parsimonious Shifted Asymmetric Laplace Mixtures

arXiv.org Machine Learning

A family of parsimonious shifted asymmetric Laplace mixture models is introduced. We extend the mixture of factor analyzers model to the shifted asymmetric Laplace distribution. Imposing constraints on the constituent parts of the resulting decomposed component scale matrices leads to a family of parsimonious models. An explicit two-stage parameter estimation procedure is described, and the Bayesian information criterion and the integrated completed likelihood are compared for model selection. This novel family of models is applied to real data, where it is compared to its Gaussian analogue within clustering and classification paradigms.


Bayesian inference as iterated random functions with applications to sequential inference in graphical models

arXiv.org Machine Learning

Sequential posterior updates play a central role in many Bayesian inference procedures. As an example, in Bayesian inference one is interested in the posterior probability of variables of interest given the data observed sequentially up to a given time point. As a more specific example, which provides the motivation for this work, in a sequential change point detection problem [1] the key quantity is the posterior probability that a change has occurred given the data observed up to the present time. When the underlying probability model is complex, e.g., a large-scale graphical model, calculating such quantities in a fast and online manner is a formidable challenge. In such situations approximate inference methods are required; for graphical models, message-passing variational inference algorithms present a viable option [2, 3].
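For a single sequence, the posterior probability that a change has occurred can be updated recursively, with each new observation mapping the previous posterior to the next one. The sketch below is a minimal illustration under assumptions not stated in the abstract: a geometric prior on the change time with parameter `rho`, and unit-variance Gaussian observations whose mean shifts at the change. It shows the exact update for this simple case, not the approximate message-passing scheme for graphical models.

```python
import math


def gaussian_pdf(x, mean, std=1.0):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def change_posteriors(xs, rho, pre_mean=0.0, post_mean=1.0, p0=0.0):
    """Posterior P(change has occurred by time n | x_1..x_n) for each n.

    Assumes a geometric prior on the change time (hazard `rho` per step) and
    Gaussian observations with mean `pre_mean` before and `post_mean` after.
    """
    p = p0
    history = []
    for x in xs:
        # Predict: the change may occur at this step with probability rho.
        prior = p + (1.0 - p) * rho
        # Correct: reweight by the likelihood of x under each hypothesis.
        changed = prior * gaussian_pdf(x, post_mean)
        unchanged = (1.0 - prior) * gaussian_pdf(x, pre_mean)
        p = changed / (changed + unchanged)
        history.append(p)
    return history


data = [0.0] * 5 + [3.0] * 5
for n, p in enumerate(change_posteriors(data, rho=0.1, post_mean=3.0), start=1):
    print(n, round(p, 4))
```

Each step applies a data-dependent function to the previous posterior, which is the sense in which sequential Bayesian inference can be viewed as an iterated random function.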


A dependent partition-valued process for multitask clustering and time evolving network modelling

arXiv.org Machine Learning

The fundamental aim of clustering algorithms is to partition data points. We consider tasks where the discovered partition is allowed to vary with some covariate such as space or time. One approach would be to use fragmentation-coagulation processes, but these, being Markov processes, are restricted to linear or tree-structured covariate spaces. We define a partition-valued process on an arbitrary covariate space using Gaussian processes. We use the process to construct a multitask clustering model, which partitions data points in a similar way across multiple data sources, and a time series model of network data, which allows cluster assignments to vary over time. We describe sampling algorithms for inference and apply our method to defining cancer subtypes based on different types of cellular characteristics, finding regulatory modules from gene expression data from multiple human populations, and discovering time-varying community structure in a social network.