Problem Solving
A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation
Hsieh, Cho-jui, Banerjee, Arindam, Dhillon, Inderjit S., Ravikumar, Pradeep K.
In this paper, we consider the $\ell_1$ regularized sparse inverse covariance matrix estimation problem with a very large number of variables. Even in the face of this high dimensionality, and with limited number of samples, recent work has shown this estimator to have strong statistical guarantees in recovering the true structure of the sparse inverse covariance matrix, or alternatively the underlying graph structure of the corresponding Gaussian Markov Random Field. Our proposed algorithm divides the problem into smaller sub-problems, and uses the solutions of the sub-problems to build a good approximation for the original problem. We derive a bound on the distance of the approximate solution to the true solution. Based on this bound, we propose a clustering algorithm that attempts to minimize this bound, and in practice, is able to find effective partitions of the variables.
On the Completeness of First-Order Knowledge Compilation for Lifted Probabilistic Inference
Probabilistic logics are receiving a lot of attention today because of their expressive power for knowledge representation and learning. However, this expressivity is detrimental to the tractability of inference, when done at the propositional level. To solve this problem, various lifted inference algorithms have been proposed that reason at the first-order level, about groups of objects as a whole. Despite the existence of various lifted inference approaches, there are currently no completeness results about these algorithms. The key contribution of this paper is that we introduce a formal definition of lifted inference that allows us to reason about the completeness of lifted inference algorithms relative to a particular class of probabilistic models.
Divide-and-Conquer Matrix Factorization
Mackey, Lester W., Jordan, Michael I., Talwalkar, Ameet
This work introduces Divide-Factor-Combine (DFC), a parallel divide-and-conquer framework for noisy matrix factorization. DFC divides a large-scale matrix factorization task into smaller subproblems, solves each subproblem in parallel using an arbitrary base matrix factorization algorithm, and combines the subproblem solutions using techniques from randomized matrix approximation. Our experiments with collaborative filtering, video background modeling, and simulated data demonstrate the near-linear to super-linear speed-ups attainable with this approach. Moreover, our analysis shows that DFC enjoys high-probability recovery guarantees comparable to those of its base algorithm. Papers published at the Neural Information Processing Systems Conference.
Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
Guez, Arthur, Silver, David, Dayan, Peter
Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems -- because it avoids expensive applications of Bayes rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing it working in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration.
Expressive Power and Approximation Errors of Restricted Boltzmann Machines
Montufar, Guido F., Rauh, Johannes, Ay, Nihat
We present explicit classes of probability distributions that can be learned by Restricted Boltzmann Machines (RBMs) depending on the number of units that they contain, and which are representative for the expressive power of the model. We use this to show that the maximal Kullback-Leibler divergence to the RBM model with n visible and m hidden units is bounded from above by (n-1)-log(m 1). In this way we can specify the number of hidden units that guarantees a sufficiently rich model containing different classes of distributions and respecting a given error tolerance. Papers published at the Neural Information Processing Systems Conference.
Inductive reasoning about chimeric creatures
Given one feature of a novel animal, humans readily make inferences about other features of the animal. For example, winged creatures often fly, and creatures that eat fish often live in the water. We explore the knowledge that supports these inferences and compare two approaches. The first approach proposes that humans rely on abstract representations of dependency relationships between features, and is formalized here as a graphical model. The second approach proposes that humans rely on specific knowledge of previously encountered animals, and is formalized here as a family of exemplar models.
Poincaré Embeddings for Learning Hierarchical Representations
Nickel, Maximillian, Kiela, Douwe
Representation learning has become an invaluable approach for learning from symbolic data such as text and graphs. However, state-of-the-art embedding methods typically do not account for latent hierarchical structures which are characteristic for many complex symbolic datasets. In this work, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space -- or more precisely into an n-dimensional Poincaré ball. Due to the underlying hyperbolic geometry, this allows us to learn parsimonious representations of symbolic data by simultaneously capturing hierarchy and similarity. We present an efficient algorithm to learn the embeddings based on Riemannian optimization and show experimentally that Poincaré embeddings can outperform Euclidean embeddings significantly on data with latent hierarchies, both in terms of representation capacity and in terms of generalization ability. Papers published at the Neural Information Processing Systems Conference.
Synthesizing Robust Plans under Incomplete Domain Models
Nguyen, Tuan A., Kambhampati, Subbarao, Do, Minh
Most current planners assume complete domain models and focus on generating correct plans. Unfortunately, domain modeling is a laborious and error-prone task, thus real world agents have to plan with incomplete domain models. While domain experts cannot guarantee completeness, often they are able to circumscribe the incompleteness of the model by providing annotations as to which parts of the domain model may be incomplete. In such cases, the goal should be to synthesize plans that are robust with respect to any known incompleteness of the domain. In this paper, we first introduce annotations expressing the knowledge of the domain incompleteness and formalize the notion of plan robustness with respect to an incomplete domain model.
Noise-Enhanced Associative Memories
Karbasi, Amin, Salavati, Amir Hesam, Shokrollahi, Amin, Varshney, Lav R.
Recent advances in associative memory design through structured pattern sets and graph-based inference algorithms have allowed reliable learning and recall of an exponential number of patterns. Although these designs correct external errors in recall, they assume neurons that compute noiselessly, in contrast to the highly variable neurons in hippocampus and olfactory cortex. Here we consider associative memories with noisy internal computations and analytically characterize performance. As long as the internal noise level is below a specified threshold, the error probability in the recall phase can be made exceedingly small. More surprisingly, we show that internal noise actually improves the performance of the recall phase.
Correlations strike back (again): the case of associative memory retrieval
Savin, Cristina, Dayan, Peter, Lengyel, Mate
It has long been recognised that statistical dependencies in neuronal activity need to be taken into account when decoding stimuli encoded in a neural population. Less studied, though equally pernicious, is the need to take account of dependencies between synaptic weights when decoding patterns previously encoded in an auto-associative memory. We show that activity-dependent learning generically produces such correlations, and failing to take them into account in the dynamics of memory retrieval leads to catastrophically poor recall. We derive optimal network dynamics for recall in the face of synaptic correlations caused by a range of synaptic plasticity rules. These dynamics involve well-studied circuit motifs, such as forms of feedback inhibition and experimentally observed dendritic nonlinearities.