Industry
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. Consider the infinite set of all computable descriptions of tasks with possibly computable solutions. The novel algorithmic framework POWERPLAY (2011) continually searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. Wow-effects are achieved by continually making previously learned skills more efficient such that they require less time and space. New skills may (partially) re-use previously learned skills. POWERPLAY's search orders candidate pairs of tasks and solver modifications by their conditional computational (time & space) complexity, given the stored experience so far. The new task and its corresponding task-solving skill are those first found and validated. The computational costs of validating new tasks need not grow with task repertoire size. POWERPLAY's ongoing search for novelty keeps breaking the generalization abilities of its present solver. This is related to Goedel's sequence of increasingly powerful formal theories based on adding formerly unprovable statements to the axioms without affecting previously provable theorems. The continually increasing repertoire of problem solving procedures can be exploited by a parallel search for solutions to additional externally posed tasks. POWERPLAY may be viewed as a greedy but practical implementation of basic principles of creativity. A first experimental analysis can be found in separate papers [53,54].
A Goal-Directed Implementation of Query Answering for Hybrid MKNF Knowledge Bases
Gomes, Ana Sofia, Alferes, Jose Julio, Swift, Terrance
Ontologies and rules are usually loosely coupled in knowledge representation formalisms. In fact, ontologies use open-world reasoning while the leading semantics for rules use non-monotonic, closed-world reasoning. One exception is the tightly-coupled framework of Minimal Knowledge and Negation as Failure (MKNF), which allows statements about individuals to be jointly derived via entailment from an ontology and inferences from rules. Nonetheless, the practical usefulness of MKNF has not always been clear, although recent work has formalized a general resolution-based method for querying MKNF when rules are taken to have the well-founded semantics, and the ontology is modeled by a general oracle. That work leaves open what algorithms should be used to relate the entailments of the ontology and the inferences of rules. In this paper we provide such algorithms, and describe the implementation of a query-driven system, CDF-Rules, for hybrid knowledge bases combining both (non-monotonic) rules under the well-founded semantics and a (monotonic) ontology, represented by a CDF Type-1 (ALQ) theory. To appear in Theory and Practice of Logic Programming (TPLP)
Understanding the Interaction between Interests, Conversations and Friendships in Facebook
Ho, Qirong, Yan, Rong, Raina, Rajat, Xing, Eric P.
In this paper, we explore salient questions about user interests, conversations and friendships in the Facebook social network, using a novel latent space model that integrates several data types. A key challenge of studying Facebook's data is the wide range of data modalities such as text, network links, and categorical labels. Our latent space model seamlessly combines all three data modalities over millions of users, allowing us to study the interplay between user friendships, interests, and higher-order network-wide social trends on Facebook. The recovered insights not only answer our initial questions, but also reveal surprising facts about user interests in the context of Facebook's ecosystem. We also confirm that our results are significant with respect to evidential information from the study subjects.
Temporal Autoencoding Restricted Boltzmann Machine
Häusler, Chris, Susemihl, Alex
Much work has been done refining and characterizing the receptive fields learned by deep learning algorithms. A lot of this work has focused on the development of Gabor-like filters learned when enforcing sparsity constraints on a natural image dataset. Little work however has investigated how these filters might expand to the temporal domain, namely through training on natural movies. Here we investigate exactly this problem in established temporal deep learning algorithms as well as a new learning paradigm suggested here, the Temporal Autoencoding Restricted Boltzmann Machine (TARBM).
Transforming Graph Data for Statistical Relational Learning
Rossi, R. A., McDowell, L. K., Aha, D. W., Neville, J.
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of Statistical Relational Learning (SRL) algorithms to these domains. In this article, we examine and categorize techniques for transforming graph-based relational data to improve SRL algorithms. In particular, appropriate transformations of the nodes, links, and/or features of the data can dramatically affect the capabilities and results of SRL algorithms. We introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. More specifically, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.
A Tutorial on Dual Decomposition and Lagrangian Relaxation for Inference in Natural Language Processing
Dual decomposition, and more generally Lagrangian relaxation, is a classical method for combinatorial optimization; it has recently been applied to several inference problems in natural language processing (NLP). This tutorial gives an overview of the technique. We describe example algorithms, describe formal guarantees for the method, and describe practical issues in implementing the algorithms. While our examples are predominantly drawn from the NLP literature, the material should be of general relevance to inference problems in machine learning. A central theme of this tutorial is that Lagrangian relaxation is naturally applied in conjunction with a broad class of combinatorial algorithms, allowing inference in models that go significantly beyond previous work on Lagrangian relaxation for inference in graphical models.
Empirical Normalization for Quadratic Discriminant Analysis and Classifying Cancer Subtypes
Kon, Mark A., Nikolaev, Nikolay
We introduce a new discriminant analysis method (Empirical Discriminant Analysis or EDA) for binary classification in machine learning. Given a dataset of feature vectors, this method defines an empirical feature map transforming the training and test data into new data with components having Gaussian empirical distributions. This map is an empirical version of the Gaussian copula used in probability and mathematical finance. The purpose is to form a feature mapped dataset as close as possible to Gaussian, after which standard quadratic discriminants can be used for classification. We discuss this method in general, and apply it to some datasets in computational biology.
Learning Onto-Relational Rules with Inductive Logic Programming
Rules complement and extend ontologies on the Semantic Web. We refer to these rules as onto-relational since they combine DL-based ontology languages and Knowledge Representation formalisms supporting the relational data model within the tradition of Logic Programming and Deductive Databases. Rule authoring is a very demanding Knowledge Engineering task which can be automated though partially by applying Machine Learning algorithms. In this chapter we show how Inductive Logic Programming (ILP), born at the intersection of Machine Learning and Logic Programming and considered as a major approach to Relational Learning, can be adapted to Onto-Relational Learning. For the sake of illustration, we provide details of a specific Onto-Relational Learning solution to the problem of learning rule-based definitions of DL concepts and roles with ILP.
Recognizing Static Signs from the Brazilian Sign Language: Comparing Large-Margin Decision Directed Acyclic Graphs, Voting Support Vector Machines and Artificial Neural Networks
de Souza, César Roberto, Pizzolato, Ednaldo Brigante, Anjo, Mauro dos Santos
In this paper, we explore and detail our experiments in a high-dimensionality, multi-class image classification problem often found in the automatic recognition of Sign Languages. Here, our efforts are directed towards comparing the characteristics, advantages and drawbacks of creating and training Support Vector Machines disposed in a Directed Acyclic Graph and Artificial Neural Networks to classify signs from the Brazilian Sign Language (LIBRAS). We explore how the different heuristics, hyperparameters and multi-class decision schemes affect the performance, efficiency and ease of use for each classifier. We provide hyperparameter surface maps capturing accuracy and efficiency, comparisons between DDAGs and 1-vs-1 SVMs, and effects of heuristics when training ANNs with Resilient Backpropagation. We report statistically significant results using Cohen's Kappa statistic for contingency tables.
The Bayesian Bridge
Polson, Nicholas G., Scott, James G., Windle, Jesse
We propose the Bayesian bridge estimator for regularized regression and classification. Two key mixture representations for the Bayesian bridge model are developed: (1) a scale mixture of normals with respect to an alpha-stable random variable; and (2) a mixture of Bartlett--Fejer kernels (or triangle densities) with respect to a two-component mixture of gamma random variables. Both lead to MCMC methods for posterior simulation, and these methods turn out to have complementary domains of maximum efficiency. The first representation is a well known result due to West (1987), and is the better choice for collinear design matrices. The second representation is new, and is more efficient for orthogonal problems, largely because it avoids the need to deal with exponentially tilted stable random variables. It also provides insight into the multimodality of the joint posterior distribution, a feature of the bridge model that is notably absent under ridge or lasso-type priors. We prove a theorem that extends this representation to a wider class of densities representable as scale mixtures of betas, and provide an explicit inversion formula for the mixing distribution. The connections with slice sampling and scale mixtures of normals are explored. On the practical side, we find that the Bayesian bridge model outperforms its classical cousin in estimation and prediction across a variety of data sets, both simulated and real. We also show that the MCMC for fitting the bridge model exhibits excellent mixing properties, particularly for the global scale parameter. This makes for a favorable contrast with analogous MCMC algorithms for other sparse Bayesian models. All methods described in this paper are implemented in the R package BayesBridge. An extensive set of simulation results are provided in two supplemental files.