Supervised Learning


Adaptive Metric Dimensionality Reduction

arXiv.org Machine Learning

We study adaptive data-dependent dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling. On the algorithmic front, we describe an analogue of PCA for metric spaces: namely an efficient procedure that approximates the data's intrinsic dimension, which is often much lower than the ambient dimension. Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.
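
The intrinsic-dimension idea can be made concrete with a packing-number estimate of the doubling dimension. The sketch below is a generic back-of-the-envelope construction, not the paper's procedure: the greedy `net_size` routine and the two-scale formula are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def net_size(points, r):
    """Greedily build an r-net; its size is a packing-number estimate."""
    centers = []
    for p in points:
        if all(np.linalg.norm(p - c) > r for c in centers):
            centers.append(p)
    return len(centers)

def doubling_dim_estimate(points, r):
    """Intrinsic dimension ~ log2 of net growth when the scale halves."""
    return np.log2(net_size(points, r / 2) / net_size(points, r))

# Points near a 1-D curve embedded in 10-D ambient space: the estimate
# should come out near 1, far below the ambient dimension of 10.
t = rng.random((500, 1))
curve = np.hstack([np.sin(3 * t), np.cos(3 * t)])
noise = 0.01 * rng.normal(size=(500, 8))
X = np.hstack([curve, noise])
print(doubling_dim_estimate(X, r=0.5))
```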


Compositional Vector Space Models for Knowledge Base Inference

AAAI Conferences

Traditional approaches to knowledge base completion have been based on symbolic representations. Low-dimensional vector embedding models proposed recently for this task are attractive since they generalize to possibly unlimited sets of relations. A significant drawback of previous embedding models for KB completion is that they merely support reasoning on individual relations (e.g., bornIn(X, Y) ⇒ nationality(X, Y)). In this work, we develop models for KB completion that support chains of reasoning on paths of any length using compositional vector space models. We construct compositional vector representations for the paths in the KB graph from the semantic vector representations of the binary relations in that path and perform inference directly in the vector space. Unlike previous methods, our approach can generalize to paths that are unseen in training and, in a zero-shot setting, predict target relations without supervised training data for that relation.
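
To make the composition idea concrete, here is a hedged sketch of path composition: relation embeddings are combined along a path (elementwise multiplication is one simple compositional operator; the paper's trained models may differ) and the result is scored against a candidate target relation. All names and values are illustrative, not the paper's model.

```python
import numpy as np

dim = 8
rng = np.random.default_rng(0)
relation_vecs = {name: rng.normal(size=dim)
                 for name in ["bornIn", "cityOf", "nationality"]}

def compose_path(path):
    """Compose relation embeddings along a path into one path vector."""
    vec = np.ones(dim)
    for rel in path:
        vec = vec * relation_vecs[rel]   # elementwise composition
    return vec

def score(path, target):
    """Higher score = the path is better evidence for the target relation."""
    return compose_path(path) @ relation_vecs[target]

# e.g., does the path bornIn -> cityOf support the relation nationality?
print(score(["bornIn", "cityOf"], "nationality"))
```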


Combining Vector Space Embeddings with Symbolic Logical Inference over Open-Domain Text

AAAI Conferences

We have recently shown how to combine random walk inference over knowledge bases with vector space representations of surface forms, improving performance on knowledge base inference. In this paper, we formalize the connection of our prior work to logical inference rules, giving some general observations about methods for incorporating vector space representations into symbolic logic systems. Additionally, we present some promising preliminary work that extends these techniques to learning open-domain relations for the purpose of answering multiple choice questions, achieving 67% accuracy on a small test set.
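
A hedged sketch of one way embeddings can soften symbolic matching (not the paper's actual system): a rule's relation is matched against an observed surface form by embedding similarity rather than exact symbol equality, so lexically different but semantically close relations can still fire the rule. The relation names, vectors, and threshold below are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
embedding = {r: rng.normal(size=16)
             for r in ["plays_for", "is_member_of", "capital_of"]}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def soft_match(observed_rel, rule_rel, threshold=0.3):
    """True if the observed surface relation is close enough to the
    relation the logical rule expects (random vectors here, so the
    result is arbitrary; real embeddings would be trained)."""
    return cosine(embedding[observed_rel], embedding[rule_rel]) >= threshold

print(soft_match("plays_for", "is_member_of"))
```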


On Approximate Reasoning Capabilities of Low-Rank Vector Spaces

AAAI Conferences

In relational databases, relations between objects, represented by binary matrices or tensors, may be arbitrarily complex. In practice, however, there are recurring relational patterns such as transitive, permutation, and sequential relationships that seem to have a regular structure not captured by the classical notion of matrix rank or tensor rank. In this paper, we show that factorizing the relational tensor using a logistic or hinge loss instead of the more standard squared loss is more appropriate because it can accurately model many common relations with a fixed-size embedding that depends sub-linearly on the number of entities in the knowledge base. We illustrate this fact empirically by efficiently predicting missing links in several synthetic and real-world experiments. Further, we provide theoretical justification for the logistic loss by studying its connection to a complexity measure from the field of communication complexity called the sign rank. Sign rank is a more appropriate complexity measure because it is low for transitive, permutation, or sequential relationships, while being high with high probability for uniformly sampled binary matrices/tensors.
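
The contrast can be illustrated with a tiny logistic-loss factorization of a transitive (total-order) relation, which has low sign rank despite high matrix rank. This is a generic sketch, not the paper's setup: the rank, step size, and iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rank = 30, 3
# A transitive-style relation: Y_ij = 1 iff i < j (a total order).
Y = (np.arange(n)[:, None] < np.arange(n)[None, :]).astype(float)

U = 0.1 * rng.normal(size=(n, rank))
V = 0.1 * rng.normal(size=(n, rank))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(2000):
    P = sigmoid(U @ V.T)          # predicted link probabilities
    G = P - Y                     # gradient of logistic loss w.r.t. logits
    U, V = U - lr * (G @ V) / n, V - lr * (G.T @ U) / n

# A small fixed rank suffices for this structured relation, mirroring the
# sign-rank argument; squared loss on the raw {0,1} entries needs more.
print("accuracy:", ((sigmoid(U @ V.T) > 0.5) == Y).mean())
```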


AffectiveSpace 2: Enabling Affective Intuition for Concept-Level Sentiment Analysis

AAAI Conferences

Predicting the affective valence of unknown multi-word expressions is key for concept-level sentiment analysis. AffectiveSpace 2 is a vector space model, built by means of random projection, that allows for reasoning by analogy on natural language concepts. By reducing the dimensionality of affective common-sense knowledge, the model allows semantic features associated with concepts to be generalized and, hence, allows concepts to be intuitively clustered according to their semantic and affective relatedness. Such an affective intuition (so called because it does not rely on explicit features, but rather on implicit analogies) enables the inference of emotions and polarity conveyed by multi-word expressions, thus achieving efficient concept-level sentiment analysis.
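
The random-projection step can be sketched generically; this is a standard Gaussian projection, not necessarily AffectiveSpace 2's exact construction. Pairwise distances between concept vectors are approximately preserved in the reduced space (Johnson-Lindenstrauss), which is what makes clustering by relatedness meaningful after reduction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_concepts, high_dim, low_dim = 1000, 10000, 100

concepts = rng.normal(size=(n_concepts, high_dim))   # stand-in features
R = rng.normal(size=(high_dim, low_dim)) / np.sqrt(low_dim)
projected = concepts @ R                             # reduced space

# Distances before and after projection are close in ratio.
i, j = 0, 1
d_hi = np.linalg.norm(concepts[i] - concepts[j])
d_lo = np.linalg.norm(projected[i] - projected[j])
print(d_hi, d_lo, d_lo / d_hi)
```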


Structural Learning with Amortized Inference

AAAI Conferences

Training a structured prediction model involves performing several loss-augmented inference steps. Over the lifetime of the training, many of these inference problems, although different, share the same solution. We propose AI-DCD, an Amortized Inference framework for the Dual Coordinate Descent method, an approximate learning algorithm that accelerates the training process by exploiting this redundancy of solutions, without compromising the performance of the model. We show the efficacy of our method by training a structured SVM using dual coordinate descent for an entity-relation extraction task. Our method learns the same model as an exact training algorithm would, but calls the inference engine in only 10–24% of the inference problems encountered during training. We observe similar gains on a multi-label classification task and with a Structured Perceptron model for the entity-relation task.
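
The core amortization idea can be sketched as a cache over inference problems. Note this toy uses exact-signature caching, whereas the paper's method derives conditions under which a previous solution provably transfers to a new problem; the signature and stand-in solver below are placeholders, not AI-DCD itself.

```python
def expensive_inference(scores):
    """Stand-in for the structured inference engine (e.g., an ILP call)."""
    return max(range(len(scores)), key=scores.__getitem__)

cache = {}
solver_calls = 0

def amortized_inference(scores):
    """Return a cached solution when an equivalent problem was seen."""
    global solver_calls
    key = tuple(round(s, 3) for s in scores)   # crude problem signature
    if key not in cache:
        solver_calls += 1
        cache[key] = expensive_inference(scores)
    return cache[key]

# Many training-time problems repeat, so the solver is rarely invoked.
problems = [[0.1, 0.9, 0.3], [0.1, 0.9, 0.3], [0.8, 0.2, 0.5]] * 100
for p in problems:
    amortized_inference(p)
print(f"solver called on {solver_calls}/{len(problems)} problems")
```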


Crowdsourced Action-Model Acquisition for Planning

AAAI Conferences

AI planning techniques often require a given set of action models provided as input. Creating action models is, however, a difficult task that requires substantial manual effort. The problem of action-model acquisition has drawn a lot of interest from researchers in the past. Despite the success of previous systems, they are all based on the assumption that there are enough training examples for learning high-quality action models. In many real-world applications, e.g., military operations, collecting a large number of training examples is often both difficult and costly. Instead of collecting training examples, we assume there are abundant annotators, i.e., the crowd, available to provide information for learning action models. Specifically, we first build a set of soft constraints based on the labels (true or false) given by the crowd or annotators. We then build a set of soft constraints based on the input plan traces. After that, we put all the constraints together, solve them using a weighted MAX-SAT solver, and convert the solver's solution into action models. We finally show experimentally that our approach is effective.
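
A toy of the final solving step (the encoding here is invented for illustration, not the paper's): crowd labels and plan-trace evidence become weighted soft clauses over boolean variables encoding candidate action-model facts, and a weighted MAX-SAT assignment maximizes the total weight of satisfied clauses. Brute-force enumeration stands in for a real solver.

```python
from itertools import product

# Variables: candidate preconditions/effects of a hypothetical "move" action.
variables = ["pre_at_src", "eff_at_dst", "eff_not_at_src"]

# (weight, {var: required_truth_value}); a clause holds if ANY literal holds.
soft_clauses = [
    (3.0, {"pre_at_src": True}),                         # crowd: looks true
    (2.0, {"eff_at_dst": True}),                         # crowd: looks true
    (1.0, {"eff_not_at_src": False}),                    # crowd: looks false
    (4.0, {"eff_at_dst": True, "eff_not_at_src": True})  # from plan traces
]

def satisfied(clause, assignment):
    return any(assignment[v] == val for v, val in clause.items())

best = max(
    (dict(zip(variables, vals)) for vals in product([False, True], repeat=3)),
    key=lambda a: sum(w for w, c in soft_clauses if satisfied(c, a)),
)
print(best)   # the assignment, i.e., action-model facts, with maximum weight
```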


Learning Greedy Policies for the Easy-First Framework

AAAI Conferences

Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. We formulate greedy policy learning in the Easy-first approach as a novel non-convex optimization problem and solve it via an efficient Majorization-Minimization (MM) algorithm. Results on within-document coreference and cross-document joint entity and event coreference tasks demonstrate that the proposed approach achieves statistically significant performance improvement over existing training regimes for Easy-first and is less susceptible to overfitting.
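
The easy-first control loop itself is simple to sketch. The placeholder below scores pending actions with a fixed confidence table rather than the learned, MM-trained policy the paper describes; the action names and scores are made up for illustration.

```python
def easy_first(pending_actions, score):
    """Greedily commit to actions in order of decreasing confidence."""
    done = []
    while pending_actions:
        best = max(pending_actions, key=score)   # easiest decision first
        pending_actions.remove(best)
        done.append(best)
        # In a real system, committing to `best` would update the features
        # (and hence the scores) of the remaining actions here.
    return done

# Toy: "actions" are mention-pair merges with made-up confidence scores.
actions = ["merge(A,B)", "merge(B,C)", "merge(A,C)"]
confidence = {"merge(A,B)": 0.9, "merge(B,C)": 0.4, "merge(A,C)": 0.7}
print(easy_first(actions, confidence.get))
```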


Generalization as Search

AI Classics

We learn (memorize) multiplication tables, learn (discover how) to walk, learn (build up an understanding of, then an ability to synthesize) languages. Many subtasks and capabilities are involved in these various kinds of learning. One capability central to many kinds of learning is the ability to generalize: to take into account a large number of specific observations, then to extract and retain the important common features that characterize classes of these observations. This generalization problem has received considerable attention for two decades in the fields of Artificial Intelligence, Psychology, and Pattern Recognition (e.g., [Bruner, 1956],


MACHINE INTELLIGENCE 13

AI Classics

The two outstanding figures in the history of computer science are Alan Turing and John von Neumann, and they shared the view that logic was the key to understanding and automating computation. In particular, it was Turing who gave us in the mid-1930s the fundamental analysis, and the logical definition, of the concept of 'computability by machine', and who discovered the surprising and beautiful basic fact that there exist universal machines which by suitable programming can be made to carry out any computation that any machine can carry out.

This essay is an expanded and revised version of one entitled The Role of Logic in Computer Science and Artificial Intelligence, which was completed in January 1992 (and was later published in the Proceedings of the Fifth Generation Computer Systems 1992 Conference). Since completing that essay I have had the benefit of extremely helpful discussions on many of the details with Professor Donald Michie and Professor I. J. Good, both of whom knew Turing well during the war years at Bletchley Park. Professor J. A. N. Lee, whose knowledge of the literature and archives of the history of computing is encyclopedic, also provided additional information, some of which is still unpublished. Further light has very recently been shed on the von Neumann side of the story by Norman Macrae's excellent biography John von Neumann (Macrae 1992). Accordingly, it seemed appropriate to undertake a more complete and thorough version of the FGCS'92 essay, focussing somewhat more on the interesting historical and biographical issues. I am grateful to Donald Michie and Stephen Muggleton for inviting me to contribute such a 'second edition' to the present volume, and I would also like to thank the Institute for New Computer Technology (ICOT) for kind permission to make use of the FGCS'92 essay in this way.

1 LOGIC, COMPUTERS, TURING, AND VON NEUMANN