Inductive Learning
Bootstrapping Distantly Supervised IE Using Joint Learning and Small Well-Structured Corpora
Bing, Lidong (Tencent Inc.) | Dhingra, Bhuwan (Carnegie Mellon University) | Mazaitis, Kathryn (Carnegie Mellon University) | Park, Jong Hyuk (Carnegie Mellon University) | Cohen, William W. (Carnegie Mellon University)
We propose a framework to improve the performance of distantly-supervised relation extraction, by jointly learning to solve two related tasks: concept-instance extraction and relation extraction. We further extend this framework to make a novel use of document structure: in some small, well-structured corpora, sections can be identified that correspond to relation arguments, and distantly-labeled examples from such sections tend to have good precision. Using these as seeds we extract additional relation examples by applying label propagation on a graph composed of noisy examples extracted from a large unstructured testing corpus. Combined with the soft constraint that concept examples should have the same type as the second argument of the relation, we get significant improvements over several state-of-the-art approaches to distantly-supervised relation extraction, and reasonable extraction performance even with very small set of distant labels.
Structured Prediction in Time Series Data
Li, Jia (University of Illinois at Chicago)
Time series data is common in a wide range of disciplines including finance, biology, sociology, and computer science. Analyzing and modeling time series data is fundamental for studying various problems in those fields. For instance, studying time series physiological data can be used to discriminate patientsโ abnormal recovery trajectories and normal ones (Hripcsak, Albers, and Perotte 2015). GPS data are useful for studying collective decision making of groupliving animals (Strandburg-Peshkin et al. 2015). There are different methods for studying time series data such as clustering, regression, and anomaly detection. In this proposal, we are interested in structured prediction problems in time series data. Structured prediction focuses on prediction task where the outputs are structured and interdependent, contrary to the non-structured prediction which assumes that the outputs are independent of other predicted outputs. Structured prediction is an important problem as there are structures inherently existing in time series data. One difficulty for structured prediction is that the number of possible outputs can be exponential which makes modeling all the potential outputs intractable.
Incidental Supervision: Moving beyond Supervised Learning
Roth, Dan (University of Illinois)
Machine Learning and Inference methods have become ubiquitous in our attempt to induce more abstract representations of natural language text, visual scenes, and other messy, naturally occurring data, and support decisions that depend on it. However, learning models for these tasks is difficult partly because generating the necessary supervision signals for it is costly and does not scale. This paper describes several learning paradigms that are designed to alleviate the supervision bottleneck. It will illustrate their benefit in the context of multiple problems, all pertaining to inducing various levels of semantic representations from text. In particular, we discuss (i) esponse Driven Learning of models, a learning protocol that supports inducing meaning representations simply by observing the model's behavior in its environment, (ii) the exploitation of Incidental Supervision signals that exist in the data, independently of the task at hand, to learn models that identify and classify semantic predicates, and (iii) the use of weak supervision to combine simple models to support global decisions where joint supervision is not available.
Inductive Reasoning about Ontologies Using Conceptual Spaces
Bouraoui, Zied (Cardiff University) | Jameel, Shoaib (Cardiff University) | Schockaert, Steven (Cardiff University)
Structured knowledge about concepts plays an increasingly important role in areas such as information retrieval. The available ontologies and knowledge graphs that encode such conceptual knowledge, however, are inevitably incomplete. This observation has led to a number of methods that aim to automatically complete existing knowledge bases. Unfortunately, most existing approaches rely on black box models, e.g. formulated as global optimization problems, which makes it difficult to support the underlying reasoning process with intuitive explanations. In this paper, we propose a new method for knowledge base completion, which uses interpretable conceptual space representations and an explicit model for inductive inference that is closer to human forms of commonsense reasoning. Moreover, by separating the task of representation learning from inductive reasoning, our method is easier to apply in a wider variety of contexts. Finally, unlike optimization based approaches, our method can naturally be applied in settings where various logical constraints between the extensions of concepts need to be taken into account.
Query-Efficient Imitation Learning for End-to-End Simulated Driving
Zhang, Jiakai (New York University) | Cho, Kyunghyun (New York University)
One way to approach end-to-end autonomous driving is to learn a policy that maps from a sensory input, such as an image frame from a front-facing camera, to a driving action, by imitating an expert driver, or a reference policy. This can be done by supervised learning, where a policy is tuned to minimize the difference between the predicted and ground-truth actions. A policy trained in this way however is known to suffer from unexpected behaviours due to the mismatch between the states reachable by the reference policy and trained policy. More advanced algorithms for imitation learning, such as DAgger, addresses this issue by iteratively collecting training examples from both reference and trained policies. These algorithms often require a large number of queries to a reference policy, which is undesirable as the reference policy is often expensive. In this paper, we propose an extension of the DAgger, called SafeDAgger, that is query-efficient and more suitable for end-to-end autonomous driving. We evaluate the proposed SafeDAgger in a car racing simulator and show that it indeed requires less queries to a reference policy. We observe a significant speed up in convergence, which we conjecture to be due to the effect of automated curriculum learning.
Confidence-Rated Discriminative Partial Label Learning
Tang, Cai-Zhi (Southeast University) | Zhang, Min-Ling (Southeast University)
Partial label learning aims to induce a multi-class classifier from training examples where each of them is associated with a set of candidate labels, among which only one label is valid. The common discriminative solution to learn from partial label examples assumes one parametric model for each class label, whose predictions are aggregated to optimize specific objectives such as likelihood or margin over the training examples. Nonetheless, existing discriminative approaches treat the predictions from all parametric models in an equal manner, where the confidence of each candidate label being the ground-truth label is not differentiated. In this paper, a boosting-style partial label learning approach is proposed to enabling confidence-rated discriminative modeling. Specifically, the ground-truth confidence of each candidate label is maintained in each boosting round and utilized to train the base classifier. Extensive experiments on artificial as well as real-world partial label data sets validate the effectiveness of the confidence-rated discriminative modeling.
Querying Partially Labelled Data to Improve a K-nn Classifier
Nguyen, Vu-Linh (University of Technology of Compiegne) | Destercke, Sรฉbastien (University of Technology of Compiegne) | Masson, Marie-Helene (University of Technology of Compiegne and Universite de Picardie Jules Verne)
When learning from instances whose output labels may be partial, the problem of knowing which of these output labels should be made precise to improve the accuracy of predictions arises. This problem can be seen as the intersection of two tasks: the one of learning from partial labels and the one of active learning, where the goal is to provide the labels of additional instances to improve the model accuracy. In this paper, we propose querying strategies of partial labels for the well-known K-nn classifier. We propose different criteria of increasing complexity, using among other things the amount of ambiguity that partial labels introduce in the K-nn decision rule. We then show that our strategies usually outperform simple baseline schemes, and that more complex strategies provide a faster improvement of the model accuracies.
From Shared Subspaces to Shared Landmarks: A Robust Multi-Source Classification Approach
Erfani, Sarah M. (The University of Melbourne) | Baktashmotlagh, Mahsa (Queensland Universityof Technology) | Moshtaghi, Masud (The University of Melbourne) | Nguyen, Vinh (The University of Melbourne) | Leckie, Christopher (The University of Melbourne) | Bailey, James (The University of Melbourne) | Ramamohanarao, Kotagiri (The University of Melbourne)
Training machine leaning algorithms on augmented data fromdifferent related sources is a challenging task. This problemarises in several applications, such as the Internet of Things(IoT), where data may be collected from devices with differentsettings. The learned model on such datasets can generalizepoorly due to distribution bias. In this paper we considerthe problem of classifying unseen datasets, given several labeledtraining samples drawn from similar distributions. Weexploit the intrinsic structure of samples in a latent subspaceand identify landmarks, a subset of training instances fromdifferent sources that should be similar. Incorporating subspacelearning and landmark selection enhances generalizationby alleviating the impact of noise and outliers, as well asimproving efficiency by reducing the size of the data. However,since addressing the two issues simultaneously resultsin an intractable problem, we relax the objective functionby leveraging the theory of nonlinear projection and solve atractable convex optimisation. Through comprehensive analysis,we show that our proposed approach outperforms stateof-the-art results on several benchmark datasets, while keepingthe computational complexity low.
Addressing Imbalance in Multi-Label Classification Using Structured Hellinger Forests
Daniels, Zachary Alan (Rutgers University) | Metaxas, Dimitris N. (Rutgers University)
The multi-label classification problem involves finding a model that maps a set of input features to more than one output label. Class imbalance is a serious issue in multi-label classification. We introduce an extension of structured forests, a type of random forest used for structured prediction, called Sparse Oblique Structured Hellinger Forests (SOSHF). We explore using structured forests in the general multi-label setting and propose a new imbalance-aware formulation by altering how the splitting functions are learned in two ways. First, we account for cost-sensitivity when converting the multi-label problem to a single-label problem at each node in the tree. Second, we introduce a new objective function for determining oblique splits based on the Hellinger distance, a splitting criterion that has been shown to be robust to class imbalance. We empirically validate our method on a number of benchmarks against standard and state-of-the-art multi-label classification algorithms with improved results.
Learning to Act by Predicting the Future
Dosovitskiy, Alexey, Koltun, Vladlen
We present an approach to sensorimotor control in immersive environments. Our approach utilizes a high-dimensional sensory stream and a lower-dimensional measurement stream. The cotemporal structure of these streams provides a rich supervisory signal, which enables training a sensorimotor control model by interacting with the environment. The model is trained using supervised learning techniques, but without extraneous supervision. It learns to act based on raw sensory input from a complex three-dimensional environment. The presented formulation enables learning without a fixed goal at training time, and pursuing dynamically changing goals at test time. We conduct extensive experiments in three-dimensional simulations based on the classical first-person game Doom. The results demonstrate that the presented approach outperforms sophisticated prior formulations, particularly on challenging tasks. The results also show that trained models successfully generalize across environments and goals. A model trained using the presented approach won the Full Deathmatch track of the Visual Doom AI Competition, which was held in previously unseen environments.