Learning Tasks and Skills Together From a Human Teacher
Akgun, Baris (Georgia Institute of Technology) | Subramanian, Kaushik (Georgia Institute of Technology) | Shim, Jaeeun (Georgia Institute of Technology) | Thomaz, Andrea Lockerd (Georgia Institute of Technology)
Robot Learning from Demonstration (LfD) research deals with the challenges of enabling humans to teach robots novel skills and tasks (Argall et al. 2009). LfD is practically important because it is impossible to pre-program all the skills and task knowledge that a robot might need during its life-cycle. This opens up many application areas for LfD, ranging from homes to factory floors. An important motivation for our research agenda is that in many practical LfD applications, the teacher will be an everyday end-user, not an expert in Machine Learning or robotics. Thus, our research explores the ways in which Machine Learning can exploit human social learning interactions, an approach known as Socially Guided Machine Learning (SGML).
A Framework for Integration of Logical and Probabilistic Knowledge
Wang, Jingsong (University of South Carolina) | Valtorta, Marco (University of South Carolina)
Integrating the expressive power of first-order logic with the probabilistic reasoning ability of Bayesian networks has attracted the interest of many researchers for decades. We present an approach to integration that translates logical knowledge into Bayesian networks and uses Bayesian network composition to build a uniform representation that supports both logical and probabilistic reasoning. In particular, we propose a new method for translating logical knowledge, called relation search. With the proposed framework, modelers can, without learning new languages or tools, 1) specify special knowledge using the most suitable languages while reasoning in a uniform engine, and 2) make use of pre-existing logical knowledge bases for probabilistic reasoning (to complete the model or minimize potential inconsistencies).
Two Visual Strategies for Solving the Raven’s Progressive Matrices Intelligence Test
Kunda, Maithilee (Georgia Institute of Technology) | McGreggor, Keith (Georgia Institute of Technology) | Goel, Ashok (Georgia Institute of Technology)
We present two visual algorithms, called the affine and fractal methods, each of which solves a considerable portion of the Raven’s Progressive Matrices (RPM) test. The RPM is considered to be one of the premier psychometric measures of general intelligence. Current computational accounts of the RPM assume that visual test inputs are translated into propositional representations before further reasoning takes place. We propose that visual strategies can also solve RPM problems, in line with behavioral evidence showing that humans do use visual strategies to some extent on the RPM. Our two visual methods currently solve RPM problems at the level of typical 9- to 10-year-olds.
Contextually-Based Utility: An Appraisal-Based Approach at Modeling Framing and Decisions
Ito, Jonathan Yasuo (University of Southern California) | Marsella, Stacy (University of Southern California)
Creating accurate computational models of human decision making is a vital step towards the realization of socially intelligent systems capable of both predicting and simulating human behavior. In modeling human decision making, a key factor is the psychological phenomenon known as "framing", in which the preferences of a decision maker change in response to contextual changes in decision problems. Existing approaches treat framing as a one-dimensional contextual influence based on the perception of outcomes as either gains or losses. However, empirical studies have shown that framing effects are much more multifaceted than one-dimensional views of framing suggest. To address this limitation, we propose an integrative approach to modeling framing which combines the psychological principles of cognitive appraisal theories and decision-theoretic notions of utility and probability. We show that this approach allows for both the identification and computation of the salient contextual factors in a decision as well as modeling how they ultimately affect the decision process. Furthermore, we show that our multi-dimensional, appraisal-based approach can account for framing effects identified in the empirical literature which cannot be addressed by one-dimensional theories, thereby promising more accurate models of human behavior.
Heterogeneous Transfer Learning for Image Classification
Zhu, Yin (Hong Kong University of Science and Technology) | Chen, Yuqiang (Shanghai Jiao Tong University) | Lu, Zhongqi (Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Institute for Infocomm Research) | Xue, Gui-Rong (Shanghai Jiao Tong University) | Yu, Yong (Shanghai Jiao Tong University) | Yang, Qiang (Hong Kong University of Science and Technology)
Transfer learning, as a new machine learning paradigm, has gained increasing attention lately. In situations where the training data in a target domain are not sufficient to learn predictive models effectively, transfer learning leverages auxiliary source data from other related source domains for learning. While most existing work in this area has focused only on source data with the same structure as the target data, in this paper we push this boundary further by proposing a heterogeneous transfer learning framework for knowledge transfer between text and images. We observe that for a target-domain classification problem, some annotated images can be found on many social Web sites, which can serve as a bridge to transfer knowledge from the abundant text documents available over the Web. A key question is how to effectively transfer the knowledge in the source data even though the text documents are arbitrary. Our solution is to enrich the representation of the target images with semantic concepts extracted from the auxiliary source data through a novel matrix factorization method. By using the latent semantic features generated from the auxiliary data, we are able to build a better integrated image classifier. We empirically demonstrate the effectiveness of our algorithm on the Caltech-256 image dataset.
Grammatical Error Detection for Corrective Feedback Provision in Oral Conversations
Lee, Sungjin (Pohang University of Science and Technology (POSTECH)) | Noh, Hyungjong (Pohang University of Science and Technology (POSTECH)) | Lee, Kyusong (Pohang University of Science and Technology (POSTECH)) | Lee, Gary Geunbae (Pohang University of Science and Technology (POSTECH))
The demand for computer-assisted language learning systems that can provide corrective feedback on language learners’ speaking has increased. However, it is not a trivial task to detect grammatical errors in oral conversations because of the unavoidable errors of automatic speech recognition systems. To provide corrective feedback, a novel method to detect grammatical errors in speaking performance is proposed. The proposed method consists of two sub-models: the grammaticality-checking model and the error-type classification model. We automatically generate grammatical errors that learners are likely to commit and construct error patterns based on the articulated errors. When a particular speech pattern is recognized, the grammaticality-checking model performs a binary classification based on the similarity between the error patterns and the recognition result using the confidence score. The error-type classification model chooses the error type based on the most similar error pattern and the error frequency extracted from a learner corpus. The grammaticality checking method largely outperformed the two comparative models by 56.36% and 42.61% in F-score while keeping the false positive rate very low. The error-type classification model exhibited very high performance with a 99.6% accuracy rate. Because high precision and a low false positive rate are important criteria for the language-tutoring setting, the proposed method will be helpful for intelligent computer-assisted language learning systems.
Convex Sparse Coding, Subspace Learning, and Semi-Supervised Extensions
Zhang, Xinhua (University of Alberta) | Yu, Yaoliang (University of Alberta) | White, Martha (University of Alberta) | Huang, Ruitong (University of Alberta) | Schuurmans, Dale (University of Alberta)
Automated feature discovery is a fundamental problem in machine learning. Although classical feature discovery methods do not guarantee optimal solutions in general, it has been recently noted that certain subspace learning and sparse coding problems can be solved efficiently, provided the number of features is not restricted a priori. We provide an extended characterization of this optimality result and describe the nature of the solutions under an expanded set of practical contexts. In particular, we apply the framework to a semi-supervised learning problem, and demonstrate that feature discovery can co-occur with input reconstruction and supervised training while still admitting globally optimal solutions. A comparison to existing semi-supervised feature discovery methods shows improved generalization and efficiency.
Fast Newton-CG Method for Batch Learning of Conditional Random Fields
Tsuboi, Yuta (IBM Research - Tokyo) | Unno, Yuya (Preferred Infrastructure, Inc.) | Kashima, Hisashi (The University of Tokyo) | Okazaki, Naoaki (Tohoku University)
We propose a fast batch learning method for linear-chain Conditional Random Fields (CRFs) based on Newton-CG methods. Newton-CG methods are a variant of Newton's method for high-dimensional problems. They require only Hessian-vector products instead of the full Hessian matrices. To speed up Newton-CG methods for CRF learning, we derive a novel dynamic programming procedure for the Hessian-vector products of the CRF objective function. The proposed procedure can reuse the byproducts of the time-consuming gradient computation for the Hessian-vector products to drastically reduce the total computation time of the Newton-CG methods. In experiments with tasks in natural language processing, the proposed method outperforms a conventional quasi-Newton method. Remarkably, the proposed method is competitive with online learning algorithms that are fast but unstable.
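The core idea of Newton-CG, solving for the Newton step with conjugate gradient so that only Hessian-vector products are needed, can be illustrated with a minimal sketch. The example below is not the CRF dynamic programming procedure from the abstract: it assumes L2-regularized logistic regression as a stand-in objective, and all function names (`grad_and_hvp`, `conjugate_gradient`, `newton_cg`) are hypothetical. It does mirror one point from the abstract in miniature: the probabilities computed for the gradient are reused by every Hessian-vector product. No line search is included, so this is suitable only for small, well-conditioned problems.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def grad_and_hvp(w, X, y, lam):
    """Gradient at w, plus a closure computing Hessian-vector products.

    Stand-in objective (an assumption, not the CRF objective):
    sum_i log(1 + exp(-y_i * x_i.w)) + (lam / 2) * ||w||^2, y_i in {-1, +1}.
    """
    probs = [sigmoid(dot(x, w)) for x in X]  # byproduct reused by hvp below
    grad = [lam * wj for wj in w]
    for x, yi, p in zip(X, y, probs):
        target = 1.0 if yi > 0 else 0.0
        for j, xj in enumerate(x):
            grad[j] += (p - target) * xj

    def hvp(v):
        # H v = sum_i p_i (1 - p_i) (x_i.v) x_i + lam * v; H is never formed.
        out = [lam * vj for vj in v]
        for x, p in zip(X, probs):
            c = p * (1.0 - p) * dot(x, v)
            for j, xj in enumerate(x):
                out[j] += c * xj
        return out

    return grad, hvp

def conjugate_gradient(hvp, b, rel_tol=1e-12, max_iter=50):
    """Approximately solve H d = b using only products H v."""
    d = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rs = dot(r, r)
    threshold = rel_tol * rs
    for _ in range(max_iter):
        if rs <= threshold:
            break
        Hp = hvp(p)
        alpha = rs / dot(p, Hp)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return d

def newton_cg(X, y, lam=1.0, steps=20):
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad, hvp = grad_and_hvp(w, X, y, lam)
        if dot(grad, grad) < 1e-16:
            break
        step = conjugate_gradient(hvp, [-g for g in grad])
        w = [wj + sj for wj, sj in zip(w, step)]
    return w
```

Calling `newton_cg(X, y)` on a small list of feature vectors `X` with labels `y` in {-1, +1} returns the fitted weight vector. Each outer iteration costs one gradient evaluation plus a handful of Hessian-vector products, which is why making those products cheap matters.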
A Generalised Solution to the Out-of-Sample Extension Problem in Manifold Learning
Strange, Harry (Aberystwyth University) | Zwiggelaar, Reyer (Aberystwyth University)
Manifold learning is a powerful tool for reducing the dimensionality of a dataset by finding a low-dimensional embedding that retains important geometric and topological features. In many applications it is desirable to add new samples to a previously learnt embedding; this process of adding new samples is known as the out-of-sample extension problem. Since many manifold learning algorithms do not naturally allow for new samples to be added, we present an easy-to-implement generalized solution to the problem that can be used with any existing manifold learning algorithm. Our algorithm is based on simple geometric intuition about the local structure of a manifold, and our results show that it can be effectively used to add new samples to a previously learnt embedding. We test our algorithm on both artificial and real-world image data and show that our method significantly outperforms existing out-of-sample extension strategies.
Markov Logic Sets: Towards Lifted Information Retrieval Using PageRank and Label Propagation
Neumann, Marion (Fraunhofer IAIS) | Ahmadi, Babak (Fraunhofer IAIS) | Kersting, Kristian (Fraunhofer IAIS)
Inspired by “Google™ Sets” and Bayesian sets, we consider the problem of retrieving complex objects and relations among them, i.e., ground atoms from a logical concept, given a query consisting of a few atoms from that concept. We formulate this as a within-network relational learning problem using only a few labels and describe an algorithm that ranks atoms using a score based on random walks with restart (RWR): the probability that a random surfer hits an atom starting from the query atoms. Specifically, we compute an initial ranking using personalized PageRank. Then, we find paths of atoms that are connected via their arguments and variablize the ground atoms in each path in order to create features for the query. These features are used to re-personalize the original RWR and to finally compute the set completion, based on Label Propagation. Moreover, we exploit the fact that RWR techniques can naturally be lifted and show that lifted inference for label propagation is possible. We evaluate our algorithm on a real-world relational dataset by finding completions of sets of objects describing the Roman city of Pompeii. We compare to Bayesian sets and show that our approach gives very reasonable set completions.
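The initial ranking step, random walks with restart from the query atoms, can be sketched as a short power iteration. This is only a minimal illustration of personalized PageRank on a plain graph, not the lifted or re-personalized algorithm the abstract describes; the function name `random_walk_with_restart`, the adjacency-list representation, and the restart probability of 0.15 are assumptions made for the example.

```python
def random_walk_with_restart(graph, query, restart=0.15, iters=100):
    """Score nodes by personalized PageRank (random walk with restart).

    graph: dict mapping each node to a list of its out-neighbours.
    query: iterable of nodes from which the walk restarts.
    """
    nodes = list(graph)
    query = set(query)
    # Restart distribution: uniform over the query nodes.
    restart_dist = {n: (1.0 / len(query) if n in query else 0.0) for n in nodes}
    scores = dict(restart_dist)
    for _ in range(iters):
        # With probability `restart`, jump back to a query node.
        nxt = {n: restart * restart_dist[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if not out:
                # Dangling node: return its mass to the restart distribution.
                for m in nodes:
                    nxt[m] += (1.0 - restart) * scores[n] * restart_dist[m]
                continue
            # Otherwise, follow an outgoing edge chosen uniformly at random.
            share = (1.0 - restart) * scores[n] / len(out)
            for m in out:
                nxt[m] += share
        scores = nxt
    return scores
```

On a toy path graph such as `{"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}` with query `["a"]`, nodes closer to the query receive more of the probability mass than distant ones, which is the property the ranking relies on.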