Inductive Learning
Maximum Entropy Context Models for Ranking Biographical Answers to Open-Domain Definition Questions
Figueroa, Alejandro (Yahoo! Research Latin America) | Atkinson, John (Universidad de Concepcion)
In the context of question-answering systems, there are several strategies for scoring candidate answers to definition queries including centroid vectors, bi-term and context language models. These techniques use only positive examples (i.e., descriptions) when building their models. In this work, a maximum entropy based extension is proposed for context language models so as to account for regularities across non-descriptions mined from web-snippets. Experiments show that this extension outperforms other strategies increasing the precision of the top five ranked answers by more than 5%. Results suggest that web-snippets are a cost-efficient source of non-descriptions, and that some relationships extracted from dependency trees are effective to mine for candidate answer sentences.
Partially Supervised Text Classification with Multi-Level Examples
Liu, Tao (Renmin University of China) | Du, Xiaoyong (Renmin University of China) | Xu, Yongdong (Harbin Institute of Technology) | Li, Minghui (Microsoft) | Wang, Xiaolong (Harbin Institute of Technology)
Partially supervised text classification has received great research attention since it only uses positive and unlabeled examples as training data. This problem can be solved by automatically labeling some negative (and more positive) examples from unlabeled examples before training a text classifier. But it is difficult to guarantee both high quality and quantity of the new labeled examples. In this paper, a multi-level example based learning method for partially supervised text classification is proposed, which can make full use of all unlabeled examples. A heuristic method is proposed to assign possible labels to unlabeled examples and partition them into multiple levels according to their labeling confidence. A text classifier is trained on these multi-level examples using weighted support vector machines. Experiments show that the multi-level example based learning method is effective for partially supervised text classification, and outperforms the existing popular methods such as Biased-SVM, ROC-SVM, S-EM and WL.
Heterogeneous Transfer Learning with RBMs
Wei, Bin (University of Rochester) | Pal, Christopher (Ecole Polytechnique de Montreal)
A common approach in machine learning is to use a large amount of labeled data to train a model. Usually this model can then only be used to classify data in the same feature space. However, labeled data is often expensive to obtain. A number of strategies have been developed by the machine learning community in recent years to address this problem, including: semi-supervised learning,domain adaptation,multi-task learning,and self-taught learning. While training data and test may have different distributions, they must remain in the same feature set. Furthermore, all the above methods work in the same feature space. In this paper, we consider an extreme case of transfer learning called heterogeneous transfer learning — where the feature spaces of the source task and the target tasks are disjoint. Previous approaches mostly fall in the multi-view learning category, where co-occurrence data from both feature spaces is required. We generalize the previous work on cross-lingual adaptation and propose a multi-task strategy for the task. We also propose the use of a restricted Boltzmann machine (RBM), a special type of probabilistic graphical models, as an implementation. We present experiments on two tasks: action recognition and cross-lingual sentiment classification.
Learning Instance Specific Distance for Multi-Instance Classification
Wang, Hua (University of Texas at Arlington) | Nie, Feiping (University of Texas at Arlington) | Huang, Heng (University of Texas at Arlington)
Multi-Instance Learning (MIL) deals with problems where each training example is a bag, and each bag contains a set of instances. Multi-instance representation is useful in many real world applications, because it is able to capture more structural information than traditional flat single-instance representation. However, it also brings new challenges. Specifically, the distance between data objects in MIL is a set-to-set distance, which is harder to estimate than vector distances used in single-instance data. Moreover, because in MIL labels are assigned to bags instead of instances, although a bag belongs to a class, some, or even most, of its instances may not be truly related to the class. In order to address these difficulties, in this paper we propose a novel Instance Specific Distance (ISD) method for MIL, which computes the Class-to-Bag (C2B) distance by further considering the relevances of training instances with respect to their labeled classes. Taking into account the outliers caused by the weak label association in MIL, we learn ISD by solving an l0+-norm minimization problem. An efficient algorithm to solve the optimization problem is presented, together with the rigorous proof of its convergence. The promising results on five benchmark multi-instance data sets and two real world multi-instance applications validate the effectiveness of the proposed method.
A Feasible Nonconvex Relaxation Approach to Feature Selection
Gao, Cuixia (Zhejiang University) | Wang, Naiyan (Zhejiang University) | Yu, Qi (Zhejiang University) | Zhang, Zhihua (Zhejiang University)
Variable selection problems are typically addressed under apenalized optimization framework. Nonconvex penalties such as the minimax concave plus (MCP) and smoothly clipped absolute deviation(SCAD), have been demonstrated to have the properties of sparsity practically and theoretically. In this paper we propose a new nonconvex penalty that we call exponential-type penalty. The exponential-type penalty is characterized by a positive parameter,which establishes a connection with the ell 0 and ell 1 penalties.We apply this new penalty to sparse supervised learning problems. To solve to resulting optimization problem, we resort to a reweighted ell 1 minimization method. Moreover, we devise an efficient method for the adaptive update of the tuning parameter. Our experimental results are encouraging. They show that the exponential-type penalty is competitive with MCP and SCAD.
Selective Transfer Between Learning Tasks Using Task-Based Boosting
Eaton, Eric (Bryn Mawr College) | desJardins, Marie (University of Maryland Baltimore County)
The success of transfer learning on a target task is highly dependent on the selected source data. Instance transfer methods reuse data from the source tasks to augment the training data for the target task. If poorly chosen, this source data may inhibit learning, resulting in negative transfer. The current most widely used algorithm for instance transfer, TrAdaBoost, performs poorly when given irrelevant source data. We present a novel task-based boosting technique for instance transfer that selectively chooses the source knowledge to transfer to the target task. Our approach performs boosting at both the instance level and the task level, assigning higher weight to those source tasks that show positive transferability to the target task, and adjusting the weights of individual instances within each source task via AdaBoost. We show that this combination of task- and instance-level boosting significantly improves transfer performance over existing instance transfer algorithms when given a mix of relevant and irrelevant source data, especially for small amounts of data on the target task.
Similarity-Based Approach for Positive and Unlabelled Learning
Xiao, Yanshan (University of Technology, Sydney) | Liu, Bo (South China University of Technology) | Yin, Jie (CSIRO ICT Centre) | Cao, Longbing (University of Technology, Sydney) | Zhang, Chengqi (University of Technology, Sydney) | Hao, Zhifeng (Guangdong University of Technology)
Positive and unlabelled learning (PU learning) has been investigated to deal with the situation where only the positive examples and the unlabelled examples are available. Most of the previous works focus on identifying some negative examples from the unlabelled data, so that the supervised learning methods can be applied to build a classifier. However, for the remaining unlabelled data, which can not be explicitly identified as positive or negative (we call them ambiguous examples), they either exclude them from the training phase or simply enforce them to either class. Consequently, their performance may be constrained. This paper proposes a novel approach, called similarity-based PU learning (SPUL) method, by associating the ambiguous examples with two similarity weights, which indicate the similarity of an ambiguous example towards the positive class and the negative class, respectively. The local similarity-based and global similarity-based mechanisms are proposed to generate the similarity weights. The ambiguous examples and their similarity-weights are thereafter incorporated into an SVM-based learning phase to build a more accurate classifier. Extensive experiments on real-world datasets have shown that SPUL outperforms state-of-the-art PU learning methods.
A Natural Language Question Answering System as a Participant in Human Q&A Portals
Dong, Tiansi (University of Hagen) | Furbach, Ulrich (University Koblenz-Landau) | Glöckner, Ingo (University of Hagen) | Pelzer, Björn (University Koblenz-Landau)
LogAnswer is a question answering (QA) system for the German language, aimed at providing concise and correct answers to arbitrary questions. For this purpose LogAnswer is designed as an embedded artificial intelligence system which integrates methods from several fields of AI, namely natural language processing, machine learning, knowledge representation and automated theorem proving. We intend to employ LogAnswer as a virtual user within Internet-based QA forums, where it must be able to identify the questions that it cannot answer correctly, a task that normally receives little attention in QA research compared to the actual answer derivation. The paper presents a machine learning solution to the wrong answer avoidance (WAA) problem, applying a meta classifier to the output of simple term-based classifiers and a rich set of other WAA features. Experiments with a large set of real-world questions from a QA forum show that the proposed method significantly improves the WAA characteristics of our system.
Integrating Task Planning and Interactive Learning for Robots to Work in Human Environments
Agostini, Alejandro Gabriel (Institut de Robotica i Informatica Industrial (CSIC-UPC)) | Torras, Carme (Institut de Robotica i Informatica Industrial (CSIC-UPC)) | Wörgötter, Florentin (Bernstein Center for Computational Neuroscience)
Human environments are challenging for robots, which need to be trainable by lay people and learn new behaviours rapidly without disrupting much the ongoing activity. A system that integrates AI techniques for planning and learning is here proposed to satisfy these strong demands. The approach rapidly learns planning operators from few action experiences using a competitive strategy where many alternatives of cause-effect explanations are evaluated in parallel, and the most successful ones are used to generate the operators. The success of a cause-effect explanation is evaluated by a probabilistic estimate that compensates the lack of experience, producing more confident estimations and speeding up the learning in relation to other known estimates. The system operates without task interruption by integrating in the planning-learning loop a human teacher that supports the planner in making decisions. All the mechanisms are integrated and synchronized in the robot using a general decision-making framework. The feasibility and scalability of the architecture are evaluated in two different robot platforms: a Stäubli arm, and the humanoid ARMAR III.
Fusion of Multiple Features and Supervised Learning for Chinese OOV Term Detection and POS Guessing
Zhang, Yuejie (Fudan University) | Cen, Lei (Fudan University) | Wu, Wei (Fudan University) | Jin, Cheng (Fudan University) | Xue, Xiangyang (Fudan University)
In this paper, to support more precise Chinese Out-of-Vocabulary (OOV) term detection and Part-of-Speech (POS) guessing, a unified mechanism is proposed and formulated based on the fusion of multiple features and supervised learning. Besides all the traditional features, the new features for statistical information and global contexts are introduced, as well as some constraints and heuristic rules, which reveal the relationships among OOV term candidates. Our experiments on the Chinese corpora from both People’s Daily and SIGHAN 2005 have achieved the consistent results, which are better than those acquired by pure rule-based or statistics-based models. From the experimental results for combining our model with Chinese monolingual retrieval on the data sets of TREC-9, it is found that the obvious improvement for the retrieval performance can also be obtained.