Goto

Collaborating Authors

 Asia


Learning from Natural Instructions

AAAI Conferences

Machine learning is traditionally formalized and researched as the study of learning concepts and decision functions from labeled examples, requiring a representation that encodes information about the domain of the decision function to be learned. We are interested in providing a way for a human teacher to interact with an automated learner using natural instructions, thus allowing the teacher to communicate the relevant domain expertise to the learner without necessarily knowing anything about the internal representations used in the learning process. In this paper we suggest to view the process of learning a decision function as a natural language lesson interpretation problem instead of learning from labeled examples. This interpretation of machine learning is motivated by human learning processes, in which the learner is given a lesson describing the target concept directly, and a few instances exemplifying it. We introduce a learning algorithm for the lesson interpretation problem that gets feedback from its performance on the final task, while learning jointly (1) how to interpret the lesson and (2) how to use this interpretation to do well on the final task. his approach alleviates the supervision burden of traditional machine learning by focusing on supplying the learner with only human-level task expertise for learning. We evaluate our approach by applying it to the rules of the Freecell solitaire card game. We show that our learning approach can eventually use natural language instructions to learn the target concept and play the game legally. Furthermore, we show that the learned semantic interpreter also generalizes to previously unseen instructions.


Short Text Classification Improved by Learning Multi-Granularity Topics

AAAI Conferences

Understanding the rapidly growing short text is very important. Short text is different from traditional documents in its shortness and sparsity, which hinders the application of conventional machine learning and text mining algorithms. Two major approaches have been exploited to enrich the representation of short text. One is to fetch contextual information of a short text to directly add more text; the other is to derive latent topics from existing large corpus, which are used as features to enrich the representation of short text. The latter approach is elegant and efficient in most cases. The major trend along this direction is to derive latent topics of certain granularity through well-known topic models such as latent Dirichlet allocation (LDA). However, topics of certain granularity are usually not sufficient to set up effective feature spaces. In this paper, we move forward along this direction by proposing an method to leverage topics at multiple granularity, which can model the short text more precisely. Taking short text classification as an example, we compared our proposed method with the state-of-the-art baseline over one open data set. Our method reduced the classification error by 20.25% and 16.68%respectively on two classifiers.


Semantic Relationship Discovery with Wikipedia Structure

AAAI Conferences

Thanks to the idea of social collaboration, Wikipedia has accumulated vast amount of semi-structured knowledge in which the link structure reflects human's cognition on semantic relationship to some extent. In this paper, we proposed a novel method RCRank to jointly compute concept-concept relatedness and concept-category relatedness base on the assumption that information carried in concept-concept links and concept-category links can mutually reinforce each other. Different from previous work, RCRank can not only find semantically related concepts but also interpret their relations by categories. Experimental results on concept recommendation and relation interpretation show that our method substantially outperforms classical methods.


Learning Bilingual Lexicons Using the Visual Similarity of Labeled Web Images

AAAI Conferences

Speakers of many different languages use the Internet. A common activity among these users is uploading images and associating these images with words (in their own language) as captions, filenames, or surrounding text. We use these explicit, monolingual, image-to-word connections to successfully learn implicit, bilingual, word-to-word translations. Bilingual pairs of words are proposed as translations if their corresponding images have similar visual features. We generate bilingual lexicons in 15 language pairs, focusing on words that have been automatically identified as physical objects. The use of visual similarity substantially improves performance over standard approaches based on string similarity: for generated lexicons with 1000 translations, including visual information leads to an absolute improvement in accuracy of 8-12% over string edit distance alone.


Learning Cause Identifiers from Annotator Rationales

AAAI Conferences

In the aviation safety research domain, cause identification refers to the task of identifying the possible causes responsible for the incident describedin an aviation safety incident report. This task presents a number of challenges, including the scarcity of labeled data and the difficulties in finding the relevant portions of the text. We investigate the use of annotator rationales to overcome these challenges, proposing several new ways of utilizing rationales and showing that through judicious use of the rationales, it is possible to achieve significant improvement over a unigram SVM baseline.


Multi-Select Faceted Navigation Based on Minimum Description Length Principle

AAAI Conferences

Faceted navigation can effectively reduce user efforts of reaching targeted resources in databases, by suggesting dynamic facet values for iterative query refinement. A key issue is minimizing the navigation cost in a user query session. Conventional navigation scheme assumes that at each step, users select only one suggested value to figure out resources containing it. To make faceted navigation more flexible and effective, this paper introduces a multi-select scheme where multiple suggested values can be selected at one step, and a selected value can be used to either retain or exclude the resources containing it. Previous algorithms for cost-driven value suggestion can hardly work well under our navigation scheme. Therefore, we propose to optimize the navigation cost using the Minimum Description Length principle, which can well balance the number of navigation steps and the number of suggested values per step under our new scheme. An emperical study demonstrates that our approach is more cost-saving and efficient than state-of-the-art approaches.


Effective and Efficient Microprocessor Design Space Exploration Using Unlabeled Design Configurations

AAAI Conferences

During the design of a microprocessor, Design Space Exploration (DSE) is a critical step which determines the appropriate design configuration of the microprocessor. In the computer architecture community, supervised learning techniques have been applied to DSE to build models for predicting the qualities of design configurations. For supervised learning, however, considerable simulation costs are required for attaining the labeled design configurations. Given limited resources, it is difficult to achieve high accuracy. In this paper, inspired by recent advances in semi-supervised learning, we propose the COMT approach which can exploit unlabeled design configurations to improve the models. In addition to an improved predictive accuracy, COMT is able to guide the design of microprocessors, owing to the use of comprehensible model trees. Empirical study demonstrates that COMT significantly outperforms state-of-the-art DSE technique through reducing mean squared error by 30% to 84%, and thus, promising architectures can be attained more efficiently.


OCS-14: You Can Get Occluded in Fourteen Ways

AAAI Conferences

Occlusions are a central phenomenon in multi-object computer vision. However, formal analyses (LOS14, ROC20) proposed in the spatial reasoning literature ignore many distinctions crucial to computer vision, as a result of which these algebras have been largely ignored in vision applications. Two distinctions of relevance to visual computation are (a) whether the occluder is a moving object or part of the static background, and (b) whether the visible part of an object is a connected blob or fragmented. In this work, we develop a formal model of occlusion states that combines these criteria with overlap distinctions modeled in spatial reasoning to come up with a comprehensive set of fourteen occlusion states, which we define as OCS14. Transitions between these occlusion states are an important source of information on visual activity (e.g. splits and merges). We show that the resulting formalism is representationally complete in the sense that these states constitute a partition of all possible occlusion situations based on these criteria. Finally, we show results from implementations of this approach in a test application involving static camera based scene analysis, where occlusion state analysis and multiple object tracking can be used for two tasks -- (a) identifying static occluders, and (b) modeling a class of interactions represented as transitions of occlusion states. Thus, the formalism is shown to have direct relevance to actual vision applications.


Finding "Unexplained" Activities in Video

AAAI Conferences

Consider a video surveillance application that monitors some location. The application knows a set of activity models (that are either normal or abnormal or both), but in addition, the application wants to find video segments that are unexplained by any of the known activity models โ€” these unexplained video segments may correspond to activities for which no previous activity model existed. In this paper, we formally define what it means for a given video segment to be unexplained (totally or partially) w.r.t. a given set of activity models and a probability threshold. We develop two algorithms โ€“ FindTUA and FindPUA โ€“ to identify Totally and Partially Unexplained Activities respectively, and show that both algorithms use important pruning methods. We report on experiments with a prototype implementation showing that the algorithms both run efficiently and are accurate.


Pattern Field Classification with Style Normalized Transformation

AAAI Conferences

Field classification is an extension of the traditional classification framework, by breaking the i.i.d. assumption. In field classification, patterns occur as groups (fields) of homogeneous styles. By utilizing style consistency, classifying groups of patterns is often more accurate than classifying single patterns. In this paper, we extend the Bayes decision theory, and develop the Field Bayesian Model (FBM) to deal with field classification. Specifically, we propose to learn a Style Normalized Transformation (SNT) for each field. Via the SNTs, the data of different fields are transformed to a uniform style space (i.i.d. space). The proposed model is a general and systematic framework, under which many probabilistic models can be easily extended for field classification. To transfer the model to unseen styles, we propose a transductive model called Transfer Bayesian Rule (TBR) based on self-training. We conducted extensive experiments on face, speech and a large-scale handwriting dataset, and got significant error rate reduction compared to the state-of-the-art methods.