Statistical Learning


Learning from Concept Drifting Data Streams with Unlabeled Data

AAAI Conferences

Contrary to the previous assumption that all arriving streaming data are labeled and that class labels are immediately available, we propose a Semi-supervised classification algorithm for data streams with concept drifts and UNlabeled data, called SUN. SUN is based on an evolved decision tree. Concept drifts are distinguished from noise at the leaves according to the deviation between historical concept clusters and new ones generated by a clustering algorithm based on k-Modes. Extensive studies on both synthetic and real data demonstrate that SUN performs well compared to several known online algorithms on unlabeled data. We therefore conclude that SUN provides a feasible reference framework for tackling concept drifting data streams with unlabeled data.
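The abstract does not specify how SUN measures the deviation between clusters, so the following is only a minimal sketch of the general idea under stated assumptions: each cluster of categorical records is summarized by its mode (as in k-Modes), and deviation is taken to be the Hamming distance between the historical mode and the new mode. The function names `hamming` and `cluster_mode` and the sample records are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Count the attribute positions at which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Return the per-attribute most frequent value of a cluster (its k-Modes center)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


# Hypothetical records stored at one leaf: an older cluster and a newly formed one.
history_cluster = [("a", "x"), ("a", "y"), ("a", "x")]
new_cluster = [("b", "x"), ("b", "x"), ("a", "x")]

# Assumption: a large distance between modes hints at drift, a small one at noise.
deviation = hamming(cluster_mode(history_cluster), cluster_mode(new_cluster))
print(deviation)
```

Any threshold that turns this deviation into a drift-or-noise decision would have to come from the paper itself; none is assumed here.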


Control Model Learning for Whole-Body Mobile Manipulation

AAAI Conferences

The ability to discover the effects of actions and apply this knowledge during goal-oriented action selection is a fundamental requirement of embodied intelligent agents. In our ongoing work, we hope to demonstrate the utility of learned control models for whole-body mobile manipulation. In this short paper we discuss preliminary work on learning a forward model of the dynamics of a balancing robot exploring simple arm movements. This model is then used to construct whole-body control strategies for regulating state variables using arm motion.


Interactive Categorization of Containers and Non-Containers by Unifying Categorizations Derived from Multiple Exploratory Behaviors

AAAI Conferences

The ability to form object categories is an important milestone in human infant development (Cohen 2003). We propose a framework that allows a robot to form a unified object categorization from several interactions with objects. This framework is consistent with the principle that robot learning should be ultimately grounded in the robot's perceptual and behavioral repertoire (Stoytchev 2009). This paper builds upon our previous work (Griffith et al. 2009) by adding more exploratory behaviors (now 6 instead of 1) and by employing consensus clustering for finding a single, unified object categorization. The framework was tested on a container/non-container categorization task with 20 objects.
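Consensus clustering can be realized in several ways, and the abstract does not say which variant is used. As a rough sketch of one common approach, assuming a co-association scheme, two objects are placed in the same unified category when more than a chosen fraction of the individual clusterings put them together. The function `consensus`, the `threshold` parameter and the sample labelings are illustrative assumptions, not the paper's method.

```python
def consensus(clusterings, threshold=0.5):
    """Merge objects that share a cluster in more than `threshold` of the clusterings."""
    n = len(clusterings[0])
    parent = list(range(n))  # union-find forest over object indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            agree = sum(1 for labels in clusterings if labels[i] == labels[j])
            if agree / len(clusterings) > threshold:
                parent[find(i)] = find(j)

    # Relabel the union-find roots as consecutive category ids 0, 1, 2, ...
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


# Hypothetical labelings of four objects, one per exploratory behavior.
per_behavior = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
print(consensus(per_behavior))
```

Note that the label values themselves carry no meaning across behaviors; only whether two objects share a label within one clustering is used.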


Evolving Compiler Heuristics to Manage Communication and Contention

AAAI Conferences

As computer architectures become increasingly complex, hand-tuning compiler heuristics becomes increasingly tedious and time consuming for compiler developers. This paper presents a case study that uses a genetic algorithm to learn a compiler policy. The target policy implicitly balances communication and contention among processing elements of the TRIPS processor, a physically realized prototype chip. We learn specialized policies for individual programs as well as general policies that work well across all programs. We also employ a two-stage method that first classifies the code being compiled based on salient characteristics, and then chooses a specialized policy based on that classification. This work is particularly interesting for the AI community because it 1) emphasizes the need for increased collaboration between AI researchers and researchers from other branches of computer science and 2) discusses a machine learning setup where training on the custom hardware requires weeks of training, rather than the more typical minutes or hours.
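To make the idea of learning a policy with a genetic algorithm concrete, here is a minimal sketch under stated assumptions. A policy is represented as a list of real-valued weights, and `toy_fitness` is a hypothetical stand-in for the expensive step the abstract describes (compiling programs and measuring them on hardware). The selection, crossover and mutation choices are generic and are not taken from the paper.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30, mut_sigma=0.1, seed=0):
    """Evolve a weight vector that maximizes `fitness`; returns (best, history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        elite = ranked[: pop_size // 2]  # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            child = [g + rng.gauss(0, mut_sigma) for g in child]  # Gaussian mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness), history


def toy_fitness(genes):
    """Hypothetical fitness: higher is better, peak at all weights equal to 0.5."""
    return -sum((g - 0.5) ** 2 for g in genes)


best, history = evolve(toy_fitness, n_genes=4)
print(best, history[0], toy_fitness(best))
```

Because the elite half survives each generation untouched, the best fitness never decreases. In a setting where each evaluation is as costly as the abstract suggests, caching fitness values rather than recomputing them on every sort would matter far more than it does in this toy.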


A Layered Approach to People Detection in 3D Range Data

AAAI Conferences

People tracking is a key technology for autonomous systems, intelligent cars and social robots operating in populated environments. What makes the task difficult is that the appearance of humans in range data can change drastically as a function of body pose, distance to the sensor, self-occlusion and occlusion by other objects. In this paper we propose a novel approach to pedestrian detection in 3D range data based on supervised learning techniques to create a bank of classifiers for different height levels of the human body. In particular, our approach applies AdaBoost to train a strong classifier from geometrical and statistical features of groups of neighboring points at the same height. In a second step, the AdaBoost classifiers mutually enforce their evidence across different heights by voting into a continuous space. Pedestrians are finally found efficiently by mean-shift search for local maxima in the voting space. Experimental results carried out with 3D laser range data illustrate the robustness and efficiency of our approach even in cluttered urban environments. The learned people detector reaches a classification rate up to 96% from a single 3D scan.
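The final step, finding local maxima in a continuous voting space by mean shift, can be sketched as follows. This is a generic mean-shift iteration with a Gaussian kernel over weighted 2D votes; the vote layout, the `bandwidth` value and the function name `mean_shift_mode` are assumptions for illustration, and the paper's actual voting space and kernel may differ.

```python
import math


def mean_shift_mode(votes, start, bandwidth, iters=100, tol=1e-6):
    """Climb from `start` to a local density maximum of weighted 2D votes."""
    x, y = start
    for _ in range(iters):
        wsum = sx = sy = 0.0
        for vx, vy, weight in votes:
            d2 = (vx - x) ** 2 + (vy - y) ** 2
            k = weight * math.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel
            wsum += k
            sx += k * vx
            sy += k * vy
        if wsum == 0.0:
            break  # no vote is close enough to influence the estimate
        nx, ny = sx / wsum, sy / wsum  # kernel-weighted mean of the votes
        if math.hypot(nx - x, ny - y) < tol:
            return nx, ny
        x, y = nx, ny
    return x, y


# Hypothetical votes (x, y, weight) cast by per-height classifiers for two people.
votes = [(0.0, 0.0, 1.0), (0.2, 0.0, 1.0), (0.0, 0.2, 1.0),
         (10.0, 10.0, 1.0), (10.2, 10.0, 1.0)]
print(mean_shift_mode(votes, start=(0.5, 0.5), bandwidth=1.0))
print(mean_shift_mode(votes, start=(9.5, 9.5), bandwidth=1.0))
```

Starting the search from each vote and merging modes that end up close together would yield one detection per local maximum.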


Community-Guided Learning: Exploiting Mobile Sensor Users to Model Human Behavior

AAAI Conferences

Modeling human behavior requires vast quantities of accurately labeled training data, but for ubiquitous people-aware applications such data is rarely attainable. Even researchers make mistakes when labeling data, and consistent, reliable labels from low-commitment users are rare. In particular, users may give identical labels to activities with characteristically different signatures (e.g., labeling eating at home or at a restaurant as "dinner") or may give different labels to the same context (e.g., "work" vs. "office"). In this scenario, labels are unreliable but nonetheless contain valuable information for classification. To facilitate learning in such unconstrained labeling scenarios, we propose Community-Guided Learning (CGL), a framework that allows existing classifiers to learn robustly from unreliably-labeled user-submitted data. CGL exploits the underlying structure in the data and the unconstrained labels to intelligently group crowd-sourced data. We demonstrate how to use similarity measures to determine when and how to split and merge contributions from different labeled categories and present experimental results that demonstrate the effectiveness of our framework.


Activity and Gait Recognition with Time-Delay Embeddings

AAAI Conferences

Activity recognition based on data from mobile wearable devices is becoming an important application area for machine learning. We propose a novel approach based on a combination of feature extraction using time-delay embedding and supervised learning. The computational requirements are considerably lower than existing approaches, so the processing can be done in real time on a low-powered portable device such as a mobile phone. We evaluate the performance of our algorithm on a large, noisy data set comprising over 50 hours of data from six different subjects, including activities such as running and walking up or down stairs. We also demonstrate the ability of the system to accurately classify an individual from a set of 25 people, based only on the characteristics of their walking gait. The system requires very little parameter tuning, and can be trained with small amounts of data.
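For readers unfamiliar with time-delay embedding, the basic construction is simple: a scalar signal x is turned into vectors (x[t], x[t+tau], ..., x[t+(dim-1)*tau]). The sketch below shows only this construction; the name `delay_embed`, the parameter values and the sample signal are illustrative assumptions, and the features the paper derives from the embedding are not reproduced here.

```python
def delay_embed(signal, dim, tau):
    """Return the delay vectors (x[t], x[t+tau], ..., x[t+(dim-1)*tau]) of a signal."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    span = (dim - 1) * tau  # distance between the first and last sample of a vector
    return [
        tuple(signal[t + k * tau] for k in range(dim))
        for t in range(len(signal) - span)
    ]


# Hypothetical accelerometer magnitudes sampled at a fixed rate.
samples = [0, 1, 2, 3, 4, 5]
print(delay_embed(samples, dim=3, tau=2))  # [(0, 2, 4), (1, 3, 5)]
```

Each vector costs only `dim` lookups to build, which is consistent with the abstract's point that the processing is light enough for a low-powered device.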


Commonsense Knowledge Mining from the Web

AAAI Conferences

Good and generous knowledge sources, reliable and efficient induction patterns, and automatic and controllable quality assertion approaches are three critical issues in commonsense knowledge (CSK) acquisition. This paper employs Open Mind Common Sense (OMCS), a volunteer-contributed CSK database, to study the first and the third issues. For stylized CSK, our results show that over 40% of CSK for four predicate types in OMCS can be found on the web, which contradicts the assumption that CSK is not communicated in texts. Moreover, we propose a commonsense knowledge classifier trained on OMCS, and achieve high precision for some predicate types, e.g., 82.6% for HasProperty. The promising results suggest new ways of analyzing and utilizing volunteer-contributed knowledge to design systems that automatically mine commonsense knowledge from the web.


Towards an Intelligent Code Search Engine

AAAI Conferences

Software developers increasingly rely on information from the Web, such as documents or code examples on Application Programming Interfaces (APIs), to facilitate their development processes. However, API documents often do not include enough information for developers to fully understand API usage, while searching for good code examples requires non-trivial effort. To address this problem, we propose a novel code search engine that combines the strengths of browsing documents and searching for code examples by returning documents embedded with high-quality code example summaries mined from the Web. Our evaluation results show that our approach provides code examples with high precision and boosts programmer productivity.


Visual Contextual Advertising: Bringing Textual Advertisements to Images

AAAI Conferences

Advertising in the case of textual Web pages has been studied extensively by many researchers. However, with the increasing amount of multimedia data such as images, audio and video on the Web, the need for recommending advertisements for multimedia data is becoming a reality. In this paper, we address the novel problem of visual contextual advertising, which is to advertise directly when users are viewing images that do not have any surrounding text. A key challenge of visual contextual advertising is that images and advertisements are usually represented in image space and word space respectively, which are inherently quite different from each other. As a result, existing methods for Web page advertising are inapplicable since they represent both Web pages and advertisements in the same word space. In order to solve the problem, we propose to exploit the social Web to link these two feature spaces together. In particular, we present a unified generative model to integrate advertisements, words and images. Specifically, our solution combines two parts in a principled approach: First, we transform images from an image feature space to a word space utilizing the knowledge from images with annotations from the social Web. Then, a language-model-based approach is applied to estimate the relevance between transformed images and advertisements. Moreover, in this model, the probability of recommending an advertisement can be inferred efficiently given an image, which enables potential applications to online advertising.