Lehigh University
Emerging Innovative Applications of Artificial Intelligence 2012
Fromherz, Markus (Xerox) | Muñoz-Avila, Hector (Lehigh University)
Plan-Based Character Diversity
Coman, Alexandra (Lehigh University) | Muñoz-Avila, Hector (Lehigh University)
Non-player character diversity enriches game environments, increasing their replay value. We propose a method for obtaining character behavior diversity based on the diversity of plans enacted by characters, and demonstrate this method in a scenario in which characters have multiple choices. Using case-based planning techniques, we reuse plans to produce varied character behavior, which simulates different personality traits.
CLASSQ-L: A Q-Learning Algorithm for Adversarial Real-Time Strategy Games
Jaidee, Ulit (Lehigh University) | Muñoz-Avila, Hector (Lehigh University)
We present CLASSQ-L (for: class Q-learning), an application of the Q-learning reinforcement learning algorithm to play complete Wargus games. Wargus is a real-time strategy game where players control armies consisting of units of different classes (e.g., archers, knights). CLASSQ-L uses a single Q-table for each class of unit, so that each unit is controlled by, and updates, its class's Q-table. This enables rapid learning, as in Wargus there are many units of the same class. We present initial results of CLASSQ-L against a variety of opponents.
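The per-class Q-table idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `ClassQTables`, the state and action encodings, the epsilon-greedy policy, and the hyperparameter defaults are all hypothetical. Only the standard one-step Q-learning update and the sharing of one table per unit class come from the abstract.

```python
import random
from collections import defaultdict


class ClassQTables:
    """One Q-table per unit class, shared by every unit of that class.

    Hypothetical sketch: states and actions are arbitrary hashable values.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # tables[unit_class][(state, action)] -> estimated action value
        self.tables = defaultdict(lambda: defaultdict(float))

    def choose(self, unit_class, state):
        """Epsilon-greedy action selection from the class's shared table."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        q = self.tables[unit_class]
        return max(self.actions, key=lambda a: q[(state, a)])

    def update(self, unit_class, state, action, reward, next_state):
        """One-step Q-learning update applied to the class's shared table."""
        q = self.tables[unit_class]
        best_next = max(q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        q[(state, action)] += self.alpha * (target - q[(state, action)])


# Two archers report experience into the same "archer" table.
learner = ClassQTables(["attack", "retreat"], alpha=0.5, gamma=0.0, epsilon=0.0)
learner.update("archer", "enemy_near", "attack", 1.0, "enemy_dead")  # archer 1
learner.update("archer", "enemy_near", "attack", 1.0, "enemy_dead")  # archer 2
```

Because every unit of a class writes to the same table, each game step yields as many updates per class as there are units of that class, which is the source of the rapid learning the abstract mentions.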
Solution Diversity in Planning
Coman, Alexandra (Lehigh University)
Building Contextual Anchor Text Representation using Graph Regularization
Dai, Na (Lehigh University)
Anchor texts are useful complementary descriptions of target pages and are widely applied to improve search relevance. The benefits come from the additional information introduced into the document representation and from intelligent ways of estimating its relative importance. Previous work on anchor importance estimation treated anchor text independently, without considering its context. As a result, the lack of constraints from such context fails to guarantee a stable anchor text representation. We propose an anchor graph regularization approach that incorporates constraints from such context into the anchor text weighting process, casting the task as a convex quadratic optimization problem. The constraints are drawn from estimates of anchor-anchor, anchor-page, and page-page similarity. Our approach works with any estimator, operating as a post-processing step that refines the estimated anchor weights, which makes it a plug-and-play component in search infrastructure. Comparative experiments on standard data sets (TREC 2009 and 2010) demonstrate the efficacy of our approach.
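To make the idea of refining estimated weights through a convex quadratic problem concrete, here is a generic graph-regularization sketch. The abstract does not give the paper's objective, so the following is an assumption: it minimizes sum_i (w_i - w0_i)^2 + lam * (1/2) * sum_ij S_ij (w_i - w_j)^2 for a symmetric, non-negative similarity matrix S, which pulls the weights of similar anchors toward each other while staying close to the initial estimates w0. The function name `smooth_anchor_weights` and its parameters are hypothetical. Setting the gradient to zero gives the linear system (I + lam * L) w = w0, with L the graph Laplacian of S, which is solved here by Jacobi iteration.

```python
def smooth_anchor_weights(initial, similarity, lam=1.0, iterations=200):
    """Refine anchor weights with a graph-regularized quadratic objective.

    initial    -- list of weights from any upstream estimator (w0)
    similarity -- symmetric, non-negative matrix S (list of lists)
    lam        -- strength of the smoothness constraint
    """
    n = len(initial)
    # Degree of each node, ignoring self-similarity.
    degree = [sum(similarity[i][j] for j in range(n) if j != i) for i in range(n)]
    weights = list(initial)
    for _ in range(iterations):
        # Jacobi step for (I + lam * L) w = w0; the system is strictly
        # diagonally dominant, so the iteration converges.
        weights = [
            (initial[i]
             + lam * sum(similarity[i][j] * weights[j] for j in range(n) if j != i))
            / (1.0 + lam * degree[i])
            for i in range(n)
        ]
    return weights


# Anchors 0 and 1 are similar; anchor 2 has no neighbours and keeps its weight.
S = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
refined = smooth_anchor_weights([1.0, 0.0, 0.5], S, lam=1.0)
```

Because the function only consumes initial weights and a similarity matrix, it can sit behind any estimator, which mirrors the post-processing role described in the abstract.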
Normalizing Microtext
Xue, Zhenzhen (Lehigh University) | Yin, Dawei (Lehigh University) | Davison, Brian D. (Lehigh University)
The use of computer-mediated communication has resulted in a new form of written text, microtext, which is very different from well-written text. Tweets and SMS messages, which have limited length and may contain misspellings, slang, or abbreviations, are two typical examples of microtext. Microtext poses new challenges to standard natural language processing tools, which are usually designed for well-written text. The objective of this work is to normalize microtext in order to produce text suitable for further processing. We propose a normalization approach based on the source channel model, which incorporates four factors: an orthographic factor, a phonetic factor, a contextual factor, and acronym expansion. Experiments show that our approach can normalize Twitter messages reasonably well, and it outperforms existing algorithms on a public SMS data set.
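The source channel decision rule behind this kind of normalizer can be sketched as follows: for an observed token t, choose the standard word w that maximizes P(w) * P(t | w). Everything specific below is an assumption for illustration: the function name `normalize_token`, the toy probability tables, and the choice to combine per-factor channel probabilities as a weighted mixture. The abstract does not say how the four factors are combined.

```python
def normalize_token(token, language_model, channels, mix):
    """Return the candidate w maximizing P(w) * P(token | w).

    language_model -- dict word -> prior P(w)
    channels       -- dict factor name -> dict (token, word) -> P(token | w)
    mix            -- dict factor name -> mixture weight (assumed to sum to 1)
    The token is returned unchanged when no candidate has a positive score.
    """
    best_word, best_score = token, 0.0
    for word, prior in language_model.items():
        # Channel probability as a weighted mixture over factors (an assumption).
        likelihood = sum(
            mix[name] * table.get((token, word), 0.0)
            for name, table in channels.items()
        )
        score = prior * likelihood
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Toy, hand-made probabilities; real tables would be estimated from data.
language_model = {"tomorrow": 0.02, "tomato": 0.01}
channels = {
    "orthographic": {("2moro", "tomorrow"): 0.3, ("2moro", "tomato"): 0.1},
    "phonetic": {("2moro", "tomorrow"): 0.6},
}
mix = {"orthographic": 0.5, "phonetic": 0.5}
normalized = normalize_token("2moro", language_model, channels, mix)
```

A contextual factor would typically enter through the prior, for example by replacing the unigram `language_model` with probabilities conditioned on neighbouring words.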
Temporal Dynamics of User Interests in Tagging Systems
Yin, Dawei (Lehigh University) | Hong, Liangjie (Lehigh University) | Xue, Zhenzhen (Lehigh University) | Davison, Brian D. (Lehigh University)
Collaborative tagging systems are now deployed extensively to help users share and organize resources. Tag prediction and recommendation systems generally model user behavior as research has shown that accuracy can be significantly improved by modeling users’ preferences. However, these preferences are usually treated as constant over time, neglecting the temporal factor within users’ interests. On the other hand, little is known about how this factor may influence prediction in social bookmarking systems. In this paper, we investigate the temporal dynamics of user interests in tagging systems and propose a user-tag-specific temporal interests model for tracking users’ interests over time. Additionally, we analyze the phenomenon of topic switches in social bookmarking systems, showing that a temporal interests model can benefit from the integration of topic switch detection and that temporal characteristics of social tagging systems are different from traditional concept drift problems. We conduct experiments on three public datasets, demonstrating the importance of personalization and user-tag specialization in tagging systems. Experimental results show that our method can outperform state-of-the-art tag prediction algorithms. We also incorporate our model within existing content-based methods yielding significant improvements in performance.
Generating Diverse Plans Using Quantitative and Qualitative Plan Distance Metrics
Coman, Alexandra (Lehigh University) | Muñoz-Avila, Hector (Lehigh University)
Diversity-aware planning consists of generating multiple plans which, while solving the same problem, are dissimilar from one another. Quantitative plan diversity is domain-independent and does not require extensive knowledge-engineering effort, but can fail to reflect plan differences that are relevant to users. Qualitative plan diversity is based on domain-specific characteristics, thus being of greater practical value, but may require substantial knowledge engineering. We demonstrate a domain-independent diverse plan generation method that is based on customizable plan distance metrics and amenable to both quantitative and qualitative diversity. Qualitative plan diversity is obtained with minimal knowledge-engineering effort, using distance metrics which incorporate domain-specific content.
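The role of a customizable plan distance metric can be illustrated with a small sketch. It is not the authors' generation method: instead of guiding a planner, it greedily selects diverse plans from an already available candidate pool, and the function names, the action strings, and the "melee versus ranged" notion of plan style are all hypothetical. It shows only how swapping the distance function switches between quantitative (domain-independent) and qualitative (domain-specific) diversity.

```python
def quantitative_distance(plan_a, plan_b):
    """Domain-independent: Jaccard distance between the plans' action sets."""
    a, b = set(plan_a), set(plan_b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def qualitative_distance(plan_a, plan_b):
    """Domain-specific (hypothetical): plans differ if their combat style differs."""
    def style(plan):
        return "ranged" if any("ranged" in action for action in plan) else "melee"
    return 0.0 if style(plan_a) == style(plan_b) else 1.0


def select_diverse(plans, k, distance):
    """Greedily pick up to k plans, each maximizing its minimum distance
    to the plans already chosen. The first plan seeds the selection."""
    if not plans or k <= 0:
        return []
    chosen = [plans[0]]
    remaining = list(plans[1:])
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda p: min(distance(p, c) for c in chosen))
        chosen.append(best)
        remaining.remove(best)
    return chosen


plans = [
    ["move", "attack_melee"],
    ["move", "attack_melee", "rest"],
    ["sneak", "attack_ranged"],
]
diverse = select_diverse(plans, 2, quantitative_distance)
```

Passing `qualitative_distance` instead changes what counts as "different" without touching the selection procedure, which is the separation between method and metric that the abstract emphasizes.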
A Bootstrapping Approach to Identifying Relevant Tweets for Social TV
Dan, Ovidiu (Lehigh University) | Feng, Junlan (AT&T Labs Research) | Davison, Brian D. (Lehigh University)
Manufacturers of TV sets have recently started adding social media features to their products. Some of these products display microblogging messages relevant to the TV show which the user is currently watching. However, such systems suffer from low precision and recall when they use the title of the show to search for relevant messages. Titles of some popular shows such as Lost or Survivor are highly ambiguous, resulting in messages unrelated to the show. Thus, there is a need to develop filtering algorithms that can achieve both high precision and recall. Filtering microblogging messages for Social TV poses several challenges, including lack of training data, lack of proper grammar and capitalization, lack of context due to text sparsity, etc. We describe a bootstrapping algorithm which uses a small manually labeled dataset, a large dataset of unlabeled messages, and some domain knowledge to derive a high precision classifier that can successfully filter microblogging messages which discuss television shows. The classifier is designed to generalize to TV shows which were not part of the training set. The algorithm achieves high precision on our two test datasets and successfully generalizes to unseen television shows. Furthermore, it compares favorably to a text classifier specifically trained on the television shows used for testing.
The Special Issue of AI Magazine on Structured Knowledge Transfer
Shapiro, Daniel G. (Institute for the Study of Learning and Expertise) | Muñoz-Avila, Hector (Lehigh University) | Stracuzzi, David (Sandia National Laboratories)
This issue summarizes the state of the art in structured knowledge transfer, which is an emerging approach to the general problem of knowledge acquisition and reuse. Its goal is to capture, in a general form, the internal structure of the objects, relations, strategies, and processes used to solve tasks drawn from a source domain, and exploit that knowledge to improve performance in a target domain.