The News that Matters to You: Design and Deployment of a Personalized News Service

AAAI Conferences

With the growth of online information, many people find it difficult to locate and read the information most relevant to their interests. From 2008 to 2010, we built an experimental personalized news system in which readers can subscribe to organized channels of information that are curated by experts. AI technology was employed to radically reduce the workload of curators and to present information to readers efficiently. The system has gone through three implementation cycles and processed over 16 million news stories from about 12,000 RSS feeds on over 8,000 topics organized by 160 curators for over 600 registered readers. This paper describes the approach, engineering and AI technology of the system.


NewsFinder: Automating an Artificial Intelligence News Service

AAAI Conferences

NewsFinder automates the steps involved in finding, selecting and publishing news stories that meet subjective judgments of relevance and interest to the Artificial Intelligence community. NewsFinder combines a broad search with AI-specific filters and incorporates a learning program whose judgment of the interestingness of stories can be trained by feedback from readers. Since August 2010, the program has been used to operate the AI in the News service that is part of the AAAI AITopics site.


Science Fiction as an Introduction to AI Research

AAAI Conferences

The undergraduate computer science curriculum is generally focused on skills and tools; most students are not exposed to much research in the field, and do not learn how to navigate the research literature. We describe how science fiction reviews were used as a gateway to research reviews. Students learn a little about current or recent research on a topic that stirs their imagination, and learn how to search for, read critically, and compare technical papers on a topic related to their chosen science fiction book, movie, or TV show.


Active Dual Collaborative Filtering with Both Item and Attribute Feedback

AAAI Conferences

The new user problem (aka user cold start) is very common in online recommender systems. Active collaborative filtering (active CF) tries to solve this problem by intelligently soliciting user feedback in order to build an initial user profile with minimal costs. Existing methods only query the user for feedback on items, while users can have preferences over items as well as certain item attributes. In this paper, we extend active CF via user feedback on both items and attributes. For example, when making movie recommendations, the system can ask users for not only their favorite movies, but also attributes such as genres, actors, etc. We design a unified active CF framework for incorporating both item and attribute feedback based on the random walk model. We test the active CF algorithm on real-world movie recommendation data sets to demonstrate that appropriately querying for both item and feature feedback can significantly reduce the overall user effort measured in terms of number of queries. We show that we can achieve much better recommendation quality as compared to traditional active CF methods that support only item feedback.


Tracking User-Preference Varying Speed in Collaborative Filtering

AAAI Conferences

In real-world recommender systems, some users are easily influenced by new products, whereas others are unwilling to change their minds, so the speed at which preferences vary differs from user to user. Based on this observation, we propose a dynamic nonlinear matrix factorization model for collaborative filtering that aims to improve rating prediction performance and to track the preference varying speeds of different users. We assume that user preference changes smoothly over time and that preference varying speeds differ across users. These two assumptions are incorporated into the proposed model as prior knowledge on user feature vectors, which can be learned efficiently by MAP estimation. The experimental results show that our method not only achieves state-of-the-art performance in the rating prediction task, but also provides an effective way to track user-preference varying speed.


Relation Adaptation: Learning to Extract Novel Relations with Minimum Supervision

AAAI Conferences

Extracting the relations that exist between two entities is an important step in numerous Web-related tasks such as information extraction. A supervised relation extraction system that is trained to extract a particular relation type might not accurately extract a new type of a relation for which it has not been trained. However, it is costly to create training data manually for every new relation type that one might want to extract. We propose a method to adapt an existing relation extraction system to extract new relation types with minimum supervision. Our proposed method comprises two stages: learning a lower-dimensional projection between different relations, and learning a relational classifier for the target relation type with instance sampling. We evaluate the proposed method using a dataset that contains 2000 instances for 20 different relation types. Our experimental results show that the proposed method achieves a statistically significant macro-average F-score of 62.77. Moreover, the proposed method outperforms numerous baselines and a previously proposed weakly-supervised relation extraction method.
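For readers unfamiliar with the metric quoted above, the following is a minimal sketch of one common definition of a macro-averaged F-score: compute F1 separately for each relation type, then average with equal weight per type. The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper, and the paper's exact definition may differ. The sketch returns a fraction in [0, 1], whereas the abstract appears to report the score as a percentage.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Macro-averaged F1 over all labels seen in gold or predicted.

    Assumes both lists are non-empty and of equal length.
    """
    labels = sorted(set(gold) | set(predicted))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        precision = tp[label] / n_pred if n_pred else 0.0
        recall = tp[label] / n_gold if n_gold else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    # Every relation type counts equally, regardless of its frequency.
    return sum(scores) / len(scores)


# Hypothetical toy data: two relation types, four instances.
gold = ["born_in", "born_in", "works_for", "works_for"]
pred = ["born_in", "works_for", "works_for", "works_for"]
score = macro_f1(gold, pred)
```

Because each type is weighted equally, a rare relation type that is extracted poorly lowers the macro average as much as a frequent one would.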


Connecting the Dots Between News Articles

AAAI Conferences

The process of extracting useful knowledge from large datasets has become one of the most pressing problems in today’s society. The problem spans entire sectors, from scientists to intelligence analysts and web users, all of whom are constantly struggling to keep up with the larger and larger amounts of content published every day. With this much data, it is often easy to miss the big picture. In this paper, we investigate methods for automatically connecting the dots – providing a structured, easy way to navigate within a new topic and discover hidden connections. We focus on the news domain: given two news articles, our system automatically finds a coherent chain linking them together. For example, it can recover the chain of events leading from the decline of home prices (2007) to the health-care debate (2009). We formalize the characteristics of a good chain and provide efficient algorithms to connect two fixed endpoints. We incorporate user feedback into our framework, allowing the stories to be refined and personalized. Finally, we evaluate our algorithm over real news data. Our user studies demonstrate the algorithm's effectiveness in helping users understand the news.


Evaluation of Group Profiling Strategies

AAAI Conferences

Most of the existing personalization systems such as content recommenders or targeted ads focus on individual users and ignore the social situation in which the services are consumed. However, many human activities are social and involve several individuals whose tastes and expectations must be taken into account by the system. When a group profile is not available, different profile aggregation strategies can be applied to recommend adequate items to a group of users based on their individual profiles. We consider an approach intended to determine the factors that influence the choice of an aggregation strategy. We present evaluations made on a large-scale dataset of TV viewings, where real group interests are compared to the predictions obtained by combining individual user profiles according to different strategies.
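To make the idea of profile aggregation concrete, here is a minimal sketch of three strategies that are commonly discussed in the group recommendation literature: average, least misery, and most pleasure. The abstract does not say which strategies the paper evaluates, so the strategy names, the function names, the toy profiles, and the choice to score missing items as 0.0 are all assumptions made for illustration.

```python
def aggregate_profiles(profiles, strategy="average"):
    """Combine individual item -> score profiles into one group profile.

    Assumption: an item missing from a user's profile counts as 0.0.
    """
    combine = {
        "average": lambda scores: sum(scores) / len(scores),
        "least_misery": min,    # group score = least satisfied member
        "most_pleasure": max,   # group score = most satisfied member
    }[strategy]
    items = sorted({item for profile in profiles for item in profile})
    return {item: combine([p.get(item, 0.0) for p in profiles]) for item in items}


def recommend(profiles, strategy="average", k=1):
    """Return the k items with the highest group score (ties broken by name)."""
    group = aggregate_profiles(profiles, strategy)
    return sorted(group, key=lambda item: (-group[item], item))[:k]


# Hypothetical TV-genre profiles for a two-person household.
alice = {"news": 0.9, "sports": 0.1, "drama": 0.6}
bob = {"news": 0.2, "sports": 0.8, "drama": 0.6}
household = [alice, bob]

top_average = recommend(household, "average")
top_least_misery = recommend(household, "least_misery")
top_most_pleasure = recommend(household, "most_pleasure")
```

On this toy data the average and least-misery strategies both pick the compromise genre, while most pleasure picks the genre one member likes best, which shows how the choice of strategy can change what a group is offered.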


Recommender Systems: Missing Data and Statistical Model Estimation

AAAI Conferences

The goal of rating-based recommender systems is to make personalized predictions and recommendations for individual users by leveraging the preferences of a community of users with respect to a collection of items like songs or movies. Recommender systems are often based on intricate statistical models that are estimated from data sets containing a very high proportion of missing ratings. This work describes evidence of a basic incompatibility between the properties of recommender system data sets and the assumptions required for valid estimation and evaluation of statistical models in the presence of missing data. We discuss the ...

The personalization aspect of recommender systems makes them well suited to applications in electronic commerce and entertainment, while the fact that they do not rely on text-based descriptions of items makes them well suited to content like movies and music. In this paper, we focus on a key problem in rating-based collaborative filtering: the possibility of a basic incompatibility between the properties of recommender system data sets and the assumptions required for valid estimation and evaluation of statistical models in the presence of missing data. We describe properties of recommender system data sets and relate them to the statistical theory of model estimation in the presence of nonrandom missing data. We describe an extended modelling framework and a modified set of evaluation protocols for dealing with nonrandom missing data.
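The following toy sketch illustrates why nonrandom missing data can matter for estimation. It is not the paper's framework. It assumes a hypothetical selection rule in which users only rate items they like, and compares the resulting observed mean with the mean under a selection rule that ignores the rating value. All names and numbers are invented for illustration.

```python
from statistics import mean


def observed_ratings(true_ratings, keep):
    """Return the ratings that a (hypothetical) selection rule `keep` reveals."""
    return [r for r in true_ratings if keep(r)]


# Hypothetical complete ratings one user would give to seven items.
true_ratings = [1, 2, 2, 3, 4, 5, 5]

# Nonrandom mechanism (assumption): the user only rates items scored 4 or more.
liked_only = observed_ratings(true_ratings, lambda r: r >= 4)

# Mechanism unrelated to the rating value: every other item is observed.
every_other = true_ratings[::2]

true_mean = mean(true_ratings)      # about 3.14
biased_mean = mean(liked_only)      # about 4.67, well above the true mean
unbiased_mean = mean(every_other)   # 3, close to the true mean
```

When the chance of observing a rating depends on the rating itself, a statistic computed from the observed ratings alone can be far from its value on the complete data, which is the kind of mismatch the abstract refers to.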


A Wikipedia Based Semantic Graph Model for Topic Tracking in Blogosphere

AAAI Conferences

There are two key issues for information diffusion in the blogosphere: (1) blog posts are usually short, noisy and contain multiple themes, and (2) information diffusion through the blogosphere is primarily driven by the “word-of-mouth” effect, which makes topics evolve very fast. This paper presents a novel topic tracking approach to deal with these issues by modeling a topic as a semantic graph in which the semantic relatedness between terms is learned from Wikipedia. For a given topic/post, the named entities, Wikipedia concepts, and the semantic relatedness are extracted to generate the graph model. Noise is filtered out through a graph clustering algorithm. To handle topic evolution, the topic model is enriched by using Wikipedia as background knowledge. Furthermore, graph edit distance is used to measure the similarity between a topic and its posts. The proposed method is tested using real-world blog data. Experimental results show the advantage of the proposed method in tracking topics in short, noisy text.
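As a rough illustration of comparing a topic graph with a post graph, the sketch below computes a simplified graph edit distance. It assumes that every node is identified by a unique term label, that edges are undirected and unweighted, and that the only edit operations are unit-cost insertions and deletions of nodes and edges. Under those assumptions the distance reduces to the size of the symmetric difference of the node sets plus that of the edge sets. The paper's actual cost model, which likely uses the learned relatedness weights, is not given in the abstract, and the function name and example graphs are hypothetical.

```python
def edit_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Unit-cost edit distance between two graphs with uniquely labeled nodes.

    Counts the node and edge insertions/deletions needed to turn graph A
    into graph B. Edges are treated as undirected pairs of labels.
    """
    def undirected(edges):
        return {frozenset(edge) for edge in edges}

    node_edits = len(set(nodes_a) ^ set(nodes_b))
    edge_edits = len(undirected(edges_a) ^ undirected(edges_b))
    return node_edits + edge_edits


# Hypothetical topic graph and post graph over term labels.
topic_nodes = {"election", "senate", "vote"}
topic_edges = {("election", "vote"), ("senate", "vote")}
post_nodes = {"election", "vote", "poll"}
post_edges = {("vote", "election"), ("poll", "election")}

distance = edit_distance(topic_nodes, topic_edges, post_nodes, post_edges)
```

A smaller distance means the post's graph is closer to the topic's graph. A tracker built on this idea would need a threshold or a normalization to decide whether a post belongs to a topic, and that choice is left open here.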