 Personal Assistant Systems


Statistical ranking and combinatorial Hodge theory

arXiv.org Machine Learning

We propose a number of techniques for obtaining a global ranking from data that may be incomplete and imbalanced -- characteristics almost universal to modern datasets coming from e-commerce and internet applications. We are primarily interested in score or rating-based cardinal data. From raw ranking data, we construct pairwise rankings, represented as edge flows on an appropriate graph. Our statistical ranking method uses the graph Helmholtzian, the graph theoretic analogue of the Helmholtz operator or vector Laplacian, in much the same way the graph Laplacian is an analogue of the Laplace operator or scalar Laplacian. We study the graph Helmholtzian using combinatorial Hodge theory: we show that every edge flow representing pairwise ranking can be resolved into two orthogonal components, a gradient flow that represents the L2-optimal global ranking and a divergence-free flow (cyclic) that measures the validity of the global ranking obtained -- if this is large, then the data does not have a meaningful global ranking. This divergence-free flow can be further decomposed orthogonally into a curl flow (locally cyclic) and a harmonic flow (locally acyclic but globally cyclic); these provide information on whether inconsistency arises locally or globally. An obvious advantage over the NP-hard Kemeny optimization is that discrete Hodge decomposition may be computed via a linear least squares regression. We also investigate the L1-projection of edge flows, showing that this is dual to correlation maximization over bounded divergence-free flows, and the L1-approximate sparse cyclic ranking, showing that this is dual to correlation maximization over bounded curl-free flows. We discuss relations with Kemeny optimization, Borda count, and Kendall-Smith consistency index from social choice theory and statistics.
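
The least-squares step mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' code: the function name `hodge_rank`, the unit edge weights, and the toy data are assumptions made for illustration. It fits global scores whose differences best match the pairwise edge flows, and returns the leftover flow as the cyclic (divergence-free) part, without splitting that part further into curl and harmonic components.

```python
def hodge_rank(n, flows):
    """Least-squares global scores from pairwise edge flows (illustrative).

    flows maps (i, j) -> y_ij, read as "j beats i by y_ij".
    Minimizes the sum over edges of (s[j] - s[i] - y_ij)**2 with unit
    weights (an assumption). The comparison graph must be connected.
    """
    # Normal equations L s = b, where L is the graph Laplacian.
    L = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for (i, j), y in flows.items():
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        b[j] += y
        b[i] -= y

    # L is singular (scores are defined up to a constant), so pin s[0] = 0
    # and solve the reduced system by Gaussian elimination with pivoting.
    m = n - 1
    A = [L[r + 1][1:] + [b[r + 1]] for r in range(m)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, m):
            f = A[r][c] / A[c][c]
            for k in range(c, m + 1):
                A[r][k] -= f * A[c][k]
    s = [0.0] * n
    for r in range(m - 1, -1, -1):
        acc = A[r][m] - sum(A[r][k] * s[k + 1] for k in range(r + 1, m))
        s[r + 1] = acc / A[r][r]

    # Center the scores, then keep what the gradient flow cannot explain.
    mean = sum(s) / n
    s = [v - mean for v in s]
    residual = {(i, j): y - (s[j] - s[i]) for (i, j), y in flows.items()}
    return s, residual


# Toy data (hypothetical): one consistent set of comparisons, one pure cycle.
consistent = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0}
cyclic = {(0, 1): 1.0, (1, 2): 1.0, (2, 0): 1.0}
scores_ok, resid_ok = hodge_rank(3, consistent)
scores_cyc, resid_cyc = hodge_rank(3, cyclic)
```

On the consistent data the residual is zero on every edge, so the scores fully explain the comparisons. On the pure cycle all scores come out equal and the entire flow ends up in the residual, which is the situation the abstract describes as having no meaningful global ranking.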


Trading Robustness for Privacy in Decentralized Recommender Systems

AAAI Conferences

Collaborative filtering (CF) recommender systems are very popular and successful in commercial application fields. One end-user concern is the privacy of the personal data required by such systems in order to make personalized recommendations. Recently, peer-to-peer decentralized architectures have been proposed to address this privacy issue. System managers, on the other hand, must be concerned about system robustness. In particular, it has been shown that recommender systems are vulnerable to profile injection, although model-based CF algorithms show greater stability against the malicious attacks studied in the literature so far. In this paper we generalize the generic model for decentralized recommendation and discuss the trade-off between robustness and privacy. In this context, we argue that exposing knowledge of the model parameters allows new, highly effective, model-based attack strategies to be considered. We conclude that the security concerns of privacy and robustness stand in opposition to each other and are difficult to satisfy simultaneously.


Evaluating User-Adaptive Systems: Lessons from Experiences with a Personalized Meeting Scheduling Assistant

AAAI Conferences

We discuss experiences from evaluating the learning performance of a user-adaptive personal assistant agent.  We discuss the challenge of designing adequate evaluation and the tension of collecting adequate data without a fully functional, deployed system.  Reflections on negative and positive experiences point to the challenges of evaluating user-adaptive AI systems.  Lessons learned concern early consideration of evaluation and deployment, characteristics of AI technology and domains that make controlled evaluations appropriate or not, holistic experimental design, implications of "in the wild" evaluation, and the effect of AI-enabled functionality and its impact upon existing tools and work practices.


SmartChoice: An Online Recommender System to Support Low-Income Families in Public School Choice

AI Magazine

Public school choice at the primary and secondary levels is a key element of the U.S. No Child Left Behind Act of 2001 (NCLB). If a school does not meet assessment goals for two consecutive years, by law the district must offer students the opportunity to transfer to a school that is meeting its goals. Making a choice with such potential impact on a child's future is clearly monumental, yet astonishingly few parents take advantage of the opportunity. Our research has shown that a significant part of the problem arises from issues in information access and information overload, particularly for low socioeconomic status families. Thus we have developed an online, content-based recommender system, called SmartChoice. It provides parents with school recommendations for individual students based on parents' preferences and students' needs, interests, abilities, and talents. The first version of the online application was deployed and live for focus group participants who used it for the January and March/April 2008 Charlotte-Mecklenburg school choice periods. This article describes the SmartChoice Program and the results of our initial and followup studies with participants.


Spatial Processes for Recommender Systems

AAAI Conferences

Spatial processes are typically used to analyse and predict geographic data. This paper adapts such models to predicting a user's interests (i.e., implicit item ratings) within a recommender system in the museum domain. We present the theoretical framework for a model based on Gaussian spatial processes, and discuss efficient algorithms for parameter estimation. Our model was evaluated with a real-world dataset collected by tracking visitors in a museum, attaining a higher predictive accuracy than state-of-the-art collaborative filters.


Incorporating User Behaviors in New Word Detection

AAAI Conferences

In this paper, we propose a novel method to detect new words in domain-specific fields based on user behaviors. First, we select the most representative words from a domain-specific lexicon. Then, combining these with user behaviors, we try to discover the potential experts in the field who use those terminologies frequently. Finally, we identify new words from the behaviors of those experts. Words used much more frequently in this community than in others are most probably new words. In brief, our method follows a collaborative filtering approach: first from words to find professional experts, then from experts to discover new words, which differs from traditional new word detection methods. Our method achieves up to 0.86 in accuracy on a computer science related data set. Moreover, the proposed method can be easily extended to the related words retrieval task. We compare our method with Google Sets and Bayesian Sets. Experiments show that our method and Bayesian Sets give better results than Google Sets.


Can Movies and Books Collaborate? Cross-Domain Collaborative Filtering for Sparsity Reduction

AAAI Conferences

The sparsity problem in collaborative filtering (CF) is a major bottleneck for most CF methods. In this paper, we consider a novel approach for alleviating the sparsity problem in CF by transferring user-item rating patterns from a dense auxiliary rating matrix in other domains (e.g., a popular movie rating website) to a sparse rating matrix in a target domain (e.g., a new book rating website). We do not require that the users and items in the two domains be identical or even overlap. Based on the limited ratings in the target matrix, we establish a bridge between the two rating matrices at a cluster-level of user-item rating patterns in order to transfer more useful knowledge from the auxiliary task domain. We first compress the ratings in the auxiliary rating matrix into an informative and yet compact cluster-level rating pattern representation referred to as a codebook. Then, we propose an efficient algorithm for reconstructing the target rating matrix by expanding the codebook. We perform extensive empirical tests to show that our method is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary tasks, as compared to many state-of-the-art CF methods.


Improving Search In Social Networks by Agent Based Mining

AAAI Conferences

Users share and access large volumes of information on social networking sites like Facebook, Flickr, del.icio.us, etc. Whereas a few of these sites have generic, impersonal searching mechanisms, we have developed an agent-based framework that mines the social network of a user to improve search results. Our Social Network-based Item Search (SNIS) system uses agents that utilize the connections of a user in the social network to facilitate the search for items of interest. Our approach generates targeted search results that can improve the precision of the result returned from a user's query. We have implemented the SNIS agent-based framework in Flickr, a photo-sharing social network, for searching for photos by using tag lists as search queries. We discuss the architecture of SNIS, motivate the searching scheme used, and demonstrate the effectiveness of the SNIS approach by presenting results. We also show how SNIS can be utilized for expertise location.


Sketching Techniques for Collaborative Filtering

AAAI Conferences

Recommender systems attempt to highlight items that a target user is likely to find interesting. A common technique is to use collaborative filtering (CF), where multiple users share information so as to provide each with effective recommendations. A key aspect of CF systems is finding users whose tastes accurately reflect the tastes of some target user. Typically, the system looks for other agents who have had experience with many of the items the target user has examined, and whose classification of these items has a strong correlation with the classifications of the target user. Since the universe of items may be enormous and huge data sets are involved, sophisticated methods must be used to quickly locate appropriate other agents. We present a method for quickly determining the proportional intersection between the items that each of two users has examined, by sending and maintaining extremely concise “sketches” of the list of items. These sketches enable the approximation of the proportional intersection within a distance of ε, with a high probability of 1-δ. Our sketching techniques are based on random minwise independent hash functions, and use very little space and time, so they are well-suited for use in large-scale collaborative filtering systems.
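
A minimal sketch of the general min-hash idea behind such sketches follows. It is not the paper's construction: it assumes the "proportional intersection" is the ratio of shared items to all items seen by either user, and it stands in for truly minwise independent hash functions with seeded SHA-1 digests. The names `sketch`, `estimate_overlap`, and the parameter `k` are hypothetical.

```python
import hashlib


def _hash(seed, item):
    """One member of a seeded hash family (a stand-in for a minwise
    independent family, which is an assumption of this sketch)."""
    digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(items, k=256):
    """Keep only the minimum hash value under each of k hash functions.

    items must be non-empty. The sketch has k integers no matter how
    many items the user has examined.
    """
    return [min(_hash(seed, it) for it in items) for seed in range(k)]


def estimate_overlap(sketch_a, sketch_b):
    """Fraction of positions where the two sketches agree.

    For an ideal hash family, each position agrees with probability
    |A & B| / |A | B|, so the fraction estimates that ratio.
    """
    matches = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return matches / len(sketch_a)


# Hypothetical item IDs: two users sharing 50 of 150 distinct items.
user_a = set(range(0, 100))
user_b = set(range(50, 150))
est = estimate_overlap(sketch(user_a), sketch(user_b))
same = estimate_overlap(sketch(user_a), sketch(user_a))
```

Only the two fixed-size sketches need to be exchanged, not the item lists themselves. Under the idealized assumption that the k positions behave as independent trials, a Hoeffding bound suggests that k of at least ln(2/δ) / (2ε²) keeps the estimate within ε of the true ratio with probability at least 1-δ; the paper's own bounds may differ.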