 Personal Assistant Systems


Active Learning for Matching Problems

arXiv.org Artificial Intelligence

Effective learning of user preferences is critical to easing user burden in various types of matching problems. Equally important is active query selection to further reduce the amount of preference information users must provide. We address the problem of active learning of user preferences for matching problems, introducing a novel method for determining probabilistic matchings, and developing several new active learning strategies that are sensitive to the specific matching objective. Experiments with real-world data sets spanning diverse domains demonstrate that matching-sensitive active learning outperforms standard techniques.


Robustness and Accuracy Tradeoffs for Recommender Systems Under Attack

AAAI Conferences

Recommender systems assist users in the daunting task of sifting through large amounts of data in order to select relevant information or items. Common examples include recommendations for consumer products and services, such as songs, books, and articles. Unfortunately, such systems may be subject to attack by malicious users who want to manipulate the system’s recommendations to suit their needs: to promote their own (or demote a competitor’s) product or service, or to cause disruption in the recommender system. Attacks can cause the recommender system to become unreliable and untrustworthy, resulting in user dissatisfaction. Developers already face tradeoffs between system efficiency and accuracy, and designing for robustness adds another dimension to consider. In this paper, we show how the underlying implementation choices for item-based and user-based Collaborative Filtering recommender systems can affect their accuracy and robustness. We also show how accuracy and robustness can change over a system’s lifetime by analyzing a set of temporal snapshots of system usage. Results provide insight into some of the tradeoffs between robustness and accuracy that operators may need to consider in development and evaluation.


A Comparative Study of Collaborative Filtering Algorithms

arXiv.org Machine Learning

Collaborative filtering is a rapidly advancing research area. Every year several new techniques are proposed, and yet it is not clear which of the techniques work best and under what conditions. In this paper we conduct a study comparing several collaborative filtering techniques -- both classic and recent state-of-the-art -- in a variety of experimental contexts. Specifically, we report conclusions controlling for number of items, number of users, sparsity level, performance criteria, and computational complexity. Our conclusions identify which algorithms work well and under what conditions, and contribute both to the industrial deployment of collaborative filtering algorithms and to the research community.


Quantitative Concept Analysis

arXiv.org Artificial Intelligence

Formal Concept Analysis (FCA) begins from a context, given as a binary relation between some objects and some attributes, and derives a lattice of concepts, where each concept is given as a set of objects and a set of attributes, such that the first set consists of all objects that satisfy all attributes in the second, and vice versa. Many applications, though, provide contexts with quantitative information, telling not just whether an object satisfies an attribute, but also quantifying this satisfaction. Contexts in this form arise as rating matrices in recommender systems, as occurrence matrices in text analysis, as pixel intensity matrices in digital image processing, etc. Such applications have attracted a lot of attention, and several numeric extensions of FCA have been proposed. We propose the framework of proximity sets (proxets), which subsume partially ordered sets (posets) as well as metric spaces. One feature of this approach is that it extracts from quantified contexts quantified concepts, and thus allows full use of the available information. Another feature is that the categorical approach allows analyzing any universal properties that the classical FCA and the new versions may have, and thus provides structural guidance for aligning and combining the approaches.
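
The classical (binary) construction described at the start of this abstract can be illustrated with a minimal sketch. It covers only the standard FCA derivation, not the proxet framework the paper proposes. The toy context and the function names (`common_attributes`, `common_objects`, `formal_concepts`) are assumptions made for illustration, and the brute-force enumeration over object subsets is only practical for very small contexts.

```python
from itertools import combinations


def common_attributes(objects, context, all_attributes):
    """Return the attributes satisfied by every object in `objects`."""
    result = set(all_attributes)
    for obj in objects:
        result &= context[obj]
    return result


def common_objects(attributes, context):
    """Return the objects that satisfy every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset."""
    all_attributes = set().union(*context.values())
    objects = list(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical toy context: object -> set of attributes it satisfies.
context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}

concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each printed pair is closed in both directions: the extent is exactly the set of objects sharing the intent, and the intent is exactly the set of attributes shared by the extent.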


Objective Function Designing Led by User Preferences Acquisition

arXiv.org Artificial Intelligence

Many real-world problems can be defined as optimisation problems in which the aim is to maximise an objective function. The quality of the obtained solution is directly linked to the suitability of the objective function used. However, designing such a function, which has to translate the user's needs, is usually tedious. In this paper, we propose a method to help users design objective functions. Our approach, which is highly interactive, is based on human-machine dialogue and more particularly on the user's comparison of solutions to problem instances. We present an experiment in the domain of cartographic generalisation that shows promising results.


Leveraging Usage Data for Linked Data Movie Entity Summarization

arXiv.org Artificial Intelligence

Recent research in the field of Linked Data focuses on the problem of entity summarization. This field addresses the problem of ranking features according to their importance for the task of identifying a particular entity. Besides offering a more human-friendly presentation, these summarizations can play a central role in semantic search engines and semantic recommender systems. Current approaches attempt to summarize entities based on patterns that are inherent to the data under consideration. The approach proposed in this paper focuses on the movie domain. It utilizes usage data to support measuring the similarity between movie entities. Using this similarity, it is possible to determine the k-nearest neighbors of an entity. This leads to the idea that features that entities share with their nearest neighbors can be considered significant or important for these entities. Additionally, we introduce a downgrading factor (similar to TF-IDF) in order to offset the high number of commonly occurring features. We exemplify the approach based on a movie-ratings dataset that has been linked to Freebase entities.
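
The pipeline outlined in this abstract (usage-based similarity, k-nearest neighbors, shared-feature counting, and a TF-IDF-like downgrading factor) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: Jaccard overlap of rater sets stands in for the similarity measure, a logarithmic inverse-frequency term stands in for the downgrading factor, and the user IDs in the toy data are invented.

```python
import math


def jaccard(a, b):
    """Overlap of two user sets; an assumed stand-in for usage similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_neighbors(movie, users_by_movie, k):
    """Return the k movies whose rater sets are most similar to `movie`'s."""
    others = [m for m in users_by_movie if m != movie]
    others.sort(
        key=lambda m: jaccard(users_by_movie[movie], users_by_movie[m]),
        reverse=True,
    )
    return others[:k]


def rank_features(movie, users_by_movie, features_by_movie, k):
    """Rank a movie's features by neighbor sharing, downgraded by commonness."""
    neighbors = nearest_neighbors(movie, users_by_movie, k)
    n_movies = len(features_by_movie)
    scores = {}
    for feature in features_by_movie[movie]:
        shared = sum(1 for n in neighbors if feature in features_by_movie[n])
        doc_freq = sum(1 for f in features_by_movie.values() if feature in f)
        # IDF-style downgrading: a feature present on every movie scores 0.
        scores[feature] = shared * math.log(n_movies / doc_freq)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: the user IDs are invented for illustration.
users_by_movie = {
    "Alien": {1, 2, 3},
    "Aliens": {1, 2, 3, 4},
    "Blade Runner": {2, 3, 5},
    "Notting Hill": {6, 7},
}
features_by_movie = {
    "Alien": {"genre:sci-fi", "director:Ridley Scott", "language:English"},
    "Aliens": {"genre:sci-fi", "director:James Cameron", "language:English"},
    "Blade Runner": {"genre:sci-fi", "director:Ridley Scott", "language:English"},
    "Notting Hill": {"genre:romance", "language:English"},
}

ranking = rank_features("Alien", users_by_movie, features_by_movie, k=2)
print(ranking)
```

In this toy data, `language:English` is shared with both neighbors but appears on every movie, so the downgrading term pushes it to the bottom of the ranking.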


Web Resources Recommendation based on Dynamic Prediction of User Consumption on the Social Web

AAAI Conferences

The Web is a giant repository of resources (services and content), where discovery and recommendation systems are used to deliver the best-ranked list of relevant web resources that meet user requirements. Nowadays, these systems are based on the simulation and automation of the user's search criteria, considering the relation between consumption trends and the different kinds of relationships users have with their virtual and physical environment, based on information from the Social Web and mobile device sensors, among others. These systems are executed once an explicit query from the user has been received; however, some resources are useful in specific situations and have a high probability of being consumed, but are not recommended to users because no query was issued. In this regard, the question is: how can a successful web resource recommendation be made without a user query? To answer this question, this research proposal presents a novel approach to recommending web resources based on dynamic prediction of user consumption on the Social Web, which emulates user behavior, resource dynamism, and context opportunities in real time, catching the best situations in which to make an asynchronous (unexpected by the user) recommendation of a useful resource and boost web resource consumption.


Personalisation of Social Web Services in the Enterprise Using Spreading Activation for Multi-Source, Cross-Domain Recommendations

AAAI Conferences

Existing personalisation approaches, such as collaborative filtering or content-based recommendations, are highly dependent on the domain and/or the source of the data. Therefore, there is a need for more accurate means to capture and model the interests of the user across domains, and to interlink them in a semantically enhanced interest graph. We propose a new approach for multi-source, cross-genre recommendations that can exploit the heterogeneous nature of user profile data, which has been aggregated from multiple personalised web services, such as blogs, wikis and microblogs. Our approach is based on the Spreading Activation model, which exploits intrinsic links between entities across a number of data sources. The proposed method is highly customizable and applicable both to generic and specific recommendation scenarios and use cases. With the growing number of Social Web applications in the enterprise (blogs, wikis, microblogging, etc.), it becomes difficult for knowledge workers to avoid content overload and to quickly identify relevant people, communities and information. We demonstrate the application of our approach in an industrial use case that involves recommendation of social semantic data across multiple services in a distributed collaborative environment.
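
A generic spreading-activation pass over an interest graph might look like the sketch below. The abstract does not give the update rule, so the decay factor, firing threshold, step limit, and the toy graph with its node names and edge weights are all assumptions chosen for illustration.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.05, max_steps=3):
    """Propagate activation from seed nodes along weighted edges.

    graph: node -> {neighbor: edge weight in [0, 1]}
    seeds: node -> initial activation
    Returns the accumulated activation per node reached.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                # Ignore contributions too weak to matter (assumed cut-off).
                if passed >= threshold:
                    next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Hypothetical interest graph linking entities from several services.
graph = {
    "user": {"blog:semantic-web": 1.0, "wiki:ontologies": 0.8},
    "blog:semantic-web": {"topic:linked-data": 0.9},
    "wiki:ontologies": {"topic:linked-data": 0.5, "person:alice": 0.6},
    "topic:linked-data": {"microblog:rdf-tips": 0.7},
}

activation = spread_activation(graph, {"user": 1.0})
recommendations = sorted(
    (node for node in activation if node != "user"),
    key=activation.get,
    reverse=True,
)
print(recommendations)
```

Nodes reachable through several sources, such as `topic:linked-data` here, accumulate activation from each path, which is what lets links across data sources reinforce one another.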


Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms

arXiv.org Machine Learning

Contextual bandit algorithms have become popular for online recommendation systems such as Digg, Yahoo! Buzz, and news recommendation in general. Offline evaluation of the effectiveness of new algorithms in these applications is critical for protecting online user experiences but very challenging due to their "partial-label" nature. Common practice is to create a simulator which simulates the online environment for the problem at hand and then run an algorithm against this simulator. However, creating the simulator itself is often difficult, and modeling bias is usually unavoidable. In this paper, we introduce a replay methodology for contextual bandit algorithm evaluation. Unlike simulator-based approaches, our method is completely data-driven and very easy to adapt to different applications. More importantly, our method can provide provably unbiased evaluations. Our empirical results on a large-scale news article recommendation dataset collected from Yahoo! Front Page agree well with our theoretical results. Furthermore, comparisons between our offline replay and online bucket evaluation of several contextual bandit algorithms show the accuracy and effectiveness of our offline evaluation method.
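
The replay idea can be sketched in a few lines, under the assumption that the log was collected by choosing arms uniformly at random: an event is kept only when the policy under evaluation picks the same arm as the log, and the policy learns only from the kept events. The `EpsilonGreedy` policy is a simple context-free stand-in rather than one of the paper's algorithms, and the synthetic log with its click probabilities is invented for illustration.

```python
import random


class EpsilonGreedy:
    """Context-free stand-in policy (assumed for illustration)."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, context, arm, reward):
        self.counts[arm] += 1
        # Incremental mean of the rewards observed for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def replay_evaluate(policy, logged_events):
    """Estimate a policy's per-event reward from randomly logged data.

    logged_events: iterable of (context, logged_arm, reward) tuples.
    Events where the policy disagrees with the log are discarded.
    """
    total_reward = 0.0
    matched = 0
    for context, logged_arm, reward in logged_events:
        if policy.choose(context) != logged_arm:
            continue
        policy.update(context, logged_arm, reward)
        total_reward += reward
        matched += 1
    return (total_reward / matched if matched else 0.0), matched


# Synthetic log: arms drawn uniformly at random, invented click rates.
rng = random.Random(42)
click_prob = [0.02, 0.05, 0.10]
logged_events = []
for _ in range(5000):
    arm = rng.randrange(3)
    reward = 1 if rng.random() < click_prob[arm] else 0
    logged_events.append((None, arm, reward))

estimate, matched = replay_evaluate(EpsilonGreedy(n_arms=3), logged_events)
print(f"estimated click rate: {estimate:.3f} over {matched} matched events")
```

With three arms logged uniformly at random, roughly one event in three is expected to match, so the log must be considerably larger than the number of evaluation steps wanted.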


A Contextual-Bandit Approach to Personalized News Article Recommendation

arXiv.org Artificial Intelligence

Personalized web services strive to adapt their services (advertisements, news articles, etc.) to individual users by making use of both content and user information. Despite a few recent advances, this problem remains challenging for at least two reasons. First, web services feature dynamically changing pools of content, rendering traditional collaborative filtering methods inapplicable. Second, the scale of most web services of practical interest calls for solutions that are fast in both learning and computation. In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based on contextual information about the users and articles, while simultaneously adapting its article-selection strategy based on user-click feedback to maximize total user clicks. The contributions of this work are three-fold. First, we propose a new, general contextual bandit algorithm that is computationally efficient and well motivated from learning theory. Second, we argue that any bandit algorithm can be reliably evaluated offline using previously recorded random traffic. Finally, using this offline evaluation method, we successfully applied our new algorithm to a Yahoo! Front Page Today Module dataset containing over 33 million events. Results showed a 12.5% click lift compared to a standard context-free bandit algorithm, and the advantage becomes even greater when data becomes scarcer.