HP Labs
Exploiting Burstiness in Reviews for Review Spammer Detection
Fei, Geli (The University of Illinois at Chicago) | Mukherjee, Arjun (The University of Illinois at Chicago) | Liu, Bing (The University of Illinois at Chicago) | Hsu, Meichun (HP Labs) | Castellanos, Malu (HP Labs) | Ghosh, Riddhiman (HP Labs)
Online product reviews have become an important source of user opinions. Motivated by profit or fame, imposters have been writing deceptive or fake reviews to promote or demote target products or services. Such imposters are called review spammers. In the past few years, several approaches have been proposed to deal with the problem. In this work, we take a different approach, which exploits the bursty nature of reviews to identify review spammers. Bursts of reviews can be due either to the sudden popularity of a product or to spam attacks. Reviewers and reviews appearing in a burst are often related, in the sense that spammers tend to work with other spammers and genuine reviewers tend to appear together with other genuine reviewers. This paves the way for us to build a network of reviewers appearing in different bursts. We then model reviewers and their co-occurrence in bursts as a Markov Random Field (MRF), and employ the Loopy Belief Propagation (LBP) method to infer whether each reviewer in the graph is a spammer. We also propose several features and employ feature-induced message passing in the LBP framework for network inference. We further propose a novel evaluation method that assesses the detected spammers automatically using supervised classification of their reviews. Additionally, we employ domain experts to perform a human evaluation of the identified spammers and non-spammers. Both the classification results and the human evaluation results show that the proposed method outperforms strong baselines, which demonstrates the effectiveness of the method.
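The inference step described in this abstract can be illustrated with a minimal sketch of sum-product loopy belief propagation on a small pairwise MRF with two states per reviewer (0 = non-spammer, 1 = spammer). This is not the authors' implementation: the function name `loopy_bp`, the prior values, and the compatibility matrix are all assumptions chosen for illustration, and the feature-induced message passing mentioned in the abstract is not modeled.

```python
def loopy_bp(priors, edges, compat, iters=20):
    """Sum-product loopy BP on a pairwise MRF with binary states.

    priors: {node: [p_nonspammer, p_spammer]}  (illustrative values, not from the paper)
    edges:  list of undirected (u, v) pairs, e.g. reviewers sharing a burst
    compat: 2x2 list, compat[a][b] = affinity of neighboring states a and b
    """
    neighbors = {n: [] for n in priors}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # msgs[(u, v)] is the message from u to v, indexed by v's state.
    msgs = {(u, v): [1.0, 1.0] for u in priors for v in neighbors[u]}

    for _ in range(iters):
        new_msgs = {}
        for (u, v) in msgs:
            out = []
            for sv in (0, 1):
                total = 0.0
                for su in (0, 1):
                    prod = priors[u][su] * compat[su][sv]
                    # Multiply in messages from u's other neighbors.
                    for w in neighbors[u]:
                        if w != v:
                            prod *= msgs[(w, u)][su]
                    total += prod
                out.append(total)
            z = sum(out)
            new_msgs[(u, v)] = [x / z for x in out]  # normalize for stability
        msgs = new_msgs  # synchronous update

    # Belief = prior times all incoming messages, normalized.
    beliefs = {}
    for n in priors:
        b = list(priors[n])
        for w in neighbors[n]:
            for s in (0, 1):
                b[s] *= msgs[(w, n)][s]
        z = sum(b)
        beliefs[n] = [x / z for x in b]
    return beliefs


# Hypothetical example: three reviewers who co-occur in bursts (a triangle,
# so the graph has a loop). Reviewer A has a spam-leaning prior; B and C
# start out uninformative.
priors = {"A": [0.2, 0.8], "B": [0.5, 0.5], "C": [0.5, 0.5]}
edges = [("A", "B"), ("B", "C"), ("A", "C")]
compat = [[0.7, 0.3], [0.3, 0.7]]  # assumed homophily: like tends to neighbor like
beliefs = loopy_bp(priors, edges, compat)
```

Under the assumed homophily potential, the spam-leaning prior on A pulls the beliefs of B and C toward the spammer state, which mirrors the abstract's intuition that spammers tend to appear together in bursts.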
Fine-Grained Photovoltaic Output Prediction Using a Bayesian Ensemble
Chakraborty, Prithwish (Virginia Tech) | Marwah, Manish (HP Labs) | Arlitt, Martin (HP Labs) | Ramakrishnan, Naren (Virginia Tech)
Local and distributed power generation is increasingly reliant on renewable power sources, e.g., solar (photovoltaic or PV) and wind energy. The integration of such sources into the power grid is challenging, however, due to their variable and intermittent energy output. To effectively use them on a large scale, it is essential to be able to predict power generation at a fine-grained level. We describe a novel Bayesian ensemble methodology involving three diverse predictors. Each predictor estimates mixing coefficients for integrating PV generation output profiles but captures fundamentally different characteristics. Two of them employ classical parameterized (naive Bayes) and non-parametric (nearest neighbor) methods to model the relationship between weather forecasts and PV output. The third predictor captures the sequentiality implicit in PV generation and uses motifs mined from historical data to estimate the most likely mixture weights using a stream prediction methodology. We demonstrate the success and superiority of our methods on real PV data from two locations that exhibit diverse weather conditions. Predictions from our model can be harnessed to optimize scheduling of delay tolerant workloads, e.g., in a data center.
The Pulse of News in Social Media: Forecasting Popularity
Bandari, Roja (University of California Los Angeles) | Asur, Sitaram (HP Labs) | Huberman, Bernardo A (HP Labs)
News articles are extremely time sensitive by nature. There is also intense competition among news items to propagate as widely as possible. Hence, the task of predicting the popularity of news items on the social web is both interesting and challenging. Prior research has dealt with predicting eventual online popularity based on early popularity. It is most desirable, however, to predict the popularity of items prior to their release, which makes it possible to modify an article and the manner of its publication before it goes out. In this paper, we construct a multi-dimensional feature space derived from properties of an article and evaluate the efficacy of these features as predictors of online popularity. We examine both regression and classification algorithms and demonstrate that, despite randomness in human behavior, it is possible to predict ranges of popularity on Twitter with an overall 84% accuracy. Our study also serves to illustrate the differences between traditionally prominent sources and those immensely popular on the social web.
AAAI 2008 Spring Symposia Reports
Balduccini, Marcello (Eastman Kodak Company) | Baral, Chitta (Arizona State University) | Brodaric, Boyan (Geological Survey of Canada) | Colton, Simon (Imperial College, London) | Fox, Peter (National Center for Atmospheric Research) | Gutelius, David (SRI International) | Hinkelmann, Knut (University of Applied Sciences Northwestern Switzerland) | Horswill, Ian (Northwestern University) | Huberman, Bernardo (HP Labs) | Hudlicka, Eva (Psychometrix Associates) | Lerman, Kristina (USC Information Sciences Institute) | Lisetti, Christine (Florida International University) | McGuinness, Deborah L. (Rensselaer Polytechnic Institute) | Maher, Mary Lou (National Science Foundation) | Musen, Mark A. (Stanford University) | Sahami, Mehran (Stanford University) | Sleeman, Derek (University of Aberdeen) | Thönssen, Barbara (University of Applied Sciences Northwestern Switzerland) | Velasquez, Juan D. (MIT CSAIL) | Ventura, Dan (Brigham Young University)
The Association for the Advancement of Artificial Intelligence (AAAI) was pleased to present the AAAI 2008 Spring Symposium Series, held Wednesday through Friday, March 26–28, 2008 at Stanford University, California. The titles of the eight symposia were as follows: (1) AI Meets Business Rules and Process Management, (2) Architectures for Intelligent Theory-Based Agents, (3) Creative Intelligent Systems, (4) Emotion, Personality, and Social Behavior, (5) Semantic Scientific Knowledge Integration, (6) Social Information Processing, (7) Symbiotic Relationships between Semantic Web and Knowledge Engineering, (8) Using AI to Motivate Greater Participation in Computer Science. The goal of the AI Meets Business Rules and Process Management symposium was to investigate the various approaches and standards for representing business rules, business process management, and the semantic web with respect to expressiveness and reasoning capabilities. The focus of the Architectures for Intelligent Theory-Based Agents symposium was the definition of architectures for intelligent theory-based agents, comprising languages, knowledge representation methodologies, reasoning algorithms, and control loops. The Creative Intelligent Systems symposium included five major discussion sessions and a general poster session (in which all contributing papers were presented). The purpose of this symposium was to explore the synergies between creative cognition and intelligent systems. The goal of the Emotion, Personality, and Social Behavior symposium was to examine fundamental issues in affect and personality in both biological and artificial agents, focusing on the roles of these factors in mediating social behavior. The Semantic Scientific Knowledge Integration symposium aimed to bring together the semantic technologies community and the scientific information technology community in an effort to build the general semantic science information community. The goal of the Social Information Processing symposium was to investigate computational and analytic approaches that will enable users to harness the efforts of large numbers of other users to solve a variety of information processing problems, from discovering high-quality content to managing common resources. The goal of the Symbiotic Relationships between Semantic Web and Knowledge Engineering symposium was to explore how the lessons learned by the knowledge-engineering community over the past three decades could be applied to the bold research agenda of current workers in semantic web technologies. The purpose of the Using AI to Motivate Greater Participation in Computer Science symposium was to identify ways that topics in AI may be used to motivate greater student participation in computer science by highlighting fun, engaging, and intellectually challenging developments in AI-related curriculum at a number of educational levels. Technical reports of the symposia were published by AAAI Press.