Media
BPR: Bayesian Personalized Ranking from Implicit Feedback
Rendle, Steffen, Freudenthaler, Christoph, Gantner, Zeno, Schmidt-Thieme, Lars
Item recommendation is the task of predicting a personalized ranking on a set of items (e.g. websites, movies, products). In this paper, we investigate the most common scenario with implicit feedback (e.g. clicks, purchases). There are many methods for item recommendation from implicit feedback like matrix factorization (MF) or adaptive k-nearest-neighbor (kNN). Even though these methods are designed for the item prediction task of personalized ranking, none of them is directly optimized for ranking. In this paper we present a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem. We also provide a generic learning algorithm for optimizing models with respect to BPR-Opt. The learning method is based on stochastic gradient descent with bootstrap sampling. We show how to apply our method to two state-of-the-art recommender models: matrix factorization and adaptive kNN. Our experiments indicate that for the task of personalized ranking our optimization method outperforms the standard learning techniques for MF and kNN. The results show the importance of optimizing models for the right criterion.
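As a rough illustration of the idea in this abstract, the sketch below fits a small matrix factorization model by stochastic gradient ascent on a pairwise log-sigmoid ranking criterion with an L2 penalty, sampling (user, observed item, unobserved item) triples uniformly at random. The function names (`train_bpr_mf`, `score`), the hyperparameter values and the toy data are assumptions made for this example, not taken from the paper.

```python
import math
import random


def score(W, H, u, i):
    """Predicted preference of user u for item i (dot product of factors)."""
    return sum(wu * hi for wu, hi in zip(W[u], H[i]))


def train_bpr_mf(observed, n_users, n_items, dim=8, lr=0.05, reg=0.01,
                 steps=5000, seed=0):
    """Fit user factors W and item factors H on sampled ranking triples.

    observed maps each user id to the set of item ids with implicit feedback.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    H = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    # Only users with at least one observed and one unobserved item can
    # contribute a (positive, negative) pair.
    positives = {u: sorted(items) for u, items in observed.items()
                 if 0 < len(items) < n_items}
    users = sorted(positives)
    for _ in range(steps):
        # Draw a random triple: user u, observed item i, unobserved item j.
        u = rng.choice(users)
        i = rng.choice(positives[u])
        j = rng.randrange(n_items)
        while j in observed[u]:
            j = rng.randrange(n_items)
        # x_uij > 0 means the model already ranks i above j for user u.
        x_uij = score(W, H, u, i) - score(W, H, u, j)
        x_uij = max(-35.0, min(35.0, x_uij))  # keep exp() in a safe range
        g = 1.0 / (1.0 + math.exp(x_uij))     # gradient of ln sigmoid(x_uij)
        for f in range(dim):
            wu, hi, hj = W[u][f], H[i][f], H[j][f]
            W[u][f] += lr * (g * (hi - hj) - reg * wu)
            H[i][f] += lr * (g * wu - reg * hi)
            H[j][f] += lr * (-g * wu - reg * hj)
    return W, H


# Toy data: user 0 interacted with items 0 and 1, user 1 with items 2 and 3.
toy = {0: {0, 1}, 1: {2, 3}}
W, H = train_bpr_mf(toy, n_users=2, n_items=4)
ranking_user0 = sorted(range(4), key=lambda i: -score(W, H, 0, i))
```

On the toy data, `ranking_user0` should place items 0 and 1 ahead of items 2 and 3, since the criterion only rewards the relative order of observed and unobserved items rather than absolute scores.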
A Novel Method For Speech Segmentation Based On Speakers' Characteristics
Abdolali, Behrouz, Sameti, Hossein
Speech segmentation is the process of change point detection for partitioning an input audio stream into regions, each of which corresponds to only one audio source or one speaker. One application of this process is in speaker diarization systems. There are several methods for speaker segmentation; however, most speaker diarization systems use BIC-based segmentation methods. The main goal of this paper is to propose a new method for speaker segmentation with higher speed than the current methods (e.g. BIC) and acceptable accuracy. Our proposed method is based on the pitch frequency of the speech. The accuracy of this method is similar to the accuracy of common speaker segmentation methods. However, its computation cost is much less than theirs. We show that our method is about 2.4 times faster than the BIC-based method, while the average accuracy of the pitch-based method is slightly higher than that of the BIC-based method.
The Discrete Infinite Logistic Normal Distribution
Paisley, John, Wang, Chong, Blei, David
We present the discrete infinite logistic normal distribution (DILN), a Bayesian nonparametric prior for mixed membership models. DILN is a generalization of the hierarchical Dirichlet process (HDP) that models correlation structure between the weights of the atoms at the group level. We derive a representation of DILN as a normalized collection of gamma-distributed random variables, and study its statistical properties. We consider applications to topic modeling and derive a variational inference algorithm for approximate posterior inference. We study the empirical performance of the DILN topic model on four corpora, comparing performance with the HDP and the correlated topic model (CTM). To deal with large-scale data sets, we also develop an online inference algorithm for DILN and compare with online HDP and online LDA on a corpus of approximately 350,000 articles from Nature magazine.
Leveraging Usage Data for Linked Data Movie Entity Summarization
Thalhammer, Andreas, Toma, Ioan, Roa-Valverde, Antonio, Fensel, Dieter
Novel research in the field of Linked Data focuses on the problem of entity summarization. This field addresses the problem of ranking features according to their importance for the task of identifying a particular entity. Besides enabling a more human-friendly presentation, these summarizations can play a central role for semantic search engines and semantic recommender systems. Current approaches attempt entity summarization based on patterns that are inherent to the data under consideration. The approach proposed in this paper focuses on the movie domain. It utilizes usage data in order to support measuring the similarity between movie entities. Using this similarity it is possible to determine the k-nearest neighbors of an entity. This leads to the idea that features that entities share with their nearest neighbors can be considered as significant or important for these entities. Additionally, we introduce a downgrading factor (similar to TF-IDF) in order to overcome the high number of commonly occurring features. We exemplify the approach based on a movie-ratings dataset that has been linked to Freebase entities.
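The pipeline described in this abstract can be sketched roughly as follows: compute a usage-based similarity between entities, pick the k nearest neighbors, count how many neighbors share each feature, and scale that count by an IDF-like factor. The abstract does not state which similarity measure or downgrading formula is used, so the Jaccard similarity over rater sets, the `log(N / df)` factor, the function names and the toy data below are all assumptions made for illustration.

```python
import math


def jaccard(a, b):
    """Set overlap in [0, 1]; stands in for the unspecified usage similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(entity, users_by_entity, features_by_entity, k=2, top_n=3):
    """Rank an entity's features by how many of its k nearest neighbors share them."""
    # 1. k nearest neighbors according to usage data (who rated what).
    others = [e for e in users_by_entity if e != entity]
    others.sort(key=lambda e: jaccard(users_by_entity[entity], users_by_entity[e]),
                reverse=True)
    neighbors = others[:k]

    # 2. Document frequency of every feature across all entities.
    n_entities = len(features_by_entity)
    df = {}
    for feats in features_by_entity.values():
        for f in feats:
            df[f] = df.get(f, 0) + 1

    # 3. Shared-with-neighbors count, downgraded for very common features
    #    (assumed IDF-like factor; the paper's exact formula may differ).
    scores = {}
    for f in features_by_entity[entity]:
        shared = sum(1 for e in neighbors if f in features_by_entity[e])
        scores[f] = shared * math.log(n_entities / df[f])
    return sorted(scores, key=lambda f: (-scores[f], f))[:top_n]


# Hypothetical toy data: four movies, the users who rated them, their features.
users = {"A": {1, 2, 3}, "B": {1, 2, 3, 4}, "C": {2, 3}, "D": {9}}
features = {
    "A": {"type:film", "director:X", "genre:scifi"},
    "B": {"type:film", "director:X", "genre:scifi"},
    "C": {"type:film", "genre:scifi"},
    "D": {"type:film", "genre:drama"},
}
summary = summarize("A", users, features)
```

In this toy setup the feature `type:film` is shared with both neighbors but occurs on every entity, so the downgrading factor pushes it to the bottom of the summary, which is the effect the abstract attributes to the TF-IDF-like term.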
Sifu: Interactive Crowd-Assisted Language Learning
Chan, Cheng-wei (National Taiwan University) | Hsu, Jane Yung-jen (National Taiwan University)
This paper introduces SIFU, a system that recruits native speakers in real time as online volunteer tutors to help answer questions from Chinese language learners reading news articles. SIFU integrates the strengths of two effective online language learning methods: reading online news and communicating with online native speakers. SIFU recruits volunteers from an online social network rather than recruiting workers from Amazon Mechanical Turk. Initial experiments showed that the proposed approach is able to effectively recruit online volunteer tutors, adequately answer the learners' questions, and efficiently obtain an answer for the learner. Our field deployment illustrates that SIFU is very useful in assisting Chinese learners in reading Chinese news articles and that online volunteer tutors are willing to help Chinese learners when they are on a social network service.
Using Web Services and Policies within a Social Platform to Support Collaborative Research
Pignotti, Edoardo (University of Aberdeen) | Edwards, Peter (University of Aberdeen)
In this paper we present an architecture for provenance policies which can be used to describe and enact behavioural constraints in a system in order to ensure compliance with user and organisational policies. We discuss how this architecture has been used in order to manage the behaviour of the services powering an existing virtual research environment while reasoning about the relationships between users, their social network, their roles in a project, their groups and the provenance of research data.
Personalisation of Social Web Services in the Enterprise Using Spreading Activation for Multi-Source, Cross-Domain Recommendations
Heitmann, Benjamin (National University of Ireland, Galway) | Dabrowski, Maciej (National University of Ireland, Galway) | Passant, Alexandre (National University of Ireland, Galway) | Hayes, Conor (National University of Ireland, Galway) | Griffin, Keith (Cisco Systems)
Existing personalisation approaches, such as collaborative filtering or content based recommendations, are highly dependent on the domain and/or the source of the data. Therefore, there is a need for more accurate means to capture and model the interests of the user across domains, and to interlink them in a semantically-enhanced interest graph. We propose a new approach for multi-source, cross-genre recommendations that can exploit the heterogeneous nature of user profile data, which has been aggregated from multiple personalised web services, such as blogs, wikis and microblogs. Our approach is based on the Spreading Activation model that exploits intrinsic links between entities across a number of data sources. The proposed method is highly customizable and applicable both to generic and specific recommendation scenarios and use cases. With the growing number of Social Web applications in the enterprise (blogs, wikis, micro blogging, etc.), it becomes difficult for knowledge workers to avoid content overload and to quickly identify relevant people, communities and information. We demonstrate the application of our approach in an industrial use case that involves recommendation of social semantic data across multiple services in a distributed collaborative environment.
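To make the Spreading Activation idea concrete, here is a minimal generic sketch: activation starts at nodes taken from a user's profile and flows along weighted links for a fixed number of steps, with a decay applied at each hop, and the most activated non-profile nodes become recommendations. The abstract does not specify the authors' variant, so the decay, step count, threshold, node names and both function names are assumptions of this example.

```python
def spread_activation(graph, seeds, decay=0.5, steps=2, threshold=0.01):
    """Propagate activation from seed nodes through a weighted directed graph.

    graph maps node -> {neighbor: link weight}; seeds maps node -> initial energy.
    Each node splits its (decayed) energy among neighbors in proportion to weight.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = {}
        for node, energy in frontier.items():
            neighbors = graph.get(node, {})
            if not neighbors or energy < threshold:
                continue  # dead end, or too little energy left to pass on
            total = sum(neighbors.values())
            for nb, weight in neighbors.items():
                passed = energy * decay * weight / total
                next_frontier[nb] = next_frontier.get(nb, 0.0) + passed
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


def recommend(graph, seeds, top_n=3, **kwargs):
    """Rank nodes that are not already in the user's profile by activation."""
    activation = spread_activation(graph, seeds, **kwargs)
    candidates = [n for n in activation if n not in seeds]
    return sorted(candidates, key=lambda n: (-activation[n], n))[:top_n]


# Hypothetical interest graph linking items from several services.
graph = {
    "blog:semantic-web": {"topic:linked-data": 1.0},
    "wiki:rdf": {"topic:linked-data": 1.0, "topic:xml": 1.0},
    "topic:linked-data": {"microblog:sparql-tips": 1.0},
}
profile = {"blog:semantic-web": 1.0, "wiki:rdf": 1.0}
activation = spread_activation(graph, profile)
suggestions = recommend(graph, profile)
```

Because activation crosses whatever links exist in the graph, a microblog post can be suggested on the basis of blog and wiki activity, which is the multi-source behaviour the abstract describes.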
SNARE: Social Network Analysis and Reasoning Environment
Riecken, Doug (Columbia University) | Raja, Anita (University of North Carolina Charlotte/Columbia University) | Passonneau, Rebecca J. (Columbia University) | Waltz, David L. (Columbia University)
The importance of diversity in reasoning and learning to successfully address complex problems is examined. We discuss an approach by which a multiagent framework with decentralized control mechanisms provides diverse perspectives and hypotheses addressing a class of complex problems. We introduce the SNARE multiagent system. SNARE performs tasks to gain situational awareness of situations of interest in a Social Media Space. It applies a decentralized control mechanism for each agent; this mechanism enables an agent to interact with other agents to reason and learn. This approach facilitates dynamic agent organizations that adapt the topologies of interactions between agents based on the problem context.
The Pulse of News in Social Media: Forecasting Popularity
Bandari, Roja (University of California Los Angeles) | Asur, Sitaram (HP Labs) | Huberman, Bernardo A (HP Labs)
News articles are extremely time sensitive by nature. There is also intense competition among news items to propagate as widely as possible. Hence, the task of predicting the popularity of news items on the social web is both interesting and challenging. Prior research has dealt with predicting eventual online popularity based on early popularity. It is most desirable, however, to predict the popularity of items prior to their release, fostering the possibility of appropriate decision making to modify an article and the manner of its publication. In this paper, we construct a multi-dimensional feature space derived from properties of an article and evaluate the efficacy of these features to serve as predictors of online popularity. We examine both regression and classification algorithms and demonstrate that despite randomness in human behavior, it is possible to predict ranges of popularity on Twitter with an overall 84% accuracy. Our study also serves to illustrate the differences between traditionally prominent sources and those immensely popular on the social web.
Social Media and Citizen Engagement in a City-State: A Study of Singapore
Skoric, Marko M. (Nanyang Technological University) | Pan, Ji (Nanyang Technological University) | Poor, Nathaniel D (Independent Scholar)
Social media plays an important role in the process of political engagement, especially in societies where significant constraints over traditional media and participation still exist. Little is known about how social media use is related to these constraints. This study examines how citizens’ perceptions of government control predict social media use and how this use is related to offline participation in the context of a city-state, Singapore. Based on a national survey of 2000 respondents, we found that perceptions of control over traditional media and political activity increase content production on social media and that perceived control of the mass media motivates citizens to consume political content on social media. Interestingly, perceptions of government control over the Internet reduced rather than increased social media production. More importantly, we find that social media use is related to a greater likelihood of offline citizen participation, namely attendance of political rallies. The findings suggest that social media alters the balance of power in the dependency relationships that exist between the government, media organizations and citizens, creating new venues for online political discourse which in turn help promote real-world political participation.