seed user
Co-exposure Maximization in Online Social Networks
Social media has created new ways for citizens to stay informed on societal matters and participate in political discourse. However, with its algorithmically curated and virally propagating content, social media has contributed further to the polarization of opinions by reinforcing users' existing viewpoints. An emerging line of research seeks to understand how content-recommendation algorithms can be redesigned to mitigate societal polarization amplified by social-media interactions. In this paper, we study the problem of allocating seed users to opposing campaigns: drawing on the equal-time rule of political campaigning on traditional media, our goal is to allocate seed users to campaigners with the aim of maximizing the expected number of users who are co-exposed to both campaigns. We show that the problem of maximizing co-exposure is NP-hard and that its objective function is neither submodular nor supermodular. However, by exploiting a connection to a submodular function that acts as a lower bound on the objective, we are able to devise a greedy algorithm with a provable approximation guarantee. We further provide a scalable instantiation of our approximation algorithm by introducing a novel extension to the notion of random reverse-reachable sets for efficiently estimating the expected co-exposure. We experimentally demonstrate the quality of our proposal on real-world social networks.
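To make the objective concrete, the following is a minimal sketch of how expected co-exposure could be estimated by plain Monte Carlo simulation. It is not the reverse-reachable-set estimator the abstract describes. It assumes an independent cascade diffusion model, assumes the two campaigns spread independently of each other, and counts seed users as exposed; the abstract confirms none of these choices. The function names and the adjacency-list graph format are hypothetical.

```python
import random


def simulate_cascade(graph, seeds, rng):
    """Run one independent-cascade simulation and return the exposed nodes.

    graph maps a node to a list of (neighbor, probability) pairs.
    Assumption: seed nodes count as exposed.
    """
    exposed = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # Each newly exposed node gets one chance per outgoing edge.
                if v not in exposed and rng.random() < p:
                    exposed.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return exposed


def estimate_co_exposure(graph, seeds_a, seeds_b, runs=1000, seed=0):
    """Average, over simulations, the number of nodes reached by both campaigns.

    Assumption: the two campaigns propagate independently of each other.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        exposed_a = simulate_cascade(graph, seeds_a, rng)
        exposed_b = simulate_cascade(graph, seeds_b, rng)
        total += len(exposed_a & exposed_b)
    return total / runs


# Toy graph with deterministic edges: both campaigns reach "c" and "d".
toy_graph = {
    "a": [("c", 1.0)],
    "b": [("c", 1.0)],
    "c": [("d", 1.0)],
}
co_exposure = estimate_co_exposure(toy_graph, {"a"}, {"b"}, runs=10)
```

A greedy allocation could call such an estimator once per candidate seed, which is why a faster estimator matters at scale.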
Diffusion Model Agnostic Social Influence Maximization in Hyperbolic Space
The Influence Maximization (IM) problem aims to find a small set of influential users to maximize their influence spread in a social network. Traditional methods rely on fixed diffusion models with known parameters, limiting their generalization to real-world scenarios. In contrast, graph representation learning-based methods have gained wide attention for overcoming this limitation by learning user representations to capture influence characteristics. However, existing studies are built on Euclidean space, which fails to effectively capture the latent hierarchical features of social influence distribution. As a result, users' influence spread cannot be effectively measured through the learned representations. To alleviate these limitations, we propose HIM, a novel diffusion model agnostic method that leverages hyperbolic representation learning to estimate users' potential influence spread from social propagation data. HIM consists of two key components. First, a hyperbolic influence representation module encodes influence spread patterns from network structure and historical influence activations into expressive hyperbolic user representations. Hence, the influence magnitude of users can be reflected through the geometric properties of hyperbolic space, where highly influential users tend to cluster near the space origin. Second, a novel adaptive seed selection module is developed to flexibly and effectively select seed users using the positional information of learned user representations. Extensive experiments on five network datasets demonstrate the superior effectiveness and efficiency of our method for the IM problem with unknown diffusion model parameters, highlighting its potential for large-scale real-world social networks.
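The geometric intuition that influential users sit near the origin can be illustrated with a short sketch. It assumes embeddings in the Poincaré ball with curvature -1, where the distance from the origin to a point x is 2·artanh(‖x‖). The abstract does not specify the hyperbolic model, and the top-k rule below is only a stand-in for HIM's adaptive seed selection module. The function names and sample embeddings are hypothetical.

```python
import math


def poincare_distance_to_origin(x):
    """Hyperbolic distance from the origin to x in the unit Poincaré ball.

    Assumption: curvature -1, so d(0, x) = 2 * artanh(||x||).
    """
    norm = math.sqrt(sum(c * c for c in x))
    if norm >= 1.0:
        raise ValueError("point must lie strictly inside the unit ball")
    return 2.0 * math.atanh(norm)


def select_seeds_by_origin_distance(embeddings, k):
    """Return the k users whose embeddings lie closest to the origin.

    A stand-in selection rule, not the adaptive module from the abstract.
    """
    ranked = sorted(
        embeddings,
        key=lambda user: poincare_distance_to_origin(embeddings[user]),
    )
    return ranked[:k]


# Hypothetical learned embeddings for three users.
user_embeddings = {
    "u1": (0.1, 0.0),
    "u2": (0.5, 0.5),
    "u3": (0.0, 0.2),
}
seeds = select_seeds_by_origin_distance(user_embeddings, k=2)
```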
Reframing Audience Expansion through the Lens of Probability Density Estimation
Audience expansion is a methodology developed by ad-serving platforms to help advertisers find the best-matched audiences for their ads without looking into audience specifics. The rationale is that if you advertise to people who are similar to those who already like the product or service you want to sell, the conversion rate is likely to be high. By leveraging this methodology, advertisers can reach their ideal leads simply by uploading a list of reference individuals, also known as a seed audience, to the platform. The platform then expands this seed to an audience of the desired size, typically resulting in a significant reduction in customer acquisition costs compared to other targeting strategies. From a machine learning perspective, a sound strategy for expanding a seed audience is to frame the problem as a binary classification task [Qu et al., 2014, Shen et al., 2015, Liu et al., 2016, Ma et al., 2016b,a]. Essentially, this involves creating a two-class labeled training set, consisting of seed users and non-seed users, and then training a probabilistic classifier, e.g., Logistic Regression [Jiang et al., 2019], to distinguish between the two classes. But instead of generating class predictions, the goal is to estimate the conditional probability that a given user belongs to the positive class. This probability is used to prioritize users for the expanded audience.
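The classification framing described above can be sketched in a few lines. This is a minimal illustration, assuming dense numeric user features and a logistic regression trained by batch gradient descent in plain Python. The feature values, hyperparameters, and function names are hypothetical and are not taken from the cited works.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(features, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def expand_audience(weights, bias, candidates, size):
    """Rank candidates by estimated P(seed | features) and keep the top ones."""
    scores = {
        user: sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for user, x in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:size]


# Hypothetical training set: label 1 for seed users, 0 for non-seed users.
train_x = [[1.0, 0.9], [0.9, 1.0], [0.0, 0.1], [0.1, 0.0]]
train_y = [1, 1, 0, 0]
weights, bias = train_logistic_regression(train_x, train_y)

candidates = {"near": [0.95, 0.95], "mid": [0.5, 0.5], "far": [0.05, 0.05]}
expanded = expand_audience(weights, bias, candidates, size=2)
```

The ranking uses the estimated probability itself rather than a thresholded class label, matching the prioritization step described above.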
Exploring 360-Degree View of Customers for Lookalike Modeling
Rahman, Md Mostafizur, Kikuta, Daisuke, Abrol, Satyen, Hirate, Yu, Suzumura, Toyotaro, Loyola, Pablo, Ebisu, Takuma, Kondapaka, Manoj
Lookalike models are based on the assumption that user similarity plays an important role in selling products and enhancing existing advertising campaigns across a very large user base. The challenges associated with these models lie in the heterogeneity of the user base and its sparsity. In this work, we propose a novel framework that unifies customers' different behaviors or features, such as demographics, buying behaviors on different platforms, and customer loyalty behaviors, and builds a lookalike model to improve customer targeting for Rakuten Group, Inc. Extensive experiments on real e-commerce and travel datasets demonstrate the effectiveness of our proposed lookalike model for the user targeting task.
Real-time Attention Based Look-alike Model for Recommender System
Liu, Yudan, Ge, Kaikai, Zhang, Xu, Lin, Leyu
Recently, deep learning models have played an increasingly important role in content recommender systems. However, although recommendation performance has greatly improved, the "Matthew effect" has become increasingly evident. While head content grows more and more popular, much competitive long-tail content struggles to achieve timely exposure because it lacks behavior features. This issue has badly impacted the quality and diversity of recommendations. To solve this problem, a look-alike algorithm is a good choice for extending the audience of high-quality long-tail content. But the traditional look-alike models that are widely used in online advertising are not suitable for recommender systems because of the strict requirements on both real-time operation and effectiveness. This paper introduces a real-time attention-based look-alike model (RALM) for recommender systems, which tackles the conflict between real-time operation and effectiveness. RALM achieves real-time look-alike audience extension through seeds-to-user similarity prediction and improves effectiveness by optimizing user representation learning and look-alike learning. For user representation learning, we propose a novel neural network structure named the attention merge layer to replace the concatenation layer, which significantly improves the expressive ability of multi-field feature learning. In addition, considering the varied members of the seeds, we design a global attention unit and a local attention unit to learn robust and adaptive seed representations with respect to a given target user. Finally, we introduce a seeds clustering mechanism that not only reduces the time complexity of attention unit prediction but also minimizes the loss of seed information. According to our experiments, RALM shows superior effectiveness and performance compared with popular look-alike models.
Catch the Black Sheep: Unified Framework for Shilling Attack Detection Based on Fraudulent Action Propagation
Zhang, Yongfeng (Tsinghua University) | Tan, Yunzhi (Tsinghua University) | Zhang, Min (Tsinghua University) | Liu, Yiqun (Tsinghua University) | Chua, Tat-Seng (National University of Singapore) | Ma, Shaoping (Tsinghua University)
Many e-commerce systems allow users to express their opinions about products through user review systems. User-generated reviews not only help other users gain a more insightful view of the products, but also help online businesses make targeted improvements to their products or services. They are also a key component of various personalized recommender systems. However, spam user accounts in review systems introduce unfavourable disturbances into personalized recommendation by intentionally promoting or degrading targeted items through fraudulent reviews. Previous shilling attack detection algorithms usually deal with a specific kind of attack strategy and struggle to handle the continuously emerging new cheating methods. In this work, we propose to conduct shilling attack detection for more informed recommendation by fraudulent action propagation on the reviews themselves, regardless of the specific underlying cheating strategy, which gives us a unified and flexible framework for detecting spam users.
Aggregating Content and Network Information to Curate Twitter User Lists
Greene, Derek, Sheridan, Gavin, Smyth, Barry, Cunningham, Pádraig
Twitter introduced user lists in late 2009, allowing users to be grouped according to meaningful topics or themes. Lists have since been adopted by media outlets as a means of organising content around news stories. Thus the curation of these lists is important - they should contain the key information gatekeepers and present a balanced perspective on a story. Here we address this list curation process from a recommender systems perspective. We propose a variety of criteria for generating user list recommendations, based on content analysis, network analysis, and the "crowdsourcing" of existing user lists. We demonstrate that these types of criteria are often only successful for datasets with certain characteristics. To resolve this issue, we propose the aggregation of these different "views" of a news story on Twitter to produce more accurate user recommendations to support the curation process.
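One simple way to picture the aggregation of several ranked "views" is a Borda count, sketched below. The abstract does not say which aggregation method the authors use, so Borda count is a stand-in chosen for brevity. The function name, the three example rankings, and the alphabetical tie-breaking rule are all hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine ranked user lists into one ranking using a Borda count.

    In a ranking of length n, the item at position i (0-based) earns n - i
    points. Users missing from a ranking earn nothing from it.
    Assumption: ties are broken alphabetically for determinism.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, user in enumerate(ranking):
            scores[user] += n - position
    return sorted(scores, key=lambda user: (-scores[user], user))


# Hypothetical recommendations from three views of the same news story.
content_view = ["alice", "bob", "carol"]
network_view = ["bob", "alice", "dave"]
list_view = ["bob", "carol", "alice"]

combined = borda_aggregate([content_view, network_view, list_view])
```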