Sidelines: An Algorithm for Increasing Diversity in News and Opinion Aggregators

AAAI Conferences

Aggregators rely on votes and links to select and present subsets of the large quantity of news and opinion items generated each day. Opinion and topic diversity in the output sets can provide individual and societal benefits, but simply selecting the most popular items may not yield as much diversity as is present in the overall pool of votes and links. In this paper, we define three diversity metrics that address different dimensions of diversity: inclusion, non-alienation, and proportional representation. We then present the Sidelines algorithm – which temporarily suppresses a voter’s preferences after a preferred item has been selected – as one approach to increase the diversity of result sets. Compared with collections of the most popular items, drawn from user votes on Digg.com and links from a panel of political blogs, the Sidelines algorithm increased inclusion while decreasing alienation. For the blog links, a set with known political preferences, we also found that Sidelines improved proportional representation. In an online experiment using blog link data as votes, readers were more likely to find something challenging to their views in the Sidelines result sets. These findings can help build news and opinion aggregators that present users with a broader range of topics and opinions.
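
The suppression idea in this abstract can be illustrated with a short sketch. It is not the authors' implementation: the greedy selection loop, the `turns` parameter (how many selection rounds a voter stays suppressed), the alphabetical tie-breaking, and the function names `sidelines` and `most_popular` are all assumptions made for illustration.

```python
from collections import Counter


def most_popular(votes, k):
    """Baseline: the k items with the most votes overall (ties broken by name)."""
    tally = Counter(item for items in votes.values() for item in items)
    return sorted(tally, key=lambda item: (-tally[item], item))[:k]


def sidelines(votes, k, turns=1):
    """Greedy selection that suppresses a voter for `turns` rounds (an assumed
    parameter) after one of that voter's preferred items has been selected.

    votes: dict mapping voter id -> set of items the voter voted for.
    """
    candidates = set()
    for items in votes.values():
        candidates |= set(items)

    sidelined = {}  # voter id -> remaining rounds of suppression
    selected = []
    while len(selected) < min(k, len(candidates)):
        # Count votes only from voters who are not currently sidelined.
        tally = Counter()
        for voter, items in votes.items():
            if sidelined.get(voter, 0) > 0:
                continue
            for item in items:
                if item not in selected:
                    tally[item] += 1

        remaining = candidates - set(selected)
        best = min(remaining, key=lambda item: (-tally[item], item))
        selected.append(best)

        # Voters sidelined in earlier rounds move one round closer to returning.
        for voter in list(sidelined):
            sidelined[voter] -= 1
            if sidelined[voter] <= 0:
                del sidelined[voter]

        # Everyone who voted for the selected item is now sidelined.
        for voter, items in votes.items():
            if best in items:
                sidelined[voter] = turns
    return selected


# Hypothetical data: a three-voter majority bloc and a two-voter minority bloc.
votes = {
    "a1": {"X", "Y"}, "a2": {"X", "Y"}, "a3": {"X", "Y"},
    "b1": {"Z"}, "b2": {"Z"},
}
print(most_popular(votes, 2))
print(sidelines(votes, 2))
```

With this toy data, the popularity baseline fills both slots with the majority bloc's items, while the sketch gives the second slot to the minority bloc's item because the majority voters are suppressed for one round.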


Why Do You Spread This Message? Understanding Users Sentiment in Social Media Campaigns

AAAI Conferences

Twitter has been increasingly used for spreading messages about campaigns. Such campaigns try to gain followers through their Twitter accounts, influence the followers and spread messages through them. In this paper, we explore the relationship between followers’ sentiment towards the campaign topic and their rate of retweeting of messages generated by the campaign. Our analysis of followers of multiple social-media campaigns found statistically significant correlations between such sentiment and retweeting rate. Based on our analysis, we conducted an online intervention study among the followers of different social-media campaigns. Our study shows that targeting followers based on their sentiment towards the campaign can give a higher retweet rate than a number of other baseline approaches.


Encouraging Reading of Diverse Political Viewpoints with a Browser Widget

AAAI Conferences

The Internet gives individuals more choice in political news and information sources and more tools to filter out disagreeable information. Citing the preference described by selective exposure theory (people prefer information that supports their beliefs and avoid counter-attitudinal information), observers warn that people may use these tools to access only agreeable information and thus live in ideological echo chambers. We report on a field deployment of a browser extension that showed users feedback about the political lean of their weekly and all-time reading behaviors. Compared to a control group, showing feedback led to a modest move toward balanced exposure, corresponding to 1-2 visits per week to ideologically opposing sites or 5-10 additional visits per week to centrist sites.


Going Negative Online? -- A Study of Negative Advertising on Social Media

arXiv.org Machine Learning

A growing number of empirical studies suggest that negative advertising is effective in campaigning, but the mechanisms are rarely discussed. With the Cambridge Analytica scandal and Russian intervention behind Brexit and the 2016 presidential election, people have become aware of political ads on social media and have pressured Congress to restrict political advertising on these platforms. Following the related legislation, social media companies began disclosing their political ad archives for transparency during the summer of 2018, when the midterm election campaign was just beginning. This research collects data on political ads related to the U.S. midterm elections since August to study the overall pattern of political ads on social media, and uses a set of machine learning methods to conduct sentiment analysis on these ads and classify the negative ones. A novel approach is applied that uses AI image recognition to study the image data. Through data visualization, this research shows that negative advertising is still in the minority, and that Republican advertisers and third-party organizations are more likely to engage in negative advertising than their counterparts. Based on ordinal regressions, this study finds that anger-evoked information seeking, rather than the negative bias theory, is one of the main mechanisms that make negative ads more engaging and effective. Overall, this study provides a unique understanding of political advertising on social media by applying innovative data science methods. Further studies can extend the findings, methods, and datasets in this study, and several suggestions are given for future research.


Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations

arXiv.org Machine Learning

Crowd-sourcing is a cheap and popular means of creating training and evaluation datasets for machine learning; however, it poses the problem of 'truth inference', as individual workers cannot be wholly trusted to provide reliable annotations. Research into models of annotation aggregation attempts to infer a latent 'true' annotation, which has been shown to improve the utility of crowd-sourced data. However, existing techniques beat simple baselines only in low redundancy settings, where the number of annotations per instance is low ($\le 3$), or in situations where workers are unreliable and produce low quality annotations (e.g., through spamming, random, or adversarial behaviours). As we show, datasets produced by crowd-sourcing are often not of this type: the data is highly redundantly annotated ($\ge 5$ annotations per instance), and the vast majority of workers produce high quality outputs. In these settings, the majority vote heuristic performs very well, and most truth inference models underperform this simple baseline. We propose a novel technique, based on a Bayesian graphical model with conjugate priors and simple iterative expectation-maximisation inference. Our technique performs competitively with the state-of-the-art benchmark methods, and is the only method that significantly outperforms the majority vote heuristic at the one-sided 0.025 level, as shown by significance tests. Moreover, our technique is simple, is implemented in only 50 lines of code, and trains in seconds.
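
For reference, the majority vote baseline that this abstract compares against can be sketched in a few lines. This is only the simple heuristic, not the Bayesian model the paper proposes; the function name `majority_vote`, the input layout, and the alphabetical tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(annotations):
    """Pick the most frequent label for each instance.

    annotations: dict mapping instance id -> list of labels from different workers.
    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    inferred = {}
    for instance, labels in annotations.items():
        counts = Counter(labels)
        top = max(counts.values())
        inferred[instance] = min(label for label, n in counts.items() if n == top)
    return inferred


# Hypothetical data with five annotations per instance (the high-redundancy setting).
annotations = {
    "q1": ["cat", "cat", "dog", "cat", "dog"],
    "q2": ["dog", "dog", "cat", "dog", "dog"],
}
print(majority_vote(annotations))
```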