Goto

Collaborating Authors

 twitter user


DISHONEST: Dissecting misInformation Spread using Homogeneous sOcial NEtworks and Semantic Topic classification

Stam, Caleb, Saldanha, Emily, Halappanavar, Mahantesh, Acharya, Anurag

arXiv.org Artificial Intelligence

The emergence of the COVID-19 pandemic resulted in a significant rise in the spread of misinformation on online platforms such as Twitter. Oftentimes this growth is blamed on the idea of the "echo chamber." However, the behavior said to characterize these echo chambers exists in two dimensions. The first is in a user's social interactions, where they are said to stick with the same clique of like-minded users. The second is in the content of their posts, where they are said to repeatedly espouse homogeneous ideas. In this study, we link the two by using Twitter's network of retweets to study social interactions and topic modeling to study tweet content. In order to measure the diversity of a user's interactions over time, we develop a novel metric to track the speed at which they travel through the social network. The application of these analysis methods to misinformation-focused data from the pandemic demonstrates correlation between social behavior and tweet content. We believe this correlation supports the common intuition about how antisocial users behave, and further suggests that it holds even in subcommunities already rife with misinformation.


Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations

Ren, Zhicheng, Xiao, Zhiping, Sun, Yizhou

arXiv.org Artificial Intelligence

With the rapid development of social media, the importance of analyzing social network user data has also been put on the agenda. User representation learning in social media is a critical area of research, based on which we can conduct personalized content delivery, or detect malicious actors. Being more complicated than many other types of data, social network user data has inherent multimodal nature. Various multimodal approaches have been proposed to harness both text (i.e. post content) and relation (i.e. inter-user interaction) information to learn user embeddings of higher quality. The advent of Graph Neural Network models enables more end-to-end integration of user text embeddings and user interaction graphs in social networks. However, most of those approaches do not adequately elucidate which aspects of the data - text or graph structure information - are more helpful for predicting each specific user under a particular task, putting some burden on personalized downstream analysis and untrustworthy information filtering. We propose a simple yet effective framework called Contribution-Aware Multimodal User Embedding (CAMUE) for social networks. We have demonstrated with empirical evidence, that our approach can provide personalized explainable predictions, automatically mitigating the impact of unreliable information. We also conducted case studies to show how reasonable our results are. We observe that for most users, graph structure information is more trustworthy than text information, but there are some reasonable cases where text helps more. Our work paves the way for more explainable, reliable, and effective social media user embedding which allows for better personalized content delivery.


Global Public Sentiment on Decentralized Finance: A Spatiotemporal Analysis of Geo-tagged Tweets from 150 Countries

Chen, Yuqi, Li, Yifan, Zhou, Kyrie Zhixuan, Fu, Xiaokang, Liu, Lingbo, Bao, Shuming, Sui, Daniel, Zhang, Luyao

arXiv.org Machine Learning

In the digital era, blockchain technology, cryptocurrencies, and non-fungible tokens (NFTs) have transformed financial and decentralized systems. However, existing research often neglects the spatiotemporal variations in public sentiment toward these technologies, limiting macro-level insights into their global impact. This study leverages Twitter data to explore public attention and sentiment across 150 countries, analyzing over 150 million geotagged tweets from 2012 to 2022. Sentiment scores were derived using a BERT-based multilingual sentiment model trained on 7.4 billion tweets. The analysis integrates global cryptocurrency regulations and economic indicators from the World Development Indicators database. Results reveal significant global sentiment variations influenced by economic factors, with more developed nations engaging more in discussions, while less developed countries show higher sentiment levels. Geographically weighted regression indicates that GDP-tweet engagement correlation intensifies following Bitcoin price surges. Topic modeling shows that countries within similar economic clusters share discussion trends, while different clusters focus on distinct topics. This study highlights global disparities in sentiment toward decentralized finance, shaped by economic and regional factors, with implications for poverty alleviation, cryptocurrency crime, and sustainable development. The dataset and code are publicly available on GitHub.


Incentivizing News Consumption on Social Media Platforms Using Large Language Models and Realistic Bot Accounts

Askari, Hadi, Chhabra, Anshuman, von Hohenberg, Bernhard Clemm, Heseltine, Michael, Wojcieszak, Magdalena

arXiv.org Artificial Intelligence

Polarization, declining trust, and wavering support for democratic norms are pressing threats to U.S. democracy. Exposure to verified and quality news may lower individual susceptibility to these threats and make citizens more resilient to misinformation, populism, and hyperpartisan rhetoric. This project examines how to enhance users' exposure to and engagement with verified and ideologically balanced news in an ecologically valid setting. We rely on a large-scale two-week long field experiment (from 1/19/2023 to 2/3/2023) on 28,457 Twitter users. We created 28 bots utilizing GPT-2 that replied to users tweeting about sports, entertainment, or lifestyle with a contextual reply containing two hardcoded elements: a URL to the topic-relevant section of quality news organization and an encouragement to follow its Twitter account. To further test differential effects by gender of the bots, treated users were randomly assigned to receive responses by bots presented as female or male. We examine whether our over-time intervention enhances the following of news media organization, the sharing and the liking of news content and the tweeting about politics and the liking of political content. We find that the treated users followed more news accounts and the users in the female bot treatment were more likely to like news content than the control. Most of these results, however, were small in magnitude and confined to the already politically interested Twitter users, as indicated by their pre-treatment tweeting about politics. These findings have implications for social media and news organizations, and also offer direction for future work on how Large Language Models and other computational interventions can effectively enhance individual on-platform engagement with quality news and public affairs.


DADIT: A Dataset for Demographic Classification of Italian Twitter Users and a Comparison of Prediction Methods

Lupo, Lorenzo, Bose, Paul, Habibi, Mahyar, Hovy, Dirk, Schwarz, Carlo

arXiv.org Artificial Intelligence

Social scientists increasingly use demographically stratified social media data to study the attitudes, beliefs, and behavior of the general public. To facilitate such analyses, we construct, validate, and release publicly the representative DADIT dataset of 30M tweets of 20k Italian Twitter users, along with their bios and profile pictures. We enrich the user data with high-quality labels for gender, age, and location. DADIT enables us to train and compare the performance of various state-of-the-art models for the prediction of the gender and age of social media users. In particular, we investigate if tweets contain valuable information for the task, since popular classifiers like M3 don't leverage them. Our best XLM-based classifier improves upon the commonly used competitor M3 by up to 53% F1. Especially for age prediction, classifiers profit from including tweets as features. We also confirm these findings on a German test set.


Hands on with Windows 11's AI-boosted Automatic Super Resolution

PCWorld

Microsoft appears to have hidden its next AI-powered enhancement within a test build of Windows 11: Automatic Super Resolution, which could be Microsoft's answer to image-enhancement technologies like Nvidia's DLSS. It seems just as likely, however, to be an overarching control, like the Dynamic Lighting feature. Twitter user @PhantomofEarth was among the first to report the new feature this week, hidden away in Windows 11 Build 26052. That build right now appears in both the Canary and Dev Channels of Microsoft's Windows 11 Insider channels, and comes with the caveat that Microsoft may never release this specific feature to the general public. Given its focus on AI, though, it appears likely.


AI incidents and 'networked trouble': The case for a research agenda

Shane, Tommy Shaffer

arXiv.org Artificial Intelligence

Against a backdrop of widespread interest in how publics can participate in the design of AI, I argue for a research agenda focused on AI incidents - examples of AI going wrong and sparking controversy - and how they are constructed in online environments. I take up the example of an AI incident from September 2020, when a Twitter user created a 'horrible experiment' to demonstrate the racist bias of Twitter's algorithm for cropping images. This resulted in Twitter not only abandoning its use of that algorithm, but also disavowing its decision to use any algorithm for the task. I argue that AI incidents like this are a significant means for participating in AI systems that require further research. That research agenda, I argue, should focus on how incidents are constructed through networked online behaviours that I refer to as 'networked trouble', where formats for participation enable individuals and algorithms to interact in ways that others - including technology companies - come to know and come to care about. At stake, I argue, is an important mechanism for participating in the design and deployment of AI.


Characteristics and prevalence of fake social media profiles with AI-generated faces

Yang, Kai-Cheng, Singh, Danishjeet, Menczer, Filippo

arXiv.org Artificial Intelligence

Recent advancements in generative artificial intelligence (AI) have raised concerns about their potential to create convincing fake social media accounts, but empirical evidence is lacking. In this paper, we present a systematic analysis of Twitter(X) accounts using human faces generated by Generative Adversarial Networks (GANs) for their profile pictures. We present a dataset of 1,353 such accounts and show that they are used to spread scams, spam, and amplify coordinated messages, among other inauthentic activities. Leveraging a feature of GAN-generated faces -- consistent eye placement -- and supplementing it with human annotation, we devise an effective method for identifying GAN-generated profiles in the wild. Applying this method to a random sample of active Twitter users, we estimate a lower bound for the prevalence of profiles using GAN-generated faces between 0.021% and 0.044% -- around 10K daily active accounts. These findings underscore the emerging threats posed by multimodal generative AI. We release the source code of our detection method and the data we collect to facilitate further investigation. Additionally, we provide practical heuristics to assist social media users in recognizing such accounts.


Graph Neural Networks for Antisocial Behavior Detection on Twitter

Toshevska, Martina, Kalajdziski, Slobodan, Gievska, Sonja

arXiv.org Artificial Intelligence

Social media resurgence of antisocial behavior has exerted a downward spiral on stereotypical beliefs, and hateful comments towards individuals and social groups, as well as false or distorted news. The advances in graph neural networks employed on massive quantities of graph-structured data raise high hopes for the future of mediating communication on social media platforms. An approach based on graph convolutional data was employed to better capture the dependencies between the heterogeneous types of data. Utilizing past and present experiences on the topic, we proposed and evaluated a graph-based approach for antisocial behavior detection, with general applicability that is both language- and context-independent. In this research, we carried out an experimental validation of our graph-based approach on several PAN datasets provided as part of their shared tasks, that enable the discussion of the results obtained by the proposed solution.


Empowering Fake-News Mitigation: Insights from Sharers' Social Media Post-Histories

Schoenmueller, Verena, Blanchard, Simon J., Johar, Gita V.

arXiv.org Artificial Intelligence

Misinformation is a global concern and limiting its spread is critical for protecting democracy, public health, and consumers. We propose that consumers' own social media post-histories are an underutilized data source to study what leads them to share links to fake-news. In Study 1, we explore how textual cues extracted from post-histories distinguish fake-news sharers from random social media users and others in the misinformation ecosystem. Among other results, we find across two datasets that fake-news sharers use more words related to anger, religion and power. In Study 2, we show that adding textual cues from post-histories improves the accuracy of models to predict who is likely to share fake-news. In Study 3, we provide a preliminary test of two mitigation strategies deduced from Study 1 - activating religious values and reducing anger - and find that they reduce fake-news sharing and sharing more generally. In Study 4, we combine survey responses with users' verified Twitter post-histories and show that using empowering language in a fact-checking browser extension ad increases download intentions. Our research encourages marketers, misinformation scholars, and practitioners to use post-histories to develop theories and test interventions to reduce the spread of misinformation.