 Zielona Góra


Does Twitter know your political views? POLiTweets dataset and semi-automatic method for political leaning discovery

arXiv.org Artificial Intelligence

Every day, the world is flooded by millions of messages and statements posted on Twitter or Facebook. Social media platforms try to protect users' personal data, but there is still a real risk of misuse, including election manipulation. Did you know that only 13 posts addressing topics that are important or controversial for society are enough to predict one's political affiliation with a 0.85 F1-score? To examine this phenomenon, we created a novel universal method of semi-automated political leaning discovery. It relies on a heuristic data annotation procedure, which was evaluated to achieve 0.95 agreement with human annotators (measured as accuracy). We also present POLiTweets - the first publicly available Polish dataset for political affiliation discovery in a multi-party setup, consisting of over 147k tweets from almost 10k Polish-writing users annotated heuristically and almost 40k tweets from 166 users annotated manually as a test set. We used our data to study aspects of domain shift with respect to topics and the type of content author - ordinary citizens vs. professional politicians.
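
The abstract reports agreement between heuristic and human annotation "counted as an accuracy metric". A minimal sketch of what such a computation could look like is given below; the function name, the party labels, and the toy data are all hypothetical and are not taken from the paper or the POLiTweets dataset.

```python
def agreement_accuracy(heuristic_labels, human_labels):
    """Fraction of users whose heuristic label matches the human label.

    Assumption: both lists are aligned, one label per user.
    """
    if len(heuristic_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(h == g for h, g in zip(heuristic_labels, human_labels))
    return matches / len(human_labels)


# Toy, made-up labels for four users (not real data).
heuristic = ["party_a", "party_b", "party_a", "party_c"]
human = ["party_a", "party_b", "party_c", "party_c"]

score = agreement_accuracy(heuristic, human)
print(f"agreement (accuracy): {score:.2f}")
```

Treating agreement as plain accuracy is the simplest choice for a multi-party setup; it does not correct for agreement expected by chance.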


Machine Learning the Phenomenology of COVID-19 From Early Infection Dynamics

arXiv.org Machine Learning

We present a robust data-driven machine learning analysis of the COVID-19 pandemic from its early infection dynamics, specifically infection counts over time. The goal is to extract actionable public health insights. These insights include the infectious force, the rate of a mild infection becoming serious, estimates for asymptomatic infections, and predictions of new infections over time. We focus on US data starting from the first confirmed infection on January 20, 2020. Our methods reveal significant asymptomatic (hidden) infection and a lag of about 10 days, and we quantitatively confirm that the infectious force is strong, with about a 0.14% transition from mild to serious infection. Our methods are efficient, robust, and general, being agnostic to the specific virus and applicable to different populations or cohorts.