Collaborating Authors

 Kooti, Farshad


Ensemble Validation: Selectivity has a Price, but Variety is Free

arXiv.org Machine Learning

If classifiers are selected from a hypothesis class to form an ensemble, bounds on average error rate over the selected classifiers include a component for selectivity, which grows as the fraction of hypothesis classifiers selected for the ensemble shrinks, and a component for variety, which grows with the size of the hypothesis class or in-sample data set. We show that the component for selectivity asymptotically dominates the component for variety, meaning that variety is essentially free.


The DARPA Twitter Bot Challenge

arXiv.org Artificial Intelligence

A number of organizations ranging from terrorist groups such as ISIS to politicians and nation states reportedly conduct explicit campaigns to influence opinion on social media, posing a risk to democratic processes. There is thus a growing need to identify and eliminate "influence bots" - realistic, automated identities that illicitly shape discussion on sites like Twitter and Facebook - before they become too influential. Spurred by such events, DARPA held a 4-week competition in February/March 2015 in which multiple teams supported by the DARPA Social Media in Strategic Communications program competed to identify a set of previously identified "influence bots" serving as ground truth on a specific topic within Twitter. Past work regarding influence bots often has difficulty supporting claims about accuracy, since there is limited ground truth (though some exceptions do exist [3,7]). However, with the exception of [3], no past work has looked specifically at identifying influence bots on a specific topic. This paper describes the DARPA Challenge and the methods used by the three top-ranked teams.


The Emergence of Conventions in Online Social Networks

AAAI Conferences

The way in which social conventions emerge in communities has been of interest to social scientists for decades. Here we report on the emergence of a particular social convention on Twitter—the way to indicate a tweet is being reposted and to attribute the content to its source. Initially, different variations were invented and spread through the Twitter network. The inventors and early adopters were well-connected, active, core members of the Twitter community. The diffusion networks of these conventions were dense and highly clustered, so no single user was critical to the adoption of the conventions. Despite being invented at different times and having different adoption rates, only two variations came to be widely adopted. In this paper we describe this process in detail, highlighting insights and raising questions about how social conventions emerge.