AITopics

Managing large collections of documents is an important problem for many areas of science, industry, and culture. Probabilistic topic modeling offers a promising solution. Topic modeling is an unsupervised machine learning method that learns the underlying themes in a large collection of otherwise unorganized documents. This discovered structure summarizes and organizes the documents. However, topic models are high-level statistical tools—a user must scrutinize numerical distributions to understand and explore their results. In this paper, we present a method for visualizing topic models. Our method creates a navigator of the documents, allowing users to explore the hidden structure that a topic model discovers. These browsing interfaces reveal meaningful patterns in a collection, helping end-users explore and understand its contents in new ways. We provide open source software of our method.

artificial intelligence, corpus, natural language, (15 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

OMG, I Have to Tweet that! A Study of Factors that Influence Tweet Rates

Kıcıman, Emre (Microsoft Research)

Many studies have shown that social data such as tweets are a rich source of information about the real world, including, for example, insights into health trends. A key limitation when analyzing Twitter data, however, is that it depends on people self-reporting their own behaviors and observations. In this paper, we present a large-scale quantitative analysis of some of the factors that influence self-reporting bias. In our study, we compare a year of tweets about weather events to ground-truth knowledge about actual weather occurrences. For each weather event we calculate how extreme, how expected, and how big a change the event represents. We calculate the extent to which these factors can explain the daily variations in tweet rates about weather events. We find that we can build global models that take into account basic weather information, together with extremeness, expectation and change calculations, to account for over 40% of the variability in tweet rates. We build location-specific models (i.e., one model per metropolitan area) that account for an average of 70% of the variability in tweet rates.

health & medicine, social media, tweet, (18 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > California (0.28)
North America > United States > District of Columbia > Washington (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Services (0.88)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)

Social Media Is NOT that Bad! The Lexical Quality of Social Media

Rello, Luz (Universitat Pompeu Fabra) | Baeza-Yates, Ricardo (Yahoo! Research)

There is a strong correlation between spelling errors and web text content quality. Using our lexical quality measure, based on a small corpus of spelling errors, we present an estimate of the lexical quality of the main Social Media sites. This paper presents an updated and complete analysis of the lexical quality of Social Media written in English and Spanish, including how lexical quality changes over time.

artificial intelligence, social media, text processing, (15 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States (0.15)
Europe > Spain (0.15)
Asia > India (0.14)

Genre: Research Report (0.47)

Industry: Information Technology > Services (0.71)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.55)

Using Group Membership Markers for Group Identification

Gawron, Jean Mark (San Diego State University) | Gupta, Dipak (San Diego State University) | Stephens, Kellen (San Diego State University) | Tsou, Ming-Hsiang (San Diego State University) | Spitzberg, Brian (San Diego State University) | An, Li (San Diego State University)

We describe a system for automatically ranking documents by degree of militancy, designed as a tool both for finding militant websites and prioritizing the data found. We compare three ranking systems: one employing a small hand-selected vocabulary based on group membership markers used by insiders to identify members and member properties (us) and outsiders and threats (them), one with a much larger vocabulary, and another with a small vocabulary chosen by Mutual Information. We use the same vocabularies to build classifiers. The ranker that achieves the best correlations with human judgments uses the small us-them vocabulary. We confirm and extend recent results in sentiment analysis (Paltoglou 2010), showing that a feature-weighting scheme taken from classical IR (TFIDF) produces the best ranking system; we also find, surprisingly, that adjusting these weights with SVM training, while producing a better classifier, produces a worse ranker. Increasing vocabulary size similarly improves classification (while worsening ranking).

artificial intelligence, classifier, text processing, (18 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country: North America > United States (0.47)

Industry:

Health & Medicine (1.00)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

A Sentiment-Aware Approach to Community Formation in Social Media

Nguyen, Thin (Deakin University) | Phung, Dinh (Deakin University) | Adams, Brett (Curtin University) | Venkatesh, Svetha (Deakin University)

Participating in a community exemplifies the aspect of sharing, networking and interacting in a social media system. There has been extensive work on characterising on-line communities by their contents and tags using topic modelling tools. However, the role of sentiment and mood has not been studied. Arguably, mood is an integral feature of a text, and becomes more significant in the context of social media: two communities might discuss precisely the same topics, yet within an entirely different atmosphere. Such sentiment-related distinctions are important for many kinds of analysis and applications, such as community recommendation. We present a novel approach to identification of latent hyper-groups in social communities based on users’ sentiment. The results show that a sentiment-based approach can yield useful insights into community formation and meta-communities, having potential applications in, for example, mental health—by targeting support or surveillance to communities with negative mood—or in marketing—by targeting customer communities having the same sentiment on similar topics.

health & medicine, representation, social media, (19 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country: Oceania > Australia (0.29)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Mixed Membership Models for Exploring User Roles in Online Fora

White, Arthur J. (University College Dublin) | Chan, Jeffrey (University of Melbourne) | Hayes, Conor (National University Ireland Galway) | Murphy, Brendan (University College Dublin)

Discussion boards are a form of social media which allow users to discuss topics and exchange information in a complex manner, in a number of different settings. As the popularity of such message boards has increased, communities of users have emerged, and several prominent types of social role have been identified, such as Question Answerer, Celebrity, Discussion Person and Topic Initiator. Recent studies have noted the structural similarity of the egocentric networks of users assigned the same role by qualitative criteria. In this paper a methodology is developed with which to cluster together users with similar egocentric network structures. This is achieved using a mixed membership formulation which allows for the fact that different groups of users may have characteristics in common. The method is then applied to data taken from boards.ie, a medium-sized message board website. Prominent clusters of users are identified and discussed, and illustrative examples of user behaviour are provided. The type of interaction, both locally and globally, taking place within forums is examined.

artificial intelligence, social media, social role, (16 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country: Europe > Ireland (0.29)

Technology:

Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Communications > Social Media (0.69)

Evolution of Experts in Question Answering Communities

Pal, Aditya (University of Minnesota) | Chang, Shuo (University of Minnesota) | Konstan, Joseph A. (University of Minnesota)

Community Question Answering (CQA) services thrive as a result of a small number of highly active users, typically called experts, who provide a large number of high-quality, useful answers. Understanding the temporal dynamics and interactions between experts can present key insights into how community members evolve over time. In this paper, we present a temporal study of experts in CQA and analyze the changes in their behavioral patterns over time. Further, using unsupervised machine learning methods, we show interesting evolution patterns that can help us distinguish experts from one another. Using supervised classification methods, we show that models based on evolutionary data of users can be more effective at expert identification than models that ignore evolution. We run our experiments on two large online CQA services to show the generality of our proposed approach.

artificial intelligence, natural language, ordinary user, (20 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.71)

Modeling Spread of Disease from Social Interactions

Sadilek, Adam (University of Rochester) | Kautz, Henry (University of Rochester) | Silenzio, Vincent (University of Rochester)

Research in computational epidemiology to date has concentrated on coarse-grained statistical analysis of populations, often synthetic ones. By contrast, this paper focuses on fine-grained modeling of the spread of infectious diseases throughout a large real-world social network. Specifically, we study the roles that social ties and interactions between specific individuals play in the progress of a contagion. We focus on public Twitter data, where we find that for every health-related message there are more than 1,000 unrelated ones. This class imbalance makes classification particularly challenging. Nonetheless, we present a framework that accurately identifies sick individuals from the content of online communication. Evaluation on a sample of 2.5 million geo-tagged Twitter messages shows that social ties to infected, symptomatic people, as well as the intensity of recent co-location, sharply increase one's likelihood of contracting the illness in the near future. To our knowledge, this work is the first to model the interplay of social activity, human mobility, and the spread of infectious disease in a large real-world population. Furthermore, we provide the first quantifiable estimates of the characteristics of disease transmission on a large scale without active user participation, a step towards our ability to model and predict the emergence of global epidemics from day-to-day interpersonal interactions.

immunology, social media, tweet, (21 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country: North America > United States > New York (0.15)

Genre: Research Report (0.93)

Industry:

Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.94)

Using Complex Event Processing for Modeling Semantic Requests in Real-Time Social Media Monitoring

Riemer, Dominik (FZI Research Center for Information Technologies) | Stojanovic, Ljiljana (FZI Research Center for Information Technologies) | Stojanovic, Nenad (FZI Research Center for Information Technologies)

Social media analytics has been attracting considerable attention in both research and industry due to the increasing popularity of social media usage. As a subset, social media monitoring describes the process of continuous monitoring of a subject matter in social media. From our point of view, the key requirements for such systems are i) high throughput and real-time processing of incoming data, ii) a user-friendly way to define complex situations of interest that make use of formalized background knowledge and iii) capabilities to perform actions based on gained insights instead of a pure monitoring system. In this paper, we propose a system for (pro)active, real-time social media monitoring. Firstly, we describe the conceptual architecture of our system and necessary pre-processing steps. Secondly, we introduce our concept of semantic requests, which can extend event pattern definitions with background knowledge. Finally, we show the usefulness of this system in two different domains: real-time political opinion tracking and proactive establishment of relationships with consumers in order to perform a new form of real-time marketing. The main advantage of our approach is a simplified, expressive way to formulate event patterns in social media applications.

semantic request, social media, text processing, (20 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States (0.46)
Europe > Germany (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Facebook and Privacy: The Balancing Act of Personality, Gender, and Relationship Currency

Quercia, Daniele (University of Cambridge) | Casas, Diego Las (Universidade Federal de Minas Gerais) | Pesce, Joao Paulo (Universidade Federal de Minas Gerais) | Stillwell, David (University of Cambridge) | Kosinski, Michal (University of Cambridge) | Almeida, Virgilio (Universidade Federal de Minas Gerais) | Crowcroft, Jon (University of Cambridge)

Social media profiles are telling examples of the everyday need for disclosure and concealment. The balance between concealment and disclosure varies across individuals, and personality traits might partly explain this variability. Experimental findings on the relationship between information disclosure and personality have so far been inconsistent. We thus study this relationship anew with 1,313 Facebook users in the United States using two personality tests: the big five personality test and the self-monitoring test. We model the process of information disclosure in a principled way using Item Response Theory and correlate the resulting user disclosure scores with personality traits. We find a correlation with the trait of Openness and observe gender effects, in that men and women share equal amounts of private information, but men tend to make it more publicly available, well beyond their social circles. Interestingly, geographic (e.g., residence, hometown) and work-related information is used as relationship currency, in that it is selectively shared with social contacts and is rarely shared with the Facebook community at large.

artificial intelligence, information, social media, (19 more...)

Sixth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States (0.35)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Services (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)