AITopics

Platforms such as Twitter have provided researchers with ample opportunities to study social phenomena analytically. There are, however, significant computational challenges due to the enormous rate at which new information is produced: researchers are therefore often forced to analyze a judiciously selected “sample” of the data. Like other social media phenomena, information diffusion is a social process: it is affected by user context and topic, in addition to the graph topology. This paper studies the impact of different attribute-based and topology-based sampling strategies on the discovery of an important social media phenomenon, information diffusion. We examine several widely adopted sampling methods that select nodes based on attribute (random, location, and activity) and topology (forest fire), and we study the impact of attribute-based seed selection on topology-based sampling. We then develop a series of metrics for evaluating the quality of the sample, based on user activity (e.g. volume, number of seeds), topological characteristics (e.g. reach, spread), and temporal characteristics (e.g. rate). We additionally correlate the diffusion volume metric with two external variables: search and news trends. Our experiments reveal that for small sample sizes (30%), a sample that incorporates both topology and user context (e.g. location, activity) can improve on naive methods by a significant margin of ~15-20%.
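To make the idea of topology-based sampling with attribute-based seeds concrete, here is a minimal sketch of forest fire sampling in Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `forest_fire_sample`, the adjacency-dict graph format, and the default forward-burning probability of 0.7 are all hypothetical choices, and the abstract does not specify any of them.

```python
import random
from collections import deque


def forest_fire_sample(graph, target_size, seeds=None, p_forward=0.7, rng=None):
    """Return a set of sampled nodes from an adjacency dict.

    graph:       dict mapping node -> list of neighbour nodes
    target_size: number of nodes to collect
    seeds:       optional starting nodes, e.g. picked by location or
                 activity (hypothetical attribute-based seed selection)
    p_forward:   forward-burning probability (illustrative default)
    """
    rng = rng or random.Random()
    target_size = min(target_size, len(graph))
    pending_seeds = list(seeds or [])
    all_nodes = list(graph)
    sampled = set()

    while len(sampled) < target_size:
        # Start a new fire from a supplied seed, else from a random unsampled node.
        if pending_seeds:
            start = pending_seeds.pop(0)
            if start in sampled or start not in graph:
                continue
        else:
            start = rng.choice([n for n in all_nodes if n not in sampled])
        sampled.add(start)
        frontier = deque([start])

        while frontier and len(sampled) < target_size:
            node = frontier.popleft()
            unburned = [n for n in graph[node] if n not in sampled]
            rng.shuffle(unburned)
            # Draw how many neighbours to burn: geometric, mean p / (1 - p).
            burn = 0
            while rng.random() < p_forward:
                burn += 1
            for neighbour in unburned[:burn]:
                if len(sampled) >= target_size:
                    break
                sampled.add(neighbour)
                frontier.append(neighbour)
    return sampled
```

Passing `seeds` chosen by a user attribute and leaving the rest to the burning process is one way to combine user context with topology, which is the kind of hybrid sample the abstract reports as improving on naive methods.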

diffusion, information, twitter, (16 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe (0.04)
Asia > Afghanistan (0.04)
(7 more...)

Genre: Research Report > Experimental Study (0.66)

Industry:

Information Technology > Services (1.00)
Health & Medicine (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Lanagan, James (Clarity: Centre For Sensor Web Technologies) | Ferguson, Paul (Clarity: Centre For Sensor Web Technologies) | O'Hare, Neil (Clarity: Centre For Sensor Web Technologies) | Smeaton, Alan F

Coping With Noise in a Real-World Weblog Crawler and Retrieval System

In this paper we examine the effects of noise when creating a real-world weblog corpus for information retrieval. We focus on the DiffPost (Lee et al. 2008) approach to noise removal from blog pages, examining the difficulties encountered when crawling the blogosphere during the creation of a real-world corpus of blog pages. We introduce and evaluate a number of enhancements to the original DiffPost approach in order to increase the robustness of the algorithm. We then extend DiffPost by looking at the anchor-text to text ratio, and discover that the time interval between crawls is more important to the successful application of noise-removal algorithms within the blog context than any additional improvements to the removal algorithm itself.
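The abstract does not spell out how DiffPost works, so the following is only a rough sketch of the general idea of difference-based noise removal, assuming that content shared between two pages of the same blog (navigation, blogroll, footer) is template noise. The function name `remove_shared_lines` and the line-level comparison are hypothetical simplifications, not the algorithm as published or as enhanced in this paper.

```python
def remove_shared_lines(page_lines, reference_lines):
    """Keep only the lines of page_lines that do not occur in reference_lines.

    Both arguments are lists of strings, e.g. the lines of two pages
    crawled from the same blog. Lines present in both are assumed to be
    template noise; blank lines are dropped as well.
    """
    shared = {line.strip() for line in reference_lines if line.strip()}
    return [
        line
        for line in page_lines
        if line.strip() and line.strip() not in shared
    ]


# Hypothetical pages that share a header and a blogroll.
page_a = ["<div>Blog title</div>", "<p>Post about sampling</p>", "<div>Blogroll</div>"]
page_b = ["<div>Blog title</div>", "<p>Post about tagging</p>", "<div>Blogroll</div>"]
print(remove_shared_lines(page_a, page_b))
```

A comparison like this only works when the two pages still share the same surrounding template, which gives some intuition for why the choice of pages to compare, and hence the crawl schedule, can matter.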

diffpost algorithm, noise, webpage, (13 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > New York (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(3 more...)

Genre: Research Report (0.69)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Communications > Social Media (0.49)
Information Technology > Data Science > Data Mining (0.47)

What’s Worthy of Comment? Content and Comment Volume in Political Blogs

Yano, Tae (Carnegie Mellon University) | Smith, Noah A. (Carnegie Mellon University)

What makes a blog post noteworthy? One measure of the popularity or breadth of interest of a blog post is the extent to which readers of the blog are inspired to leave comments on the post. In this paper, we study the relationship between the text contents of a blog post and the volume of response it will receive from blog readers. Modeling this relationship has the potential to reveal the interests of a blog's readership community to its authors, readers, advertisers, and scientists.

In research on blog data, comments are often ignored, and it is easy to see why: comments are very noisy, full of nonstandard grammar and spelling, usually unedited, often cryptic and uninformative, at least to those outside the blog's community. A few studies have focused on information in comments. Mishne and Glance (2006) showed the value of comments in characterizing the social repercussions of a post, including popularity and controversy. Their large-scale user study correlated popularity and comment activity.

artificial intelligence, machine learning, natural language, (19 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

Asia > Middle East > Iraq (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(6 more...)

Industry:

Government > Voting & Elections (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
(2 more...)

Why do Users Tag? Detecting Users’ Motivation for Tagging in Social Tagging Systems

Strohmaier, Markus (Graz University of Technology and Know-Center) | Körner, Christian (Graz University of Technology) | Kern, Roman (Know-Center)

While recent progress has been achieved in understanding the structure and dynamics of social tagging systems, we know little about the underlying user motivations for tagging, and how they influence resulting folksonomies and tags. This paper addresses three issues related to this question: 1.) What motivates users to tag resources, and in what ways is user motivation amenable to quantitative analysis? 2.) Does users' motivation for tagging vary within and across social tagging systems, and if so how? and 3.) How does variability in user motivation influence resulting tags and folksonomies? In this paper, we present measures to detect whether a tagger is primarily motivated by categorizing or describing resources, and apply the measures to datasets from 8 different tagging systems. Our results show that a) users' motivation for tagging varies not only across, but also within tagging systems, and that b) tag agreement among users who are motivated by categorizing resources is significantly lower than among users who are motivated by describing resources. Our findings are relevant for (i) the development of tag recommenders, (ii) the analysis of tag semantics and (iii) the design of search algorithms for social tagging systems.

artificial intelligence, motivation, social media, (18 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

Europe > Austria > Styria > Graz (0.06)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > North Carolina > Wake County > Raleigh (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Raaijmakers, Stephan (TNO ICT, Delft, The Netherlands) | Kraaij, Wessel (TNO ICT, Delft, The Netherlands)

Classifier Calibration for Multi-Domain Sentiment Classification

Textual sentiment classifiers classify texts into a fixed number of affective classes, such as positive, negative or neutral sentiment, or subjective versus objective information. It has been observed that sentiment classifiers suffer from a lack of generalization capability: a classifier trained on a certain domain generally performs worse on data from another domain. This phenomenon has been attributed to domain-specific affective vocabulary. In this paper, we propose a voting-based thresholding approach, which calibrates a number of existing single-domain classifiers with respect to sentiment data from a new domain. The approach presupposes only a small amount of annotated data from the new domain. We evaluate three criteria for estimating thresholds, and discuss the ramifications of these criteria for the trade-off between classifier performance and manual annotation effort.
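As a rough illustration of what voting-based thresholding could look like, the sketch below lets several existing classifiers vote on each text and then picks the vote count that works best on a small annotated sample from the new domain. Everything here is assumed for illustration: the function names, the binary positive/negative setting, and the use of accuracy as the selection criterion are hypothetical, and the abstract does not say which three criteria the authors evaluate.

```python
def count_votes(classifiers, text):
    """Number of classifiers that label the text as positive."""
    return sum(1 for classify in classifiers if classify(text))


def calibrate_threshold(classifiers, labelled_texts):
    """Pick the minimum vote count that maximises accuracy on new-domain data.

    classifiers:    list of callables, text -> bool (True means positive)
    labelled_texts: small list of (text, is_positive) pairs from the new domain
    Returns (threshold, accuracy); ties go to the lowest threshold.
    """
    if not labelled_texts:
        raise ValueError("need at least one annotated example")
    best_threshold, best_accuracy = 1, -1.0
    for threshold in range(1, len(classifiers) + 1):
        correct = sum(
            (count_votes(classifiers, text) >= threshold) == label
            for text, label in labelled_texts
        )
        accuracy = correct / len(labelled_texts)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


# Toy keyword "classifiers" standing in for trained single-domain models.
toy_classifiers = [
    lambda t: "good" in t,
    lambda t: "great" in t,
    lambda t: "not" not in t,
]
toy_data = [
    ("good great film", True),
    ("good but not great", False),
    ("bad plot", False),
    ("great fun", True),
]
print(calibrate_threshold(toy_classifiers, toy_data))
```

The size of `labelled_texts` is the manual annotation effort in this sketch, so the trade-off mentioned in the abstract shows up as how stable the chosen threshold is when that list is made smaller.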

artificial intelligence, classifier, natural language, (16 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Socio-Legal Analysis of Criminal Sentences: A Preliminary Study

Giura, Giuseppe (University of Catania) | Giuffrida, Giovanni (University of Catania) | Pennisi, Carlo (University of Catania) | Zarba, Calogero (Neodata Intelligence)

This paper discusses research that analyzes criminal sentences from trials on organized crime activity in Sicily, pronounced from 2000 through 2006. In Italy, collecting large datasets of criminal sentences is severely constrained for various reasons, such as the difficulty of data collection at the courthouses, the unavailability of data in digital format, and the classification criteria used in the public archives. Thus, in general, judicial statistics suffer from a lack of reliability and informativeness. The objective of this research is to analyze the text of criminal sentences in a revisable and verifiable way, so that information is extracted on the trial leading to the sentence, the socio-economic environment in which the relevant events occurred, and the differences between the various districts conducting the trials. The purpose is to develop a tool for automated analysis of the text of the sentences that is generalizable to other areas of jurisprudence, and, outside of jurisprudence, to other temporal and geographical contexts. The 726 criminal sentences that have been converted into text files were pronounced at all judicial levels in the four Sicilian districts for mafia-related crimes. This research is relevant because, for the first time in Italy, we aim to empirically describe the juridical response to the phenomenon of organized crime by using a large and extendable database of criminal sentences that can be analyzed with data mining techniques, rather than deriving general conclusions from a small, focused set of sentences.

criminal sentence, data mining, natural language, (17 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

Europe > Italy > Sicily (0.25)
North America > United States > New York (0.04)

Industry:

Law > Criminal Law (0.66)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.55)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.52)

Ciampaglia, Giovanni Luca (Università della Svizzera Italiana) | Vancheri, Alberto (Università della Svizzera Italiana)

Empirical Analysis of User Participation in Online Communities: the Case of Wikipedia

We study the distribution of the activity period of users in five of the largest localized versions of the free, online encyclopedia Wikipedia. We find it to be consistent with a mixture of two truncated log-normal distributions. Using this model, the temporal evolution of these systems can be analyzed, showing that the statistical description is consistent over time.
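For readers unfamiliar with the model family, the sketch below evaluates the density of a mixture of two truncated log-normal distributions using only the Python standard library. It is a minimal illustration: the function names, the shared truncation interval, and every parameter value are hypothetical and are not estimates from the paper.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal distribution with log-mean mu and log-sd sigma."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def truncated_lognormal_pdf(x, mu, sigma, lower, upper):
    """Log-normal density restricted to [lower, upper] and renormalised."""
    if x <= 0 or x < lower or x > upper:
        return 0.0
    mass = lognormal_cdf(upper, mu, sigma) - lognormal_cdf(lower, mu, sigma)
    density = math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)) / (
        x * sigma * math.sqrt(2.0 * math.pi)
    )
    return density / mass


def mixture_pdf(x, weight, first, second, lower, upper):
    """Two-component mixture; first and second are (mu, sigma) pairs."""
    return weight * truncated_lognormal_pdf(x, *first, lower, upper) + (
        1.0 - weight
    ) * truncated_lognormal_pdf(x, *second, lower, upper)


# Hypothetical parameters: a short-lived and a long-lived user group,
# with activity periods (in arbitrary time units) truncated to [0.1, 200].
print(mixture_pdf(1.0, 0.6, (0.0, 0.5), (3.0, 0.5), 0.1, 200.0))
```

Because each component is renormalised over the truncation interval, the mixture integrates to one over that interval for any weight between 0 and 1.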

artificial intelligence, machine learning, social media, (17 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

Europe > Switzerland (0.05)
North America > United States > New York (0.04)

Genre: Research Report (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Biel, Joan-Isaac (Idiap Research Institute) | Gatica-Perez, Daniel (Idiap Research Institute)

Voices of Vlogging

Vlogs have rapidly evolved from the ‘chat from your bedroom’ format to a highly creative form of expression and communication. However, despite the high popularity of vlogging, automatic analysis of conversational vlogs has not been attempted in the literature. In this paper, we present a novel analysis of conversational vlogs based on the characterization of vloggers’ nonverbal behavior. We investigate the use of four nonverbal cues extracted automatically from the audio channel to measure the behavior of vloggers and explore the relation to their degree of popularity and that of their videos. Our study is validated on over 2200 videos and 150 hours of data, and shows that one nonverbal cue (speaking time) is correlated with levels of popularity with a medium-size effect.

artificial intelligence, social media, video, (17 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Technology:

Information Technology > Communications > Social Media (0.77)
Information Technology > Artificial Intelligence (0.68)

To Be a Star Is Not Only Metaphoric: From Popularity to Social Linkage

Stoica, Alina Mihaela (Orange Labs and LIAFA, University Paris 7) | Couronne, Thomas (Orange Labs) | Beuscart, Jean-Samuel (Orange Labs)

The emergence of online platforms that mix self-publishing activities and social networking offers new possibilities for building online reputation and visibility. In this paper we present a method to analyze online popularity that takes into consideration both the success of the published content and the social network topology. First, we adapt Kohonen self-organizing maps in order to cluster the users of online platforms depending on their audience and authority characteristics. Then, we perform a detailed analysis of the manner in which nodes are organized in the social network. Finally, we study the relationship between the local network structure around each node and the corresponding user’s popularity. We apply this method to the MySpace music social network. We observe that the most popular artists are centers of star-shaped social structures and that there exists a fraction of artists who are involved in community and social activity dynamics independently of their popularity. This method, based on a learning algorithm and on network analysis, appears to be a robust and intuitive technique for a rich description of online behavior.

artificial intelligence, machine learning, vertex, (20 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York > New York County > New York City (0.04)

Industry: Information Technology > Services (0.77)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Girju, Roxana (University of Illinois)

Toward Social Causality: An Analysis of Interpersonal Relationships in Online Blogs and Forums

In this paper we present encouraging preliminary results on the problem of social causality (causal reasoning used by intelligent agents in a social environment) in online social interactions based on a model of reciprocity. At every level, social relationships are guided by the shared understanding that most actions call for appropriate reactions, and that inappropriate reactions require management. Thus, we present an analysis of interpersonal relationships in English reciprocal contexts. Specifically, we rely here on a large and recently built database of 10,882 reciprocal relation instances in online media. The resource is analyzed along a set of novel and important dimensions: symmetry, affective value, gender, and intentionality of action, which are highly interconnected. At a larger level, we automatically generate chains of causal relations between verbs indicating interpersonal relationships. Statistics along these dimensions give insights into people's behavior, judgments, and thus their social interactions.

artificial intelligence, machine learning, natural language, (19 more...)

Fourth International AAAI Conference on Weblogs and Social Media

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New York (0.04)
North America > United States > Oregon (0.04)
(4 more...)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)