Where Online Friends Meet: Social Communities in Location-Based Networks

AAAI Conferences

Recent research suggests that, as in offline scenarios, spatial proximity increases the likelihood that two individuals establish an online social connection, and geographic closeness could therefore influence the formation of online communities. In this work we present a study of communities in two online social networks with location-sharing features and analyze their social and spatial properties. We study the places users visit to understand whether communities revolve around places or whether they exist independently. Our results suggest that community structure in social networks may arise from both social and spatial factors, so that exploiting information about the places where people go could benefit the definition of new community detection methods and community evolution models.


Learning the Nature of Information in Social Networks

AAAI Conferences

We postulate that the nature of information items plays a vital role in the observed spread of these items in a social network. We capture this intuition by proposing a model that assigns to every information item two parameters: endogeneity and exogeneity. The endogeneity of the item quantifies its tendency to spread primarily through the connections between nodes; the exogeneity quantifies its tendency to be acquired by the nodes, independently of the underlying network. We also extend this item-based model to take into account the openness of each node to new information. We quantify openness by introducing the receptivity of a node. Given a social network and data related to the ordering of adoption of information items by nodes, we develop a maximum-likelihood framework for estimating endogeneity, exogeneity and receptivity parameters. We apply our methodology to synthetic and real data and demonstrate its efficacy as a data-analytic tool.


Unsupervised Real-Time Company Name Disambiguation in Twitter

AAAI Conferences

This paper presents a new approach to disambiguating company names in the Twitter social network. We focus on lightening the comparison of company profiles with tweets in order to obtain a competitive real-time system. To this end, we use only the home page of each company as the information source to create a unique profile. We then compute the similarity of a tweet to a profile by comparing the content of the tweet with the profile. Neither step uses any other external information source, and the whole process is unsupervised. We tested our application on the WePS-3 CLEF ORM test corpus, obtaining encouraging results.
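The tweet-to-profile comparison described above can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors use, so cosine similarity over raw word counts is an assumption here, and the names `tokenize`, `cosine_similarity`, `tweet_matches_company`, and the threshold value are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using raw term counts."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def tweet_matches_company(tweet, homepage_text, threshold=0.3):
    """Decide whether a tweet refers to the company.

    The profile is built only from the company's home page text, as in the
    abstract. The threshold is an illustrative assumption, not a value
    taken from the paper.
    """
    profile_tokens = tokenize(homepage_text)
    return cosine_similarity(tokenize(tweet), profile_tokens) >= threshold
```

Because the profile comes from a single page and no training data is involved, the comparison stays cheap enough to run per tweet; a real system would likely add stop-word removal and term weighting.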


So.cl: An Interest Network for Informal Learning

AAAI Conferences

Web search engines emerged prior to the dominance of social media. What if we imagined search as integrating with social media from the ground up? So.cl is a web application that combines web browsing, search, and social networking for the purposes of sharing and learning around topics of interest. In this paper, we present the results of a deployment study examining existing learning practices around search and social networking for students, and how these practices shifted when participants adopted So.cl. We found prior to using So.cl that students already heavily employed search tools and social media for learning. With the use of So.cl, we found that users engaged in lightweight, fun social sharing and learning for informal, personal topics, but not for more heavyweight collaboration around school or work. The public nature of So.cl encouraged users to post search results as much for self-expression as for searching, enabling serendipitous discovery around interests.


Homophily and Latent Attribute Inference: Inferring Latent Attributes of Twitter Users from Neighbors

AAAI Conferences

In this paper, we extend existing work on latent attribute inference by leveraging the principle of homophily: we evaluate the inference accuracy gained by augmenting the user features with features derived from the Twitter profiles and postings of her friends. We consider three attributes which have varying degrees of assortativity: gender, age, and political affiliation. Our approach yields a significant and robust increase in accuracy for both age and political affiliation, indicating that our approach boosts performance for attributes with moderate to high assortativity. Furthermore, different neighborhood subsets yielded optimal performance for different attributes, suggesting that different subsamples of the user's neighborhood characterize different aspects of the user herself. Finally, inferences using only the features of a user's neighbors outperformed those based on the user's features alone. This suggests that the neighborhood context alone carries substantial information about the user.
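The idea of augmenting a user's features with features derived from her neighbors can be sketched as follows. The abstract does not state how neighbor features are aggregated, so averaging them is an assumption, and the function names and data layout (`features`, `friends`) are hypothetical.

```python
def mean_vector(vectors, dim):
    """Element-wise mean of equal-length vectors; zeros if there are none."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def augment_with_neighbors(user, features, friends):
    """Concatenate a user's own features with the mean of her friends' features.

    features: dict mapping user id -> list of floats (all the same length)
    friends:  dict mapping user id -> list of friend ids
    Averaging is an illustrative choice, not the paper's stated method.
    """
    own = features[user]
    neighbor_vectors = [
        features[f] for f in friends.get(user, []) if f in features
    ]
    return own + mean_vector(neighbor_vectors, len(own))


def neighbors_only(user, features, friends):
    """Feature vector built from the neighborhood alone, ignoring the user."""
    dim = len(features[user])
    neighbor_vectors = [
        features[f] for f in friends.get(user, []) if f in features
    ]
    return mean_vector(neighbor_vectors, dim)
```

Either vector could then be passed to any standard classifier; restricting `friends` to a subset of the neighborhood corresponds to the different neighborhood subsets the abstract mentions.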


You Too?! Mixed-Initiative LDA Story Matching to Help Teens in Distress

AAAI Conferences

Adolescent cyber-bullying on social networks is a phenomenon that has received widespread attention. Recent work by sociologists has examined this phenomenon under the larger context of teenage drama and its manifestations on social networks. Tackling cyber-bullying involves two key components – automatic detection of possible cases, and interaction strategies that encourage reflection and emotional support. Key is showing distressed teenagers that they are not alone in their plight. Conventional topic spotting and document classification into labels like "dating" or "sports" are not enough to effectively match stories for this task. In this work, we examine a corpus of 5500 stories from distressed teenagers from a major youth social network. We combine Latent Dirichlet Allocation and human interpretation of its output using principles from sociolinguistics to extract high-level themes in the stories and use them to match new stories to similar ones. A user evaluation of the story matching shows that theme-based retrieval does a better job of finding relevant and effective stories for this application than conventional approaches.


Finding Influential Authors in Brand-Page Communities

AAAI Conferences

Enterprises are increasingly using social media forums to engage with their customers online, a phenomenon known as Social Customer Relation Management (Social CRM). In this context, it is important for an enterprise to identify “influential authors” and engage with them on a priority basis. We present a study towards finding influential authors on Twitter forums where an implicit network based on user interactions is created and analyzed. Furthermore, author profile features and user interaction features are combined in a decision tree classification model for finding influential authors. A novel objective evaluation criterion is used for evaluating various features and modeling techniques. We compare our methods with other approaches that use either only the formal connections or only the author profile features and show a significant improvement in the classification accuracy over these baselines as well as over using Klout score.


Automatic Versus Human Navigation in Information Networks

AAAI Conferences

People regularly face tasks that can be understood as navigation in information networks, where the goal is to find a path between two given nodes. In many such situations, the navigator only gets local access to the node currently under inspection and its immediate neighbors. This lack of global information about the network notwithstanding, humans tend to be good at finding short paths, despite the fact that real-world networks are typically very large. One potential reason for this could be that humans possess vast amounts of background knowledge about the world, which they leverage to make good guesses about possible solutions. In this paper we ask the question: Are human-like high-level reasoning skills really necessary for finding short paths? To answer this question, we design a number of navigation agents without such skills, which use only simple numerical features. We evaluate the agents on the task of navigating Wikipedia, a domain for which we also possess large-scale human navigation data. We observe that the agents find shorter paths than humans on average and therefore conclude that, perhaps surprisingly, no sophisticated background knowledge or high-level reasoning is required for navigating the complex Wikipedia network.
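A navigation agent of the kind described above, one that sees only the current node and its immediate neighbors, can be sketched as a greedy walk. The abstract says only that the agents use "simple numerical features", so the `score` function, the graph representation, and the step limit below are hypothetical choices made for illustration.

```python
def greedy_navigate(graph, start, target, score, max_steps=50):
    """Walk from start toward target using only local information.

    graph: dict mapping node -> list of neighbor nodes
    score: function (candidate, target) -> number; higher means the
           candidate looks more promising (a stand-in for the paper's
           unspecified numerical features)
    Returns the path as a list of nodes, or None if the agent gets stuck
    or runs out of steps.
    """
    path = [start]
    visited = {start}
    current = start
    for _ in range(max_steps):
        if current == target:
            return path
        neighbors = graph.get(current, [])
        if target in neighbors:
            # The target is directly visible from the current node.
            return path + [target]
        candidates = [n for n in neighbors if n not in visited]
        if not candidates:
            return None  # dead end: every neighbor was already visited
        # Pick the unvisited neighbor with the best local score.
        current = max(candidates, key=lambda n: score(n, target))
        visited.add(current)
        path.append(current)
    return path if current == target else None
```

The agent never consults the full graph: at each step it reads only `graph[current]`, which mirrors the local-access constraint in the abstract. On Wikipedia, `score` might be something like textual similarity between a candidate article and the target article, though that specific choice is an assumption here.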


Towards Analyzing Micro-Blogs for Detection and Classification of Real-Time Intentions

AAAI Conferences

Micro-blog forums, such as Twitter, constitute a powerful medium today that people use to express their thoughts and intentions on a daily, and in many cases, hourly, basis. Extracting ‘Real-Time Intention’ (RTI) of a user from such short text updates is a huge opportunity towards web personalization and social networking around dynamic user context. In this paper, we explore the novel problem of detecting and classifying RTIs from micro-blogs. We find that employing a heuristic-based ensemble approach on a reduced dimension of the feature space, based on a wide spectrum of linguistic and statistical features of RTI expressions, achieves significant improvement in detecting RTIs compared to word-level features used in many social media classification tasks today. Our solution approach takes into account various salient characteristics of micro-blogs towards such classification – high dimensionality, sparseness of data, limited context, grammatical incorrectness, etc.


Emotional Divergence Influences Information Spreading in Twitter

AAAI Conferences

We analyze data about the micro-blogging site Twitter using sentiment extraction techniques. From an information perspective, Twitter users are involved mostly in two processes: information creation and subsequent distribution (tweeting), and pure information distribution (retweeting), with a pronounced preference for the first. However, a rather substantial fraction of tweets are retweeted. Here, we address the role of the sentiment expressed in tweets for their potential aftermath. We find that although the overall sentiment (polarity) does not influence the probability of a tweet being retweeted, a new measure called "emotional divergence" does have an impact. In general, tweets with high emotional diversity have a better chance of being retweeted, hence influencing the distribution of information.