 Learning Graphical Models


What's in Your Tweets? I Know Who You Supported in the UK 2010 General Election

AAAI Conferences

Social media such as Twitter have become an important means of monitoring public opinion on political issues. As a case study, we collected the main stream of Twitter messages related to the 2010 UK general election during the election period. We analyse the characteristics of the three main parties in the election. We also propose a simple and practical algorithm that identifies the political leaning of users from the number of their Twitter messages that appear to relate to each political party. The experimental results showed that the best-performing classification method -- which uses the number of Twitter messages referring to a particular political party -- achieved about 86% classification accuracy without any training phase.
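
The counting idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the keyword lists, the substring matching rule, and the function names `count_party_mentions` and `predict_leaning` are all assumptions, since the abstract does not say how a message is matched to a party.

```python
from collections import Counter

# Hypothetical keyword lists (assumption): the abstract does not specify
# how a tweet is judged to refer to a party.
PARTY_KEYWORDS = {
    "Conservative": ("conservative", "tory", "cameron"),
    "Labour": ("labour", "brown"),
    "LibDem": ("libdem", "lib dem", "clegg"),
}


def count_party_mentions(tweets):
    """Count, per party, how many tweets mention at least one of its keywords."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for party, keywords in PARTY_KEYWORDS.items():
            # Naive substring match; a real system would need proper tokenisation.
            if any(keyword in text for keyword in keywords):
                counts[party] += 1
    return counts


def predict_leaning(tweets):
    """Return the most-mentioned party, or None if no party is mentioned.

    No training phase is involved: the label is a plain argmax over counts.
    Ties go to the party that was counted first.
    """
    counts = count_party_mentions(tweets)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    user_tweets = [
        "Clegg did well in the debate",
        "Voting libdem tomorrow",
        "Cameron spoke today",
    ]
    print(predict_leaning(user_tweets))
```

A rule of this kind needs no labelled data, which is consistent with the "without any training phase" claim, but its accuracy depends entirely on how well the keyword lists are chosen.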


Catching the Long-Tail: Extracting Local News Events from Twitter

AAAI Conferences

Twitter, used in 200 countries with over 250 million tweets a day, is a rich source of local news from around the world. Many events of local importance are first reported on Twitter, including many that never reach news channels. Further, there are often only a few tweets reporting each such event, in contrast with the larger volumes that follow events of wider significance. Even though such events may be primarily of local importance, they can also be of critical interest to some specific but possibly far flung entities: For example, a fire in a supplier’s factory half-way around the world may be of interest even from afar. In this paper we describe how this ‘long tail’ of events can be detected in spite of their sparsity. We then extract and correlate information from multiple tweets describing the same event. Our generic architecture for converting a tweet-stream into event-objects uses locality sensitive hashing, classification, boosting, information extraction and clustering. Our results, based on millions of tweets monitored over many months, appear to validate our approach and architecture: We achieved success-rates in the 80% range for event detection and 76% on event-correlation; we also reduced tweet-comparisons by 80% using LSH.
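
To illustrate how locality sensitive hashing can cut down pairwise tweet comparisons, here is a small sketch using MinHash with banding. The abstract does not say which LSH family the authors use, so MinHash, the word-shingle size, the number of hash functions, the band count, and all function names here are assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (the whole text if it is shorter)."""
    tokens = text.lower().split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def minhash_signature(items, num_hashes=32):
    """MinHash signature: for each seeded hash function, keep the minimum hash value."""
    signature = []
    for seed in range(num_hashes):
        signature.append(
            min(
                int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
                for item in items
            )
        )
    return signature


def candidate_pairs(texts, num_hashes=32, bands=8):
    """Return index pairs whose signatures agree on at least one full band.

    Only these candidate pairs need an exact similarity comparison;
    all other pairs are skipped.
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for index, text in enumerate(texts):
        signature = minhash_signature(shingles(text), num_hashes)
        for band in range(bands):
            key = (band, tuple(signature[band * rows:(band + 1) * rows]))
            buckets[key].add(index)
    pairs = set()
    for members in buckets.values():
        for i, j in combinations(sorted(members), 2):
            pairs.add((i, j))
    return pairs


if __name__ == "__main__":
    tweets = [
        "fire reported at the factory on main street",
        "fire reported at the factory on main street",
        "heavy snowfall closes mountain pass this morning",
    ]
    print(candidate_pairs(tweets))
```

With more bands (fewer rows per band) the scheme admits less similar pairs as candidates; with fewer bands it prunes more aggressively. The reduction in comparisons reported in the abstract would depend on such settings and on the data.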


Facebook and Privacy: The Balancing Act of Personality, Gender, and Relationship Currency

AAAI Conferences

Social media profiles are telling examples of the everyday need for disclosure and concealment. The balance between concealment and disclosure varies across individuals, and personality traits might partly explain this variability. Experimental findings on the relationship between information disclosure and personality have so far been inconsistent. We thus study this relationship anew with 1,313 Facebook users in the United States using two personality tests: the big five personality test and the self-monitoring test. We model the process of information disclosure in a principled way using Item Response Theory and correlate the resulting user disclosure scores with personality traits. We find a correlation with the trait of Openness and observe gender effects, in that men and women share equal amounts of private information, but men tend to make it more publicly available, well beyond their social circles. Interestingly, geographic (e.g., residence, hometown) and work-related information is used as relationship currency, in that it is selectively shared with social contacts and is rarely shared with the Facebook community at large.


Exploring Social-Historical Ties on Location-Based Social Networks

AAAI Conferences

Location-based social networks (LBSNs) have become a popular form of social media in recent years. They provide location-related services that allow users to "check in" at geographical locations and share such experiences with their friends. Millions of "check-in" records in LBSNs contain rich information about social and geographical context and provide a unique opportunity for researchers to study users' social behavior from a spatial-temporal aspect, which in turn enables a variety of services including place advertisement, traffic forecasting, and disaster relief. In this paper, we propose a social-historical model to explore users' check-in behavior on LBSNs. Our model integrates the social and historical effects and assesses the role of social correlation in users' check-in behavior. In particular, our model captures two properties of a user's check-in history, a power-law distribution and a short-term effect, and helps in explaining users' check-in behavior. The experimental results on a real-world LBSN demonstrate that our approach properly models users' check-ins and show how social and historical ties can help location prediction.


Comparing SVM and Naive Bayes classifiers for text categorization with Wikitology as knowledge enrichment

arXiv.org Artificial Intelligence

The activity of labeling documents according to their content is known as text categorization. Many experiments have been carried out to enhance text categorization by adding background knowledge to the document using knowledge repositories like WordNet, Open Directory Project (ODP), Wikipedia and Wikitology. In our previous work, we carried out intensive experiments by extracting knowledge from Wikitology and evaluating the experiment on Support Vector Machine with 10-fold cross-validation. The results clearly indicate Wikitology is far better than other knowledge bases. In this paper we compare Support Vector Machine (SVM) and Naïve Bayes (NB) classifiers under text enrichment through Wikitology. We validated the results with 10-fold cross-validation and show that NB gives an improvement of +28.78%, whereas SVM gives an improvement of +6.36%, when compared with baseline results. The Naïve Bayes classifier is the better choice when enrichment through an external knowledge base is used.


What Cannot be Learned with Bethe Approximations

arXiv.org Machine Learning

We address the problem of learning the parameters in graphical models when inference is intractable. A common strategy in this case is to replace the partition function with its Bethe approximation. We show that there exists a regime of empirical marginals where such Bethe learning will fail. By failure we mean that the empirical marginals cannot be recovered from the approximated maximum likelihood parameters (i.e., moment matching is not achieved). We provide several conditions on empirical marginals that yield outer and inner bounds on the set of Bethe learnable marginals. An interesting implication of our results is that there exists a large class of marginals that cannot be obtained as stable fixed points of belief propagation. Taken together our results provide a novel approach to analyzing learning with Bethe approximations and highlight when it can be expected to work or fail.


Bregman divergence as general framework to estimate unnormalized statistical models

arXiv.org Machine Learning

We show that the Bregman divergence provides a rich framework to estimate unnormalized statistical models for continuous or discrete random variables, that is, models which do not integrate or sum to one, respectively. We prove that recent estimation methods such as noise-contrastive estimation, ratio matching, and score matching belong to the proposed framework, and explain their interconnection based on supervised learning. Further, we discuss the role of boosting in unsupervised learning.
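
For reference, the standard (separable) Bregman divergence between two non-negative functions $f$ and $g$, generated by a strictly convex, differentiable function $\Phi$, can be written as below. This is the textbook definition, given here as an assumption about the form meant in the abstract; the exact formulation and notation used in the paper may differ, and the symbols $\Phi$, $f$, $g$, and $\mu$ are chosen here for illustration.

```latex
d_\Phi(f, g) = \int \Big[ \Phi\big(f(u)\big) - \Phi\big(g(u)\big)
    - \Phi'\big(g(u)\big)\,\big(f(u) - g(u)\big) \Big] \, \mathrm{d}\mu(u),
\qquad d_\Phi(f, g) \ge 0, \quad d_\Phi(f, g) = 0 \iff f = g \ \ (\mu\text{-almost everywhere}).
```

Because the definition never requires $f$ or $g$ to integrate (or, for a counting measure $\mu$, sum) to one, it can in principle compare unnormalized models directly, which is the property the abstract relies on.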


Kernel-based Conditional Independence Test and Application in Causal Discovery

arXiv.org Machine Learning

Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties.


Hierarchical Maximum Margin Learning for Multi-Class Classification

arXiv.org Machine Learning

With myriad classes, designing accurate and efficient classifiers becomes very challenging for multi-class classification. Recent research has shown that class structure learning can greatly facilitate multi-class learning. In this paper, we propose a novel method to learn the class structure for multi-class classification problems. The class structure is assumed to be a binary hierarchical tree. To learn such a tree, we propose a maximum separating margin method to determine the child nodes of any internal node. The proposed method ensures that the two class groups represented by any two sibling nodes are maximally separable. In the experiments, we compare the accuracy and efficiency of the proposed method with those of other multi-class classification methods on real-world large-scale problems. The results show that the proposed method outperforms benchmark methods in terms of accuracy for most datasets and performs comparably with other class structure learning methods in terms of efficiency for all datasets.


Sparse matrix-variate Gaussian process blockmodels for network modeling

arXiv.org Machine Learning

We face network data from various sources, such as protein interactions and online social networks. A critical problem is to model network interactions and identify latent groups of network nodes. This problem is challenging for many reasons. For example, the network nodes are interdependent rather than independent of each other, and the data are known to be very noisy (e.g., missing edges). To address these challenges, we propose a new relational model for network data, the Sparse Matrix-variate Gaussian process Blockmodel (SMGB). Our model generalizes popular bilinear generative models and captures nonlinear network interactions using a matrix-variate Gaussian process with latent membership variables. We also assign sparse prior distributions to the latent membership variables to learn sparse group assignments for individual network nodes. To estimate the latent variables from data, we develop an efficient variational expectation maximization method. We compared our approach with several state-of-the-art network models on both synthetic and real-world network datasets. Experimental results demonstrate that SMGBs outperform the alternative approaches in terms of discovering latent classes or predicting unknown interactions.