Information Technology
Resolving Over-Constrained Probabilistic Temporal Problems through Chance Constraint Relaxation
Yu, Peng (Massachusetts Institute of Technology) | Fang, Cheng (Massachusetts Institute of Technology) | Williams, Brian (Massachusetts Institute of Technology)
When scheduling tasks for field-deployable systems, our solutions must be robust to the uncertainty inherent in the real world. Although human intuition is trusted to balance reward and risk, humans perform poorly at risk assessment at the scale and complexity of real-world problems. In this paper, we present a decision aid system that helps human operators diagnose the source of risk and manage uncertainty in temporal problems. The core of the system is a conflict-directed relaxation algorithm, called Conflict-Directed Chance-constraint Relaxation (CDCR), which specializes in resolving over-constrained temporal problems with probabilistic durations and a chance constraint bounding the risk of failure. Given a temporal problem with uncertain durations, CDCR proposes execution strategies that operate at acceptable risk levels and pinpoints the source of risk. If no strategy can be found that meets the chance constraint, it helps humans repair the over-constrained problem by trading off the desirability of the solution against the acceptable risk level. The decision aid has been incorporated into a mission advisory system that assists oceanographers in scheduling activities during deep-sea expeditions, and has demonstrated its effectiveness in scenarios with realistic uncertainty.
Sub-Merge: Diving Down to the Attribute-Value Level in Statistical Schema Matching
Lim, Zhe (The University of Melbourne) | Rubinstein, Benjamin (The University of Melbourne)
Matching and merging data from conflicting sources is the bread and butter of data integration, which drives search verticals, e-commerce comparison sites and cyber intelligence. Schema matching lifts data integration - traditionally focused on well-structured data - to highly heterogeneous sources. While schema matching has enjoyed significant success in matching data attributes, inconsistencies can exist at a deeper level, making full integration difficult or impossible. We propose a more fine-grained approach that focuses on correspondences between the values of attributes across data sources. Since the semantics of attribute values derive from their use and co-occurrence, we argue for the suitability of canonical correlation analysis (CCA) and its variants. We demonstrate the superior statistical and computational performance of multiple sparse CCA compared to a suite of baseline algorithms on two datasets, which we are releasing to stimulate further research. Our crowd-annotated data covers both cases for which it is relatively easy for humans to supply ground truth and cases that are inherently difficult for human computation.
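As a rough illustration of the value-level correspondence idea described above, the sketch below pairs context features for attribute values from two sources with plain CCA from scikit-learn; the paper itself uses (multiple) sparse CCA, and the data and matching rule here are hypothetical.

    # Minimal sketch: aligning attribute values across two sources with CCA.
    # The paper uses (multiple) sparse CCA; plain CCA from scikit-learn is
    # shown here only to illustrate the idea. Data and names are hypothetical.
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)

    # Co-occurrence features for attribute values in two data sources:
    # rows are paired occurrences, columns are context features (e.g.
    # co-occurring tokens) projected to a common dimensionality.
    X = rng.random((50, 20))   # source A: 50 paired rows, 20 context features
    Y = rng.random((50, 20))   # source B: the same rows described differently

    cca = CCA(n_components=5)
    X_c, Y_c = cca.fit_transform(X, Y)

    # Values whose projections are close in the shared canonical space are
    # candidate correspondences between the two sources.
    def best_match(i, X_c, Y_c):
        dists = np.linalg.norm(Y_c - X_c[i], axis=1)
        return int(np.argmin(dists))

    print(best_match(0, X_c, Y_c))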
Using Frame Semantics for Knowledge Extraction from Twitter
Søgaard, Anders (University of Copenhagen) | Plank, Barbara (University of Copenhagen) | Alonso, Hector Martinez (University of Copenhagen)
Knowledge bases have the potential to advance artificial intelligence, but often suffer from recall problems, i.e., lack of knowledge of new entities and relations. By contrast, social media such as Twitter provide an abundance of data in a timely manner: information spreads at an incredible pace and is posted long before it makes it into more commonly used resources for knowledge extraction. In this paper we address the question of whether we can exploit social media to extract new facts, which may at first seem like finding needles in haystacks. We collect tweets about 60 entities in Freebase and compare four methods for extracting binary relation candidates, based on syntactic and semantic parsing and a simple mechanism for factuality scoring. The extracted facts are manually evaluated in terms of their correctness and relevance for search. We show that moving from bottom-up syntactic or semantic dependency parsing formalisms to top-down frame-semantic processing improves the robustness of knowledge extraction, producing more intelligible fact candidates of better quality. In order to evaluate the quality of frame-semantic parsing on Twitter intrinsically, we make a multiply frame-annotated dataset of tweets publicly available.
Semantic Lexicon Induction from Twitter with Pattern Relatedness and Flexible Term Length
Qadir, Ashequl (University of Utah) | Mendes, Pablo N. (IBM Research) | Gruhl, Daniel (IBM Research) | Lewis, Neal (IBM Research)
With the rise of social media, learning from informal text has become increasingly important. We present a novel semantic lexicon induction approach that is able to learn new vocabulary from social media. Our method is robust to the idiosyncrasies of informal and open-domain text corpora. Unlike previous work, it does not impose restrictions on the lexical features of candidate terms — e.g. by restricting entries to nouns or noun phrases — while still being able to accurately learn multiword phrases of variable length. Starting with a few seed terms for a semantic category, our method first explores the context around the seed terms in a corpus and identifies context patterns that are relevant to the category. These patterns are used to extract candidate terms, i.e. multiword segments that are further analyzed to ensure meaningful term boundary segmentation. We show that our approach is able to learn high-quality semantic lexicons from informally written social media text on Twitter, and can achieve accuracy as high as 92% among the top 100 learned category members.
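A toy sketch of the seed-and-pattern bootstrapping loop described above, with deliberately simplified pattern extraction; the paper's pattern-relatedness scoring and term-boundary analysis are not reproduced, and the corpus and seeds here are hypothetical.

    # Toy bootstrapping loop for semantic lexicon induction from seed terms.
    # Pattern extraction and scoring are deliberately simplified; the paper's
    # pattern-relatedness measure and boundary analysis are not reproduced.
    import re
    from collections import Counter

    corpus = [
        "i love my golden retriever so much",
        "just adopted a tabby cat today",
        "my golden retriever chased the ball",
        "walking the beagle in the park",
    ]
    seeds = {"golden retriever", "tabby cat"}

    def contexts(sentence, term):
        """Return simple left/right unigram context patterns around a term."""
        m = re.search(r"(\w+)?\s*" + re.escape(term) + r"\s*(\w+)?", sentence)
        if not m:
            return []
        pats = []
        if m.group(1):
            pats.append(("L", m.group(1)))
        if m.group(2):
            pats.append(("R", m.group(2)))
        return pats

    # 1) Collect context patterns around the seed terms.
    pattern_counts = Counter()
    for s in corpus:
        for t in seeds:
            pattern_counts.update(contexts(s, t))

    # 2) Use the patterns to propose candidate terms (here simply the bigram
    #    adjacent to a matched pattern word); a real system would then score
    #    candidates and check term boundaries before adding them.
    candidates = Counter()
    for s in corpus:
        tokens = s.split()
        for (side, word), _ in pattern_counts.most_common():
            if word in tokens:
                i = tokens.index(word)
                span = tokens[i + 1:i + 3] if side == "L" else tokens[max(0, i - 2):i]
                if span:
                    candidates[" ".join(span)] += 1

    print(candidates.most_common(5))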
Measuring Plan Diversity: Pathologies in Existing Approaches and A New Plan Distance Metric
Goldman, Robert P. (SIFT, LLC) | Kuter, Ugur (SIFT, LLC)
In this paper we present a plan-plan distance metric based on Kolmogorov (algorithmic) complexity. Generating diverse sets of plans is useful for tasks such as probing user preferences and reasoning about vulnerability to cyberattacks. Generating diverse plans, and comparing different diverse planning approaches, requires a domain-independent, theoretically motivated definition of the diversity distance between plans. Previously proposed diversity measures are not theoretically motivated and can provide inconsistent results on the same plans. We define the diversity of plans in terms of how surprising one plan is given another or, its inverse, the conditional information in one plan given another. Kolmogorov complexity provides a domain-independent theory of conditional information. While Kolmogorov complexity is not computable, a related metric, Normalized Compression Distance (NCD), provides a well-behaved approximation. In this paper we introduce NCD as an alternative diversity metric and analyze its performance empirically, in comparison with previous diversity measures, showing the strengths and weaknesses of each. We also examine the use of different compressors in NCD. We show how NCD can be used to select a training set for HTN learning, giving an example of the utility of diversity metrics. We conclude with suggestions for future work on improving and extending the metric and applying it to new applications.
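Normalized Compression Distance has a standard definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length under a real-world compressor. A minimal sketch follows, using zlib as an illustrative compressor choice; the plan encodings are hypothetical.

    # Normalized Compression Distance (NCD), the plan-diversity metric the
    # paper adopts: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    # where C(.) is the compressed length under a real-world compressor.
    # zlib is used here purely as an illustrative compressor choice.
    import zlib

    def clen(data: bytes) -> int:
        """Compressed length of a byte string."""
        return len(zlib.compress(data, 9))

    def ncd(x: bytes, y: bytes) -> float:
        cx, cy, cxy = clen(x), clen(y), clen(x + y)
        return (cxy - min(cx, cy)) / max(cx, cy)

    # Plans serialized as action sequences; more similar plans compress
    # better together and therefore get a smaller NCD.
    plan_a = b"move(a,b); load(p1,truck); move(b,c); unload(p1,truck)"
    plan_b = b"move(a,b); load(p2,truck); move(b,c); unload(p2,truck)"
    plan_c = b"fly(a,d); refuel(plane); fly(d,e)"

    print(ncd(plan_a, plan_b))  # expected: relatively small
    print(ncd(plan_a, plan_c))  # expected: larger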
A Family of Latent Variable Convex Relaxations for IBM Model 2
Simion, Andrei Arsene (Columbia University) | Collins, Michael (Columbia University) | Stein, Cliff (Columbia University)
Recently, a new convex formulation of IBM Model 2 was introduced. In this paper we develop the theory further and introduce a class of convex relaxations for latent variable models which include IBM Model 2. When applied to IBM Model 2, our relaxation class subsumes the previous relaxation as a special case. As proof of concept, we study a new relaxation of IBM Model 2 which is simpler than the previous algorithm: the new relaxation relies on nothing more than a multinomial EM algorithm, does not require the tuning of a learning rate, and compares favorably to IBM Model 2 in terms of F-measure. The ideas presented could be applied to a wide range of NLP and machine learning problems.
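For orientation, the sketch below shows the classical multinomial EM updates for IBM Model 1, the simpler ancestor of IBM Model 2; the paper's convex relaxation is not reproduced here, and this only illustrates the style of EM algorithm the relaxation relies on. The corpus is hypothetical.

    # Sketch of the multinomial EM updates for IBM Model 1, the simpler
    # ancestor of IBM Model 2. The paper's convex relaxation is not
    # reproduced here; this only illustrates the kind of multinomial EM
    # algorithm ("nothing more than a multinomial EM") being referred to.
    from collections import defaultdict

    # Tiny parallel corpus: (source sentence, target sentence) token lists.
    corpus = [
        (["das", "haus"], ["the", "house"]),
        (["das", "buch"], ["the", "book"]),
        (["ein", "buch"], ["a", "book"]),
    ]

    # Initialize translation probabilities t(e | f) uniformly.
    t = defaultdict(lambda: 1.0)

    for _ in range(10):
        counts = defaultdict(float)
        totals = defaultdict(float)
        for f_sent, e_sent in corpus:
            for e in e_sent:
                # E-step: expected alignment counts of e over source words.
                norm = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    c = t[(e, f)] / norm
                    counts[(e, f)] += c
                    totals[f] += c
        # M-step: renormalize the translation probabilities.
        for (e, f), c in counts.items():
            t[(e, f)] = c / totals[f]

    print(round(t[("house", "haus")], 3))  # should approach 1.0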
Mining User Interests from Personal Photos
Xie, Pengtao (Carnegie Mellon University) | Pei, Yulong (Carnegie Mellon University) | Xie, Yuan (Carnegie Mellon University) | Xing, Eric (Carnegie Mellon University)
Personal photos are enjoying explosive growth with the popularity of photo-taking devices and social media. This vast amount of online photos largely reflects users' interests, emotions and opinions. Mining user interests from personal photos can boost a number of applications, such as advertising, interest-based community detection and photo recommendation. In this paper, we study the problem of mining user interests from personal photos. We propose a User Image Latent Space Model to jointly model user interests and image contents. User interests are modeled as latent factors, and each user is assumed to have a distribution over them. By inferring the latent factors and users' distributions, we can discover what the users are interested in. We model image contents with a four-level hierarchical structure whose layers correspond to themes, semantic regions, visual words and pixels, respectively. Users' latent interests are embedded in the theme layer. Given image contents, users' interests can be discovered through posterior inference. We use variational inference to approximate the posteriors of the latent variables and learn the model parameters. Experiments on 180K Flickr photos demonstrate the effectiveness of our model.
Detecting Change Points in the Large-Scale Structure of Evolving Networks
Peel, Leto (University of Colorado at Boulder) | Clauset, Aaron (University of Colorado at Boulder)
Interactions among people or objects are often dynamic in nature and can be represented as a sequence of networks, each providing a snapshot of the interactions over a brief period of time. An important task in analyzing such evolving networks is change-point detection, in which we both identify the times at which the large-scale pattern of interactions changes fundamentally and quantify how large and what kind of change occurred. Here, we formalize for the first time the network change-point detection problem within an online probabilistic learning framework and introduce a method that can reliably solve it. This method combines a generalized hierarchical random graph model with a Bayesian hypothesis test to quantitatively determine if, when, and precisely how a change point has occurred. We analyze the detectability of our method using synthetic data with known change points of different types and magnitudes, and show that this method is more accurate than several previously used alternatives. Applied to two high-resolution evolving social networks, this method identifies a sequence of change points that align with known external "shocks" to these networks.
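A heavily simplified illustration of window-based network change-point detection follows: the paper's generalized hierarchical random graph model and Bayesian hypothesis test are replaced by a single Bernoulli edge-density parameter and a generalized likelihood-ratio score, so only the overall structure carries over. All numbers are synthetic.

    # Simplified illustration of window-based network change-point detection.
    # The paper fits a generalized hierarchical random graph in each window;
    # here the network model is reduced to a single Bernoulli edge-density
    # parameter so the hypothesis-test structure is easy to see.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 30                                   # nodes per snapshot
    pairs = n * (n - 1) // 2                 # possible edges

    # Sequence of snapshots: edge counts drawn at density 0.05, then 0.15.
    densities = [0.05] * 20 + [0.15] * 20
    edge_counts = [rng.binomial(pairs, p) for p in densities]

    def log_lik(k, m, p):
        """Bernoulli log-likelihood of k edges among m possible pairs."""
        p = min(max(p, 1e-9), 1 - 1e-9)
        return k * np.log(p) + (m - k) * np.log(1 - p)

    def change_score(window):
        """Generalized likelihood-ratio score for the best split in a window."""
        w = len(window)
        total = sum(window)
        null = log_lik(total, w * pairs, total / (w * pairs))
        best = -np.inf
        for tau in range(1, w):
            a, b = sum(window[:tau]), sum(window[tau:])
            alt = (log_lik(a, tau * pairs, a / (tau * pairs)) +
                   log_lik(b, (w - tau) * pairs, b / ((w - tau) * pairs)))
            best = max(best, alt - null)
        return best

    # Slide a window over the snapshot sequence; peaks in the score suggest
    # change points (a real system would calibrate a significance threshold).
    scores = [change_score(edge_counts[i:i + 10])
              for i in range(len(edge_counts) - 10)]
    print(int(np.argmax(scores)))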
Inferring Latent User Properties from Texts Published in Social Media
Volkova, Svitlana (Johns Hopkins University) | Bachrach, Yoram (Microsoft Research) | Armstrong, Michael (Microsoft Research) | Sharma, Vijay (Microsoft Research)
We demonstrate an approach to predict latent personal attributes including user demographics, online personality, emotions and sentiments from texts published on Twitter. We rely on machine learning and natural language processing techniques to learn models from user communications. We first examine individual tweets to detect emotions and opinions emanating from them, and then analyze all the tweets published by a user to infer latent traits of that individual. We consider various user properties including age, gender, income, education, relationship status, optimism and life satisfaction. We focus on Ekman’s six emotions: anger, joy, surprise, fear, disgust and sadness. Our work can help social network users to understand how others may perceive them based on how they communicate in social media, in addition to its evident applications in online sales and marketing, targeted advertising, large scale polling and healthcare analytics.
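A minimal sketch of the user-level prediction setup described above, assuming a simple bag-of-n-grams representation and a linear classifier; the paper's actual features, models and labels are not reproduced, and the data here are hypothetical.

    # Minimal sketch of user-level attribute prediction from aggregated tweets:
    # each user's tweets are concatenated, vectorized as word n-grams, and a
    # linear classifier predicts a latent attribute (here a made-up binary
    # "optimism" label). Features, labels and data are hypothetical.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # One document per user: all of that user's tweets joined together.
    user_docs = [
        "what a beautiful morning cant wait for the weekend",
        "everything is going wrong again why do i even try",
        "so excited about the new project this will be great",
        "another delayed train another ruined day",
    ]
    optimism = [1, 0, 1, 0]  # hypothetical user-level labels

    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=1),
        LogisticRegression(max_iter=1000),
    )
    model.fit(user_docs, optimism)

    print(model.predict(["looking forward to tomorrow it will be wonderful"]))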
Swiss-System Based Cascade Ranking for Gait-Based Person Re-Identification
Wei, Lan (Peking University) | Tian, Yonghong (Peking University) | Wang, Yaowei (Beijing Institute of Technology) | Huang, Tiejun (Peking University)
Human gait has been shown to be an efficient biometric measure for person identification at a distance. However, it often requires different gait features to handle various covariate conditions, including viewing angle, walking speed, carrying an object and wearing different types of shoes. In order to improve the robustness of gait-based person re-identification under such multi-covariate conditions, a novel Swiss-system based cascade ranking model is proposed in this paper. Since a ranking model is able to learn a subspace in which the potential true match is given the highest rank, we formulate gait-based person re-identification as a bipartite ranking problem and use it as an effective way to perform multi-feature ensemble learning. A Swiss multi-round competition system is then developed for the cascade ranking model to optimize its effectiveness and efficiency. Extensive experiments on three indoor and outdoor public datasets demonstrate that our model remarkably outperforms several state-of-the-art methods.