Genre
Counting-MLNs: Learning Relational Structure for Decision Making
Nath, Aniruddh (University of Washington) | Richardson, Matthew (Microsoft Research)
Many first-order probabilistic models can be represented much more compactly using aggregation operations such as counting. While traditional statistical relational representations share factors across sets of interchangeable random variables, representations that explicitly model aggregations also exploit interchangeability of random variables within factors. This is especially useful in decision making settings, where an agent might need to reason about counts of the different types of objects it interacts with. Previous work on counting formulas in statistical relational representations has mostly focused on the problem of exact inference on an existing model. The problem of learning such models is largely unexplored. In this paper, we introduce Counting Markov Logic Networks (C-MLNs), an extension of Markov logic networks that can compactly represent complex counting formulas. We present a structure learning algorithm for C-MLNs; we apply this algorithm to the novel problem of generalizing natural language instructions, and to relational reinforcement learning in the Crossblock domain, in which standard MLN learning algorithms fail to find any useful structure. The C-MLN policies learned from natural language instructions are compact and intuitive, and, despite requiring no instructions on test games, win 20% more Crossblock games than a state-of-the-art algorithm for following natural language instructions.
Margin-Based Feature Selection in Incomplete Data
Lou, Qiang (Temple University) | Obradovic, Zoran (Temple University)
This study considers the problem of feature selection in incomplete data. The intuitive approach is to first impute the missing values, and then apply a standard feature selection method to select relevant features. In this study, we show how to perform feature selection directly, without imputing missing values. We define the objective function of the uncertainty margin-based feature selection method to maximize each instanceโs uncertainty margin in its own relevant subspace. In optimization, we take into account the uncertainty of each instance due to the missing values. The experimental results on synthetic and 6 benchmark data sets with few missing values (less than 25%) provide evidence that our method can select the same accurate features as the alternative methods which apply an imputation method first. However, when there is a large fraction of missing values (more than 25%) in data, our feature selection method outperforms the alternatives, which impute missing values first.
Random Projection with Filtering for Nearly Duplicate Search
Lin, Yue (Zhejiang University) | Jin, Rong (Michigan State University) | Cai, Deng (Zhejiang University) | He, Xiaofei (Zhejiang University)
High dimensional nearest neighbor search is a fundamental problem and has found applications in many domains. Although many hashing based approaches have been proposed for approximate nearest neighbor search in high dimensional space, one main drawback is that they often return many false positives that need to be filtered out by a post procedure. We propose a novel method to address this limitation in this paper. The key idea is to introduce a filtering procedure within the search algorithm, based on the compressed sensing theory, that effectively removes the false positive answers. We first obtain a sparse representation for each data point by the landmark based approach, after which we solve the nearly duplicate search that the difference between the query and its nearest neighbors forms a sparse vector living in a small โp ball, where p โค 1. Our empirical study on real-world datasets demonstrates the effectiveness of the proposed approach compared to the state-of-the-art hashing methods.
Time-Critical Influence Maximization in Social Networks with Time-Delayed Diffusion Process
Chen, Wei (Microsoft Research Asia) | Lu, Wei (University of British Columbia) | Zhang, Ning (University of Science and Technology of China)
Influence maximization is a problem of finding a small set of highly influential users in a social network such that the spread of influence under certain propagation models is maximized. In this paper, we consider time-critical influence maximization, in which one wants to maximize influence spread within a given deadline. Since timing is considered in the optimization, we also extend the Independent Cascade (IC) model to incorporate the time delay aspect of influence diffusion in social networks. We show that time-critical influence maximization under the time-delayed IC model maintains desired properties such as submodularity, which allows a greedy algorithm to achieve an approximation ratio of 1-1/e, to circumvent the NP-hardness of the problem. To overcome the inefficiency of the approximation algorithm, we design two heuristic algorithms: the first one is based on a dynamic programming procedure that computes exact influence in tree structures, while the second one converts the problem to one in the original IC model and then applies existing fast heuristics to it. Our simulation results demonstrate that our heuristics achieve the same level of influence spread as the greedy algorithm while running a few orders of magnitude faster, and they also outperform existing algorithms that disregard the deadline constraint and delays in diffusion.
Lessons Learned From a Rational Reconstruction of Minstrel
Tearse, Brandon Robert (University of California, Santa Cruz) | Mawhorter, Peter (University of California, Santa Cruz) | Mateas, Michael (University of California, Santa Cruz) | Wardrip-Fruin, Noah (University of California, Santa Cruz)
Scott Turner's 1993 Minstrel system was a high water mark in story generation, harnessing the concept of imaginative recall to generate creative stories. Using case-based reasoning and an author level planning system, Minstrel models human creative processes. However, the algorithmic and representational commitments made in Minstrel were never subject to principled and quantitative analysis. By rationally reconstructing Minstrel, we are able to investigate Turner's computational model of creativity and learn new lessons about his architecture. We find that Minstrel's original performance was tied to a well-groomed case library, but by modifying several components of the algorithm we can create a more general version which can construct stories using a sparser and less structured case library. Through a rational reconstruction of Minstrel, we both learn new architectural and algorithmic lessons about Minstrelโs computational model of creativity as well as make his architecture available to the contemporary research community for further experimentation.
A Data-Driven Approach to Question Subjectivity Identification in Community Question Answering
Zhou, Tom Chao (The Chinese University of Hong Kong) | Si, Xiance (Google) | Chang, Edward Y. (Google) | King, Irwin (ATT) | Lyu, Michael R. (The Chinese University of Hong Kong)
Automatic Subjective Question Answering (ASQA), which aims at answering users'subjective questions using summaries of multiple opinions, becomes increasingly important. One challenge of ASQA is that expected answers for subjective questions may not readily exist in the Web. The rising and popularity of Community Question Answering (CQA) sites, which provide platforms for people to post and answer questions, provides an alternative to ASQA. One important task of ASQA is question subjectivity identification, which identifies whether a user is asking a subjective question. Unfortunately, there has been little labeled training data available for this task. In this paper, we propose an approach to collect training data automatically by utilizing social signals in CQA sites without involving any manual labeling. Experimental results show that our data-driven approach achieves 9.37% relative improvement over the supervised approach using manually labeled data, and achieves 5.15% relative gain over a state-of-the-art semi-supervised approach. In addition, we propose several heuristic features for question subjectivity identification. By adding these features, we achieve 11.23% relative improvement over word n-gram feature under the same experimental setting.
Fine-Grained Entity Recognition
Ling, Xiao (University of Washington) | Weld, Daniel S. (University of Washington)
Entity Recognition (ER) is a key component of relation extraction systems and many other natural-language processing applications. Unfortunately, most ER systems are restricted to produce labels from to a small set of entity classes, e.g., person, organization, location or miscellaneous. In order to intelligently understand text and extract a wide range of information, it is useful to more precisely determine the semantic classes of entities mentioned in unstructured text. This paper defines a fine-grained set of 112 tags, formulates the tagging problem as multi-class, multi-label classification, describes an unsupervised method for collecting training data, and presents the FIGER implementation. Experiments show that the system accurately predicts the tags for entities. Moreover, it provides useful information for a relation extraction system, increasing the F1 score by 93%. We make FIGER and its data available as a resource for future work.
Predictive Mining of Comparable Entities from the Web
Jang, Myungha (Pohang University of Science and Technology (POSTECH)) | Park, Jin-woo (Pohang University of Science and Technology (POSTECH)) | Hwang, Seung-won (Pohang University of Science and Technology (POSTECH))
Comparing entities is an important part of decision making. Several approaches have been reported for mining comparable entities from Web sources to improve user experience in comparing entities online.However, these efforts extract only entities explicitly compared in the corpora, and may exclude entities that occur less-frequently but potentially comparable. To build a more complete comparison machine that can infer such missing relations, here we develop a solutionto predict transitivity of known comparable relations. Named CliqueGrow, our approach predicts missing links given a comparable entity graph obtained from versus query logs. Our approach achieved the highest F1-score among five link prediction approaches and a commercial comparison engine provided by Yahoo!.
Combining Hashing and Abstraction in Sparse High Dimensional Feature Spaces
Caragea, Cornelia (Pennsylvania State University) | Silvescu, Adrian (Naviance Inc.) | Mitra, Prasenjit (Pennsylvania State University)
With the exponential increase in the number of documents available online, e.g., news articles, weblogs, scientific documents, the development of effective and efficient classification methods is needed. The performance of document classifiers critically depends, among other things, on the choice of the feature representation. The commonly used "bag of words" and n-gram representations can result in prohibitively high dimensional input spaces. Data mining algorithms applied to these input spaces may be intractable due to the large number of dimensions. Thus, dimensionality reduction algorithms that can process data into features fast at runtime, ideally in constant time per feature, are greatly needed in high throughput applications, where the number of features and data points can be in the order of millions. One promising line of research to dimensionality reduction is feature clustering. We propose to combine two types of feature clustering, namely hashing and abstraction based on hierarchical agglomerative clustering, in order to take advantage of the strengths of both techniques. Experimental results on two text data sets show that the combined approach uses significantly smaller number of features and gives similar performance when compared with the "bag of words" and n-gram approaches.
Symmetry Breaking Constraints: Recent Results
Walsh, Toby (NICTA and University of New South Wales)
Symmetry is an important problem in many combinatorial problems. One way of dealing with symmetry is to add constraints that eliminate symmetric solutions. We survey recent results in this area, focusing especially on two common and useful cases: symmetry breaking constraints for row and column symmetry, and symmetry breaking constraints for eliminating value symmetry.