Assertion-Based QA With Question-Aware Open Information Extraction
Yan, Zhao (Beihang University) | Tang, Duyu (Microsoft Research Asia) | Duan, Nan (Microsoft Research Asia) | Liu, Shujie (Microsoft Research Asia) | Wang, Wendi (Microsoft) | Jiang, Daxin (Microsoft) | Zhou, Ming (Microsoft Research Asia) | Li, Zhoujun (Beihang University)
We present assertion-based question answering (ABQA), an open-domain question answering task that takes a question and a passage as inputs and outputs a semi-structured assertion consisting of a subject, a predicate, and a list of arguments. An assertion conveys more evidence than a short answer span in reading comprehension, and it is more concise than a tedious passage in passage-based QA. These advantages make ABQA more suitable for human-computer interaction scenarios such as voice-controlled speakers. Further progress on ABQA requires a richer supervised dataset and more powerful models of text understanding. To this end, we introduce a new dataset called WebAssertions, which includes hand-annotated QA labels for 358,427 assertions in 55,960 web passages. To address ABQA, we develop both generative and extractive approaches. The backbone of our generative approach is sequence-to-sequence learning. To capture the structure of the output assertion, we introduce a hierarchical decoder that first generates the structure of the assertion and then generates the words of each field. The extractive approach is based on learning to rank. Features at different levels of granularity are designed to measure the semantic relevance between a question and an assertion. Experimental results show that our approaches are able to infer question-aware assertions from a passage. We further evaluate our approaches by incorporating the ABQA results as additional features in passage-based QA. Results on two datasets show that ABQA features significantly improve the accuracy of passage-based QA.
Dynamic User Profiling for Streams of Short Texts
Liang, Shangsong (University College London)
In this paper, we tackle the problem of dynamic user profiling in the context of streams of short texts. Profiling users' expertise in such a context is more challenging than in the case of long documents in a static collection, as it is difficult to track users' dynamic expertise in sparse streaming data. To obtain better profiling performance, we propose a streaming profiling algorithm (SPA). SPA first uses the proposed user expertise tracking topic model (UET) to track changes in users' dynamic expertise and then uses the proposed streaming keyword diversification algorithm (SKDA) to produce top-k diversified keywords for profiling users' dynamic expertise at a specific point in time. Experimental results validate the effectiveness of the proposed algorithms.
Inference on Syntactic and Semantic Structures for Machine Comprehension
Li, Chenrui (East China Normal University) | Wu, Yuanbin (East China Normal University) | Lan, Man (East China Normal University)
Hidden variable models are important tools for solving open-domain machine comprehension tasks and have achieved remarkable accuracy on many question answering benchmark datasets. Existing models impose strong independence assumptions on hidden variables, which leaves the interaction among them unexplored. Here we introduce linguistic structures to help capture global evidence in hidden variable modeling. In the proposed algorithms, question-answer pairs are scored based on structured inference results on parse trees and semantic frames, with the aim of assigning hidden variables in a globally optimal way. Experiments on the MCTest dataset demonstrate that the proposed models are highly competitive with state-of-the-art machine comprehension systems.
Learning With Single-Teacher Multi-Student
You, Shan (Peking University) | Xu, Chang (University of Sydney) | Xu, Chao (Peking University) | Tao, Dacheng (University of Sydney)
In this paper we study a new learning problem, defined as the "Single-Teacher Multi-Student" (STMS) problem, which investigates how to learn a series of student (simple and specific) models from a single teacher (complex and universal) model. Taking multiclass and binary classification as an example, we focus on learning multiple binary classifiers from a single multiclass classifier, where each binary classifier is responsible for a certain class. This setting arises in realistic problems, such as identifying a suspect based on a comprehensive face recognition system. By treating the already-trained multiclass classifier as the teacher and multiple binary classifiers as the students, we propose a gated support vector machine (gSVM) as a solution. A series of gSVMs are learned with the help of the single teacher multiclass classifier. The teacher's help is two-fold: first, the teacher's score provides the gated values for the students' decisions; second, the teacher can guide the students to accommodate training examples with different degrees of difficulty. Extensive experiments on real datasets validate the effectiveness of the proposed approach.
Randomized Clustered Nystrom for Large-Scale Kernel Machines
Pourkamali-Anaraki, Farhad (University of Colorado Boulder) | Becker, Stephen (University of Colorado Boulder) | Wakin, Michael B. (Colorado School of Mines)
The Nystrom method is a popular technique for generating low-rank approximations of kernel matrices that arise in many machine learning problems. The approximation quality of the Nystrom method depends crucially on the number of selected landmark points and the selection procedure. In this paper, we introduce a randomized algorithm for generating landmark points that is scalable to large high-dimensional data sets. The proposed method performs K-means clustering on low-dimensional random projections of a data set and thus leads to significant savings for high-dimensional data sets. Our theoretical results characterize the tradeoffs between accuracy and efficiency of the proposed method. Moreover, numerical experiments on classification and regression tasks demonstrate the superior performance and efficiency of our proposed method compared with existing approaches.
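The pipeline described above (K-means on random projections to choose landmarks, followed by a Nystrom low-rank approximation) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the Gaussian projection matrix, the plain Lloyd-style K-means, and the choice to average cluster members in the original space to form landmarks are all assumptions, as are the names `rbf_kernel`, `kmeans_labels`, and `clustered_nystrom`.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kmeans_labels(X, k, rng, iters=20):
    """Plain Lloyd iterations; returns a cluster label for each row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = X[labels == j].mean(0)
    return labels


def clustered_nystrom(X, k, proj_dim, gamma, seed=0):
    """Approximate the n x n kernel matrix of X using at most k landmarks."""
    rng = np.random.default_rng(seed)
    # Assumed: a dense Gaussian random projection to proj_dim dimensions.
    R = rng.standard_normal((X.shape[1], proj_dim)) / np.sqrt(proj_dim)
    labels = kmeans_labels(X @ R, k, rng)  # cluster in the cheap, low-dim space
    # Assumed: landmarks are cluster means taken in the original space.
    landmarks = np.stack(
        [X[labels == j].mean(0) for j in range(k) if np.any(labels == j)]
    )
    C = rbf_kernel(X, landmarks, gamma)          # n x m cross-kernel
    W = rbf_kernel(landmarks, landmarks, gamma)  # m x m landmark kernel
    return C @ np.linalg.pinv(W) @ C.T           # Nystrom form C W^+ C^T


# Toy usage on synthetic data (values chosen only for illustration).
X = np.random.default_rng(1).standard_normal((60, 20))
K_approx = clustered_nystrom(X, k=5, proj_dim=4, gamma=0.01)
K_exact = rbf_kernel(X, X, gamma=0.01)
rel_error = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
```

The clustering cost depends on `proj_dim` rather than on the original dimension, which is where the savings for high-dimensional data would come from under these assumptions.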
Differential Performance Debugging With Discriminant Regression Trees
Tizpaz-Niari, Saeid (University of Colorado Boulder) | Cerny, Pavol (University of Colorado Boulder) | Chang, Bor-Yuh Evan (University of Colorado Boulder) | Trivedi, Ashutosh (University of Colorado Boulder)
Differential performance debugging is a technique to find performance problems. It applies in situations where the performance of a program is (unexpectedly) different for different classes of inputs. The task is to explain the differences in asymptotic performance among various input classes in terms of program internals. We propose a data-driven technique based on the discriminant regression tree (DRT) learning problem, where the goal is to discriminate among different classes of inputs. We propose a new algorithm for DRT learning that first clusters the data into functional clusters, capturing different asymptotic performance classes, and then invokes off-the-shelf decision tree learning algorithms to explain these clusters. We focus on linear functional clusters and adapt classical clustering algorithms (K-means and spectral) to produce them. For the K-means algorithm, we generalize the notion of the cluster centroid from a point to a linear function. We adapt spectral clustering by defining a novel kernel function to capture the notion of linear similarity between two data points. We evaluate our approach on benchmarks consisting of Java programs where we are interested in debugging performance. We show that our algorithm significantly outperforms other well-known regression tree learning algorithms in terms of running time and accuracy of classification.
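The idea of generalizing a K-means centroid from a point to a linear function can be illustrated with a small sketch. It is one plausible reading of the abstract, not the authors' algorithm: each cluster is represented by a line y = a*x + b, points are assigned to the line with the smallest absolute residual, and lines are refit by ordinary least squares. The function names, the residual measure, and the caller-supplied initial lines are assumptions.

```python
def fit_line(points):
    """Least-squares fit of y = a*x + b; returns None when the fit is undefined."""
    n = len(points)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        return None
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return slope, mean_y - slope * mean_x


def linear_kmeans(points, init_lines, iters=10):
    """K-means variant whose 'centroids' are lines given as (slope, intercept)."""
    lines = list(init_lines)
    labels = []
    for _ in range(iters):
        # Assignment step: pick the line with the smallest absolute residual.
        labels = [
            min(range(len(lines)),
                key=lambda j: abs(y - (lines[j][0] * x + lines[j][1])))
            for x, y in points
        ]
        # Update step: refit each line to the points currently assigned to it.
        for j in range(len(lines)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            fitted = fit_line(members)
            if fitted is not None:  # keep the old line if the fit is undefined
                lines[j] = fitted
    return lines, labels


# Toy usage: (input size, running time) pairs from two hypothetical input classes.
points = [(x, 2 * x + 1) for x in range(1, 11)] + \
         [(x, 10 * x + 3) for x in range(1, 11)]
lines, labels = linear_kmeans(points, init_lines=[(1.0, 0.0), (8.0, 0.0)])
```

In the setting described above, the resulting labels would then be handed to an off-the-shelf decision tree learner to explain the clusters in terms of program internals; that step is omitted here.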
Group-Pair Convolutional Neural Networks for Multi-View Based 3D Object Retrieval
Gao, Zan (Tianjin University of Technology) | Wang, Deyu (Tianjin University of Technology) | He, Xiangnan (National University of Singapore) | Zhang, Hua (Tianjin University of Technology)
In recent years, research interest in object retrieval has shifted from 2D towards 3D data. Despite many well-designed approaches, we point out that limitations still exist and there is tremendous room for improvement, including the heavy reliance on hand-crafted features, the separated optimization of feature extraction and object retrieval, and the lack of sufficient training samples. In this work, we address the above limitations for 3D object retrieval by developing a novel end-to-end solution named Group Pair Convolutional Neural Network (GPCNN). It can jointly learn the visual features from multiple views of a 3D model and optimize towards the object retrieval task. To tackle the insufficient training data issue, we employ a pair-wise learning scheme, which learns model parameters from the similarity of each sample pair, rather than the traditional way of learning from sparse label-sample matching. Extensive experiments on three public benchmarks show that our GPCNN solution significantly outperforms the state-of-the-art methods with a 3% to 42% improvement in retrieval accuracy.
Question Answering as Global Reasoning Over Semantic Abstractions
Khashabi, Daniel (University of Pennsylvania) | Khot, Tushar (Allen Institute for Artificial Intelligence) | Sabharwal, Ashish (Allen Institute for Artificial Intelligence) | Roth, Dan (University of Pennsylvania)
We propose a novel method for exploiting the semantic structure of text to answer multiple-choice questions. The approach is especially suitable for domains that require reasoning over a diverse set of linguistic constructs but have limited training data. To address these challenges, we present the first system, to the best of our knowledge, that reasons over a wide range of semantic abstractions of the text, which are derived using off-the-shelf, general-purpose, pre-trained natural language modules such as semantic role labelers, coreference resolvers, and dependency parsers. Representing multiple abstractions as a family of graphs, we translate question answering (QA) into a search for an optimal subgraph that satisfies certain global and local properties. This formulation generalizes several prior structured QA systems. Our system, SEMANTICILP, demonstrates strong performance on two domains simultaneously. In particular, on a collection of challenging science QA datasets, it outperforms various state-of-the-art approaches, including neural models, broad coverage information retrieval, and specialized techniques using structured knowledge bases, by 2%-6%.
WiFi-Based Human Identification via Convex Tensor Shapelet Learning
Zou, Han (University of California, Berkeley) | Zhou, Yuxun (University of California, Berkeley) | Yang, Jianfei (Nanyang Technological University) | Gu, Weixi (Tsinghua University) | Xie, Lihua (Nanyang Technological University) | Spanos, Costas J. (University of California, Berkeley)
We propose AutoID, a human identification system that leverages the measurements from existing WiFi-enabled Internet of Things (IoT) devices and produces an identity estimate via a novel sparse representation learning technique. The key idea is to use the unique fine-grained gait patterns of each person revealed from the WiFi Channel State Information (CSI) measurements, technically referred to as shapelet signatures, as the "fingerprint" for human identification. For this purpose, a novel OpenWrt-based IoT platform is designed to collect CSI data from commercial IoT devices. More importantly, we propose a new optimization-based shapelet learning framework for tensors, namely Convex Clustered Concurrent Shapelet Learning (C3SL), which formulates the learning problem as a convex optimization. The global solution of C3SL can be obtained efficiently with a generalized gradient-based algorithm, and the three concurrent regularization terms reveal the inter-dependence and the clustering effect of the CSI tensor data. Extensive experiments are conducted in multiple real-world indoor environments, showing that AutoID achieves an average human identification accuracy of 91% from a group of 20 people. As a combination of novel sensing and learning platforms, AutoID makes substantial progress towards a more accurate, cost-effective and sustainable human identification system for pervasive implementations.
Anchors: High-Precision Model-Agnostic Explanations
Ribeiro, Marco Tulio (University of Washington) | Singh, Sameer (University of California, Irvine) | Guestrin, Carlos (University of Washington)
We introduce a novel model-agnostic system that explains the behavior of complex models with high-precision rules called anchors, representing local, "sufficient" conditions for predictions. We propose an algorithm to efficiently compute these explanations for any black-box model with high-probability guarantees. We demonstrate the flexibility of anchors by explaining a myriad of different models for different domains and tasks. In a user study, we show that anchors enable users to predict how a model would behave on unseen instances with less effort and higher precision, as compared to existing linear explanations or no explanations.
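To make the notion of a "sufficient" condition concrete, the sketch below estimates the precision of a candidate anchor for a black-box model by sampling: features named in the anchor are held at the instance's values, the remaining features are resampled, and precision is the fraction of samples on which the prediction is unchanged. This only illustrates how a given rule might be scored. It does not reproduce the paper's search algorithm or its high-probability guarantees, and the uniform perturbation distribution, the toy model, and all names are assumptions.

```python
import random


def anchor_precision(model, instance, anchor, feature_values,
                     n_samples=500, seed=0):
    """Estimate how often the prediction is unchanged when the anchor holds.

    anchor: set of feature indices fixed to their values in `instance`.
    feature_values: for each feature, the list of values it may take
    (assumed here to be sampled uniformly at random).
    """
    rng = random.Random(seed)
    target = model(instance)
    hits = 0
    for _ in range(n_samples):
        sample = [
            instance[i] if i in anchor else rng.choice(feature_values[i])
            for i in range(len(instance))
        ]
        hits += model(sample) == target
    return hits / n_samples


def toy_model(x):
    """Hypothetical black box: the prediction depends only on feature 0."""
    return int(x[0] == 1)


instance = [1, 0, 1]
feature_values = [[0, 1], [0, 1], [0, 1]]
precision_with_anchor = anchor_precision(toy_model, instance, {0}, feature_values)
precision_without = anchor_precision(toy_model, instance, set(), feature_values)
```

Here fixing feature 0 is sufficient to pin down the toy model's prediction, so its estimated precision is 1.0, while the empty rule leaves the prediction free to change.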