What Is the Longest River in the USA? Semantic Parsing for Aggregation Questions
Xu, Kun (Peking University) | Zhang, Sheng (Peking University) | Feng, Yansong (Peking University) | Huang, Songfang (IBM China Research Lab) | Zhao, Dongyan (Peking University)
Answering natural language questions against structured knowledge bases (KBs) has been attracting increasing attention in both the IR and NLP communities. The task involves two main challenges: recognizing the question's meaning and grounding it to a given KB. Targeting simple factoid questions, many existing open-domain semantic parsers solve these two subtasks jointly, but are usually expensive in complexity and resources. In this paper, we propose a simple pipeline framework to efficiently answer more complicated questions, especially those implying aggregation operations, e.g., argmax and argmin. We first develop a transition-based parsing model to recognize the KB-independent meaning representation of the user's intention inherent in the question. We then apply a probabilistic model to map the meaning representation, including its aggregation functions, to a structured query. The experimental results show that our method better understands aggregation questions, outperforming the state-of-the-art methods on the Free917 dataset while maintaining promising performance on a more challenging dataset, WebQuestions, without extra training.
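To make the idea of an aggregation question concrete, the following minimal sketch shows how an argmax or argmin over a KB relation could be evaluated once a question has been mapped to a structured query. It is only an illustration: the triple store, the entity names, the numbers, and the query fields (`aggregation`, `filter_relation`, `filter_value`, `rank_relation`) are all hypothetical and are not taken from the paper or from any real KB.

```python
# Toy KB of (subject, relation, object) triples.
# All entities and numbers are made up for illustration.
KB = [
    ("river_a", "located_in", "country_x"),
    ("river_b", "located_in", "country_x"),
    ("river_c", "located_in", "country_y"),
    ("river_a", "length_km", 1200),
    ("river_b", "length_km", 3400),
    ("river_c", "length_km", 5000),
]


def objects(subject, relation):
    """Return every object linked to `subject` through `relation`."""
    return [o for s, r, o in KB if s == subject and r == relation]


def answer(query):
    """Evaluate a hypothetical structured query with an aggregation function."""
    # Step 1: keep the entities that satisfy the filter constraint.
    candidates = [
        s for s, r, o in KB
        if r == query["filter_relation"] and o == query["filter_value"]
    ]
    # Step 2: rank the candidates by the numeric relation and aggregate.
    pick = max if query["aggregation"] == "argmax" else min
    return pick(candidates, key=lambda e: objects(e, query["rank_relation"])[0])


# "What is the longest river in country_x?" as a structured query.
QUERY = {
    "aggregation": "argmax",
    "filter_relation": "located_in",
    "filter_value": "country_x",
    "rank_relation": "length_km",
}
print(answer(QUERY))
```

The point of the sketch is that the aggregation function is part of the query itself, so "longest" and "shortest" differ only in the `aggregation` field.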
Loss-Calibrated Monte Carlo Action Selection
Abbasnejad, Ehsan (Australian National University and NICTA) | Domke, Justin (Australian National University and NICTA) | Sanner, Scott (Australian National University and NICTA)
Bayesian decision theory underpins robust decision-making in applications ranging from plant control to robotics, where hedging action selection against state uncertainty is critical for minimizing low-probability but potentially catastrophic outcomes (e.g., uncontrollable plant conditions or robots falling into stairwells). Unfortunately, belief state distributions in such settings are often complex and/or high dimensional, thus prohibiting the efficient application of analytical techniques for expected utility computation when real-time control is required. This leaves Monte Carlo evaluation as one of the few viable (and hence frequently used) techniques for online action selection. However, loss-insensitive Monte Carlo methods may require large numbers of samples to identify optimal actions with high certainty, since they may sample from high-probability regions that do not disambiguate action utilities. In this paper we remedy this problem by deriving an optimal proposal distribution for a loss-calibrated Monte Carlo importance sampler that bounds the regret of using an estimated optimal action. Empirically, we show that our loss-calibrated Monte Carlo method yields high-accuracy optimal action selections in a fraction of the number of samples required by conventional loss-insensitive samplers.
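The sampling issue described above can be illustrated with a small sketch of importance-sampled action selection on a discrete toy problem. Everything here is an assumption made for illustration: the three states, the belief `P`, the utilities, and in particular the proposal, which simply weights each state by its probability times the utility gap between the two actions. That proposal is a hand-picked heuristic and is not the optimal proposal derived in the paper.

```python
import random

STATES = [0, 1, 2]
P = [0.90, 0.09, 0.01]       # toy belief over states; state 2 is rare
ACTIONS = ["safe", "risky"]


def utility(state, action):
    """Toy utilities: 'risky' pays off unless the rare state 2 occurs."""
    if action == "safe":
        return 0.0
    return -200.0 if state == 2 else 1.0


def exact_expected_utility(action):
    """Exact expectation, available only because the toy state space is tiny."""
    return sum(p * utility(s, action) for s, p in zip(STATES, P))


def gap_weighted_proposal():
    """Heuristic proposal: probability times utility gap, then normalized."""
    gaps = [abs(utility(s, "risky") - utility(s, "safe")) for s in STATES]
    raw = [p * g for p, g in zip(P, gaps)]
    total = sum(raw)
    return [r / total for r in raw]


def select_action(proposal, n_samples, rng):
    """Estimate each action's expected utility by importance sampling."""
    samples = rng.choices(STATES, weights=proposal, k=n_samples)
    estimates = {}
    for action in ACTIONS:
        # Weight each sample by p(s) / q(s) to correct for the proposal.
        total = sum(P[s] / proposal[s] * utility(s, action) for s in samples)
        estimates[action] = total / n_samples
    best = max(ACTIONS, key=lambda a: estimates[a])
    return best, estimates


rng = random.Random(0)
print(select_action(P, 20, rng))                        # loss-insensitive: samples from P
print(select_action(gap_weighted_proposal(), 20, rng))  # samples where utilities differ
```

With only 20 samples drawn from `P` itself, the rare catastrophic state is often never seen, so the estimate can favor the wrong action; the gap-weighted proposal draws that state frequently and corrects for it through the importance weights.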
Multi-tensor Completion with Common Structures
Li, Chao (Harbin Engineering University) | Zhao, Qibin (Riken) | Li, Junhua (Riken) | Cichocki, Andrzej (Riken) | Guo, Lili (Harbin Engineering University)
In multi-data learning, it is usually assumed that common latent factors exist among multiple datasets, but this assumption may lead to deteriorated performance when the datasets are heterogeneous and unbalanced. In this paper, we propose a novel common structure for multi-data learning. Instead of common latent factors, we assume that the datasets share a Common Adjacency Graph (CAG) structure, which is more robust to heterogeneity and imbalance among datasets. Furthermore, we utilize the CAG structure to develop a new method for multi-tensor completion, which exploits the common structure in the datasets to improve completion performance. Numerical results demonstrate that the proposed method not only outperforms state-of-the-art methods for video in-painting, but can also recover missing data well even in cases where conventional methods are not applicable.
Support Consistency of Direct Sparse-Change Learning in Markov Networks
Liu, Song (Tokyo Institute of Technology, Japan) | Suzuki, Taiji (Tokyo Institute of Technology, Japan) | Sugiyama, Masashi (University of Tokyo, Japan)
We study the problem of learning sparse structure changes between two Markov networks P and Q. Rather than fitting two Markov networks separately to two sets of data and figuring out their differences, a recent work proposed to learn changes directly via estimating the ratio between two Markov network models. Such a direct approach was demonstrated to perform excellently in experiments, although its theoretical properties remained unexplored. In this paper, we give sufficient conditions for successful change detection with respect to the sample sizes n_p and n_q, the dimension of data m, and the number of changed edges d.
Touchless Telerobotic Surgery - Is It Possible at All?
Zhou, Tian (Purdue University) | Cabrera, Maria Eugenia (Purdue University) | Wachs, Juan Pablo (Purdue University)
Teleoperated robot-assisted surgery (RAS) is becoming more popular in certain types of surgical procedures due to its dexterity, precision, high-resolution, accurate motion planning and execution capabilities, compared to traditional minimally invasive surgery which relies on hindered laparoscopic control. The most widely adopted system based on this paradigm is the daVinci robot (2014), in which the surgeon manipulates joysticks in a master console using 3D imaging for guidance. Then robotic arms mimic the surgeon's movements on the patient's side.
Using Frame Semantics for Knowledge Extraction from Twitter
Søgaard, Anders (University of Copenhagen) | Plank, Barbara (University of Copenhagen) | Alonso, Hector Martinez (University of Copenhagen)
Knowledge bases have the potential to advance artificial intelligence, but often suffer from recall problems, i.e., lack of knowledge of new entities and relations. In contrast, social media such as Twitter provide an abundance of data in a timely manner: information spreads at an incredible pace and is posted long before it makes it into more commonly used resources for knowledge extraction. In this paper we address the question of whether we can exploit social media to extract new facts, which may at first seem like finding needles in haystacks. We collect tweets about 60 entities in Freebase and compare four methods to extract binary relation candidates, based on syntactic and semantic parsing and a simple mechanism for factuality scoring. The extracted facts are manually evaluated in terms of their correctness and relevance for search. We show that moving from bottom-up syntactic or semantic dependency parsing formalisms to top-down frame-semantic processing improves the robustness of knowledge extraction, producing more intelligible fact candidates of better quality. In order to evaluate the quality of frame-semantic parsing on Twitter intrinsically, we make a multiply frame-annotated dataset of tweets publicly available.
Language Independent Feature Extractor
Jeong, Young-Seob (Korea Advanced Institute of Science and Technology (KAIST)) | Choi, Ho-Jin (Korea Advanced Institute of Science and Technology (KAIST))
We propose a new customizable tool, Language Independent Feature Extractor (LIFE), which models the inherent patterns of any language and extracts relevant features of the language. There are two contributions of this work: (1) no labeled data is necessary to train LIFE (it works when a sufficient number of unlabeled documents are given), and (2) LIFE is designed to be applicable to any language. We demonstrate the usefulness of LIFE through experimental results on time information extraction.
Generalized Singular Value Thresholding
Lu, Canyi (National University of Singapore) | Zhu, Changbo (National University of Singapore) | Xu, Chunyan (Huazhong University of Science and Technology) | Yan, Shuicheng (National University of Singapore) | Lin, Zhouchen (Peking University)
This work studies the Generalized Singular Value Thresholding (GSVT) operator associated with a nonconvex function g defined on the singular values of X. We prove that GSVT can be obtained by applying the proximal operator of g, Prox_g(·), to the singular values, since Prox_g(·) is monotone when g is lower bounded. If the nonconvex g satisfies some conditions (many popular nonconvex surrogate functions of the l0-norm, e.g., the lp-norm with 0 < p < 1, are special cases), a general solver to find Prox_g(b) is proposed for any b ≥ 0. GSVT greatly generalizes the known Singular Value Thresholding (SVT), which is a basic subroutine in many convex low-rank minimization methods. We are able to solve the nonconvex low-rank minimization problem by using GSVT in place of SVT.
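As a rough illustration of the thresholding step, the sketch below approximates the scalar proximal operator, i.e. the minimizer of g(x) + (x - b)^2 / 2, by brute-force grid search and applies it to a list of singular values. This is not the solver proposed in the paper: the grid search, the function names, and the restriction of the search to the interval [0, b] (which assumes g is nondecreasing on x ≥ 0) are all assumptions made for the example. The singular values are taken as given, so no SVD is computed here.

```python
def prox_grid(g, b, steps=10000):
    """Approximate argmin over 0 <= x <= b of g(x) + 0.5 * (x - b)**2.

    Brute-force grid search; assumes b >= 0 and g nondecreasing on x >= 0,
    so that a minimizer lies in [0, b].
    """
    best_x, best_val = 0.0, g(0.0) + 0.5 * b * b
    for i in range(1, steps + 1):
        x = b * i / steps
        val = g(x) + 0.5 * (x - b) ** 2
        if val < best_val:
            best_x, best_val = x, val
    return best_x


def soft_threshold(b, lam):
    """Closed-form proximal operator of lam * |x| for b >= 0 (the SVT case)."""
    return max(b - lam, 0.0)


def threshold_singular_values(sigmas, g):
    """Apply the scalar proximal operator to each singular value."""
    return [prox_grid(g, s) for s in sigmas]


lam = 0.5
l1_penalty = lambda x: lam * abs(x)      # convex case, recovers soft thresholding
lp_penalty = lambda x: lam * x ** 0.5    # nonconvex lp penalty with p = 0.5

sigmas = [3.0, 1.0, 0.1]                 # example singular values, sorted
print(threshold_singular_values(sigmas, l1_penalty))
print(threshold_singular_values(sigmas, lp_penalty))
```

For the convex l1 penalty the grid search should agree with `soft_threshold` up to the grid resolution, and in both cases the thresholded values keep the ordering of the input singular values, in line with the monotonicity property stated in the abstract.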
Absent Multiple Kernel Learning
Liu, Xinwang (National University of Defense Technology) | Wang, Lei (University of Wollongong) | Yin, Jianping (National University of Defense Technology) | Dou, Yong (National University of Defense Technology) | Zhang, Jian (University of Technology Sydney)
Multiple kernel learning (MKL) optimally combines the multiple channels of each sample to improve classification performance. However, existing MKL algorithms cannot effectively handle the situation where some channels are missing, which is common in practical applications. This paper proposes an absent MKL (AMKL) algorithm to address this issue. Unlike existing approaches, in which missing channels are first imputed and a standard MKL algorithm is then deployed on the imputed data, our algorithm directly classifies each sample with its observed channels. Specifically, we define a margin for each sample in its own relevant space, which corresponds to the observed channels of that sample. The proposed AMKL algorithm then maximizes the minimum of all sample-based margins, which leads to a difficult optimization problem. We show that this problem can be reformulated as a convex one by applying the representer theorem, so that it can be readily solved via existing convex optimization packages. Extensive experiments are conducted on five MKL benchmark data sets to compare the proposed algorithm with existing imputation-based methods. Our algorithm achieves superior performance, and the improvement becomes more significant as the missing ratio increases.
Learning Word Vectors Efficiently Using Shared Representations and Document Representations
Luo, Qun (Beijing University of Posts and Telecommunications) | Xu, Weiran (Beijing University of Posts and Telecommunications)
We propose improved word embedding models based on the vLBL and ivLBL models by sharing representations between context and target words and by using document representations. Our proposed models are much simpler, with almost half as many parameters as the state-of-the-art methods. We achieve better results on the word analogy task than the best ones previously reported, using significantly less training data and computing time.