Grammars & Parsing
Unsupervised Structure Learning of Stochastic And-Or Grammars
Tu, Kewei, Pavlovskaia, Maria, Zhu, Song-Chun
Stochastic And-Or grammars compactly represent both compositionality and reconfigurability and have been used to model different types of data such as images and events. We present a unified formalization of stochastic And-Or grammars that is agnostic to the type of the data being modeled, and propose an unsupervised approach to learning the structures as well as the parameters of such grammars. Starting from a trivial initial grammar, our approach iteratively induces compositions and reconfigurations in a unified manner and optimizes the posterior probability of the grammar. In our empirical evaluation, we applied our approach to learning event grammars and image grammars and achieved comparable or better performance than previous approaches.
Description Logics based Formalization of Wh-Queries
Dasgupta, Sourish, KaPatel, Rupali, Padia, Ankur, Shah, Kushal
The problem of Natural Language Query Formalization (NLQF) is to translate a given user query in natural language (NL) into a formal language () so that the semantic interpretation has equivalence with the NL interpretation. Formalization of NL queries enables logic based reasoning during information retrieval, database query, question-answering, etc. Formalization also helps in Web query normalization and indexing, query intent analysis, etc. In this paper we are proposing a Description Logics based formal methodology for wh-query intent (also called desire) identification and corresponding formal translation. We evaluated the scalability of our proposed formalism using Microsoft Encarta 98 query dataset and OWLS TC v.4.0 dataset.
Viterbi training in PRISM
Sato, Taisuke, Kubota, Keiichi
VT (Viterbi training), or hard EM, is an efficient way of parameter learning for probabilistic models with hidden variables. Given an observation $y$, it searches for a state of hidden variables $x$ that maximizes $p(x,y \mid \theta)$ by coordinate ascent on parameters $\theta$ and $x$. In this paper we introduce VT to PRISM, a logic-based probabilistic modeling system for generative models. VT improves PRISM in three ways. First VT in PRISM converges faster than EM in PRISM due to the VT's termination condition. Second, parameters learned by VT often show good prediction performance compared to those learned by EM. We conducted two parsing experiments with probabilistic grammars while learning parameters by a variety of inference methods, i.e.\ VT, EM, MAP and VB. The result is that VT achieved the best parsing accuracy among them in both experiments. Also we conducted a similar experiment for classification tasks where a hidden variable is not a prediction target unlike probabilistic grammars. We found that in such a case VT does not necessarily yield superior performance. Third since VT always deals with a single probability of a single explanation, Viterbi explanation, the exclusiveness condition that is imposed on PRISM programs is no more required if we learn parameters by VT. Last but not least we can say that as VT in PRISM is general and applicable to any PRISM program, it largely reduces the need for the user to develop a specific VT algorithm for a specific model. Furthermore since VT in PRISM can be used just by setting a PRISM flag appropriately, it makes VT easily accessible to (probabilistic) logic programmers. To appear in Theory and Practice of Logic Programming (TPLP).
Unsupervised Sub-tree Alignment for Tree-to-Tree Translation
This article presents a probabilistic sub-tree alignment model and its application to tree-to-tree machine translation. Unlike previous work, we do not resort to surface heuristics or expensive annotated data, but instead derive an unsupervised model to infer the syntactic correspondence between two languages. More importantly, the developed model is syntactically-motivated and does not rely on word alignments. As a by-product, our model outputs a sub-tree alignment matrix encoding a large number of diverse alignments between syntactic structures, from which machine translation systems can efficiently extract translation rules that are often filtered out due to the errors in 1-best alignment. Experimental results show that the proposed approach outperforms three state-of-the-art baseline approaches in both alignment accuracy and grammar quality. When applied to machine translation, our approach yields a +1.0 BLEU improvement and a -0.9 TER reduction on the NIST machine translation evaluation corpora. With tree binarization and fuzzy decoding, it even outperforms a state-of-the-art hierarchical phrase-based system.
A Global Model for Concept-to-Text Generation
Concept-to-text generation refers to the task of automatically producing textual output from non-linguistic input. We present a joint model that captures content selection ("what to say") and surface realization ("how to say") in an unsupervised domain-independent fashion. Rather than breaking up the generation process into a sequence of local decisions, we define a probabilistic context-free grammar that globally describes the inherent structure of the input (a corpus of database records and text describing some of them). We recast generation as the task of finding the best derivation tree for a set of database records and describe an algorithm for decoding in this framework that allows to intersect the grammar with additional information capturing fluency and syntactic well-formedness constraints. Experimental evaluation on several domains achieves results competitive with state-of-the-art systems that use domain specific constraints, explicit feature engineering or labeled data.
Learning Lambek grammars from proof frames
Bonato, Roberto, Retorรฉ, Christian
In addition to their limpid interface with semantics, categorial grammars enjoy another important property: learnability. This was first noticed by Buskowsky and Penn and further studied by Kanazawa, for Bar-Hillel categorial grammars. What about Lambek categorial grammars? In a previous paper we showed that product free Lambek grammars where learnable from structured sentences, the structures being incomplete natural deductions. These grammars were shown to be unlearnable from strings by Foret and Le Nir. In the present paper we show that Lambek grammars, possibly with product, are learnable from proof frames that are incomplete proof nets. After a short reminder on grammatical inference \`a la Gold, we provide an algorithm that learns Lambek grammars with product from proof frames and we prove its convergence. We do so for 1-valued also known as rigid Lambek grammars with product, since standard techniques can extend our result to $k$-valued grammars. Because of the correspondence between cut-free proof nets and normal natural deductions, our initial result on product free Lambek grammars can be recovered. We are sad to dedicate the present paper to Philippe Darondeau, with whom we started to study such questions in Rennes at the beginning of the millennium, and who passed away prematurely. We are glad to dedicate the present paper to Jim Lambek for his 90 birthday: he is the living proof that research is an eternal learning process.
Statistical Parsing with Probabilistic Symbol-Refined Tree Substitution Grammars
Shindo, Hiroyuki (NTT Communication Science Laboratories) | Miyao, Yusuke (National Institute of Informatics) | Fujino, Akinori (NTT Communication Science Laboratories) | Nagata, Masaaki (NTT Communication Science Laboratories)
We present probabilistic Symbol-Refined Tree Substitution Grammars (SR-TSG) for statistical parsing of natural language sentences. An SR-TSG is an extension of the conventional TSG model where each nonterminal symbol can be refined (subcategorized) to fit the training data. Our probabilistic model is consistent based on the hierarchical Pitman-Yor Process to encode backoff smoothing from a fine-grained SR-TSG to simpler CFG rules, thus all grammar rules can be learned from training data in a fully automatic fashion. Our SR-TSG parser achieves the state-of-the-art performance on the Wall Street Journal (WSJ) English Penn Treebank data.
Learning Visual Symbols for Parsing Human Poses in Images
Wang, Fang (NICTA,ย Nanjing University of Science and Technology) | Li, Yi (NICTA)
Parsing human poses in images is fundamental in extracting critical visual information for artificial intelligent agents. Our goal is to learn self-contained body part representations from images, which we call visual symbols, and their symbol-wise geometric contexts in this parsing process. Each symbol is individually learned by categorizing visual features leveraged by geometric information. In the categorization, we use Latent Support Vector Machine followed by an efficient cross validation procedure. Then, these symbols naturally define geometric contexts of body parts in a fine granularity for effective inference. When the structure of the compositional parts is a tree, we derive an efficient approach to estimating human poses in images. Experiments on two large datasets suggest our approach outperforms state of the art methods.
Efficient Latent Structural Perceptron with Hybrid Trees for Semantic Parsing
Zhou, Junsheng (Nanjing Normal University) | Xu, Juhong (Nanjing Normal University) | Qu, Weiguang (Nanjing Normal University)
Discriminative structured prediction models have been widely used in many natural language processing tasks, but it is challenging to apply the methods to semantic parsing. In this paper, by introducing hybrid tree as a latent structure variable to close the gap between the input sentences and output representations, we formulate semantic parsing as a structured prediction problem, based on the latent variable perceptron model incorporated with a tree edit-distance loss as optimization criterion. The proposed approach maintains the advantage of a discriminative model in accommodating flexible combination of features and naturally incorporates an efficient decoding algorithm in learning and inference. Furthermore, in order to enhance the efficiency and accuracy of inference, we design an effective approach based on vector space model to extract a smaller subset of relevant MR productions for test examples. Experimental results on publicly available corpus show that our approach significantly outperforms previous systems.
Combine Constituent and Dependency Parsing via Reranking
Ren, Xiaona (City University of Hong Kong) | Chen, Xiao (City University of Hong Kong and SOGOU-INC) | Kit, Chunyu (City University of Hong Kong)
This paper presents a reranking approach to combining constituent and dependency parsing, aimed at improving parsing performance on both sides. Most previous combination methods rely on complicated joint decoding to integrate graph- and transition-based dependency models. Instead, our approach makes use of a high-performance probabilistic context free grammar (PCFG) model to output k-best candidate constituent trees, and then a dependency parsing model to rerank the trees by their scores from both models, so as to get the most probable parse. Experimental results show that this reranking approach achieves the highest accuracy of constituent and dependency parsing on Chinese treebank (CTB5.1) and a comparable performance to the state of the art on English treebank (WSJ).