AITopics | Grammars & Parsing

This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. WOLFIE is part of an integrated system that learns to transform sentences into representations such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by WOLFIE are compared to those acquired by a similar system, with results favorable to WOLFIE. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.

corpus, lexicon, representation, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1063

AI Access Foundation

10320

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
(20 more...)

Genre: Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Convolution Kernels for Natural Language

Collins, Michael, Duffy, Nigel

Neural Information Processing SystemsDec-31-2002

We describe the application of kernel methods to Natural Language Processing (NLP) problems. In many NLP tasks the objects being modeled are strings, trees, graphs or other discrete structures which require some mechanism to convert them into feature vectors. We describe kernels for various natural language structures, allowing rich, high dimensional representations of these structures. We show how a kernel over trees can be applied to parsing using the voted perceptron algorithm, and we give experimental results on the ATIS corpus of parse trees.

algorithm, convolution kernel, kernel, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Grammatical Bigrams

Paskin, Mark A.

Neural Information Processing SystemsDec-31-2002

Unsupervised learning algorithms have been derived for several statistical models of English grammar, but their computational complexity makes applying them to large data sets intractable. This paper presents a probabilistic model of English grammar that is much simpler than conventional models, but which admits an efficient EM training algorithm. The model is based upon grammatical bigrams, i.e., syntactic relationships between pairs of words. We present the results of experiments that quantify the representational adequacy of the grammatical bigram model, its ability to generalize from labelled data, and its ability to induce syntactic structure from large amounts of raw text. 1 Introduction One of the most significant challenges in learning grammars from raw text is keeping the computational complexity manageable. For example, the EM algorithm for the unsupervised training of Probabilistic Context-Free Grammars-known as the Inside-Outside algorithm-has been found in practice to be "computationally intractable for realistic problems" [1].

algorithm, grammatical bigram model, parse, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.28)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Natural Language Grammar Induction Using a Constituent-Context Model

Klein, Dan, Manning, Christopher D.

Neural Information Processing SystemsDec-31-2002

This paper presents a novel approach to the unsupervised learning of syntactic analyses of natural language text.

constituent, grammar, sequence, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Convolution Kernels for Natural Language

Collins, Michael, Duffy, Nigel

Neural Information Processing SystemsDec-31-2002

We describe the application of kernel methods to Natural Language Processing (NLP) problems. In many NLP tasks the objects being modeled are strings, trees, graphs or other discrete structures which require some mechanism to convert them into feature vectors. We describe kernels for various natural language structures, allowing rich, high dimensional representations of these structures. We show how a kernel over trees can be applied to parsing using the voted perceptron algorithm, and we give experimental results on the ATIS corpus of parse trees.

algorithm, convolution kernel, kernel, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Grammatical Bigrams

Paskin, Mark A.

Neural Information Processing SystemsDec-31-2002

Unsupervised learning algorithms have been derived for several statistical models of English grammar, but their computational complexity makes applying them to large data sets intractable. This paper presents a probabilistic model of English grammar that is much simpler than conventional models, but which admits an efficient EM training algorithm. The model is based upon grammatical bigrams, i.e., syntactic relationships between pairs of words. We present the results of experiments that quantify the representational adequacy of the grammatical bigram model, its ability to generalize from labelled data, and its ability to induce syntactic structure from large amounts of raw text. 1 Introduction One of the most significant challenges in learning grammars from raw text is keeping the computational complexity manageable. For example, the EM algorithm for the unsupervised training of Probabilistic Context-Free Grammars-known as the Inside-Outside algorithm-has been found in practice to be "computationally intractable for realistic problems" [1].

algorithm, grammatical bigram model, parse, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.28)
North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Natural Language Grammar Induction Using a Constituent-Context Model

Klein, Dan, Manning, Christopher D.

Neural Information Processing SystemsDec-31-2002

This paper presents a novel approach to the unsupervised learning of syntactic analyses of natural language text.

constituent, grammar, sequence, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Convolution Kernels for Natural Language

Collins, Michael, Duffy, Nigel

Neural Information Processing SystemsDec-31-2002

We describe the application of kernel methods to Natural Language Processing (NLP)problems. In many NLP tasks the objects being modeled are strings, trees, graphs or other discrete structures which require some mechanism to convert them into feature vectors. We describe kernels for various natural language structures, allowing rich, high dimensional representations ofthese structures. We show how a kernel over trees can be applied to parsing using the voted perceptron algorithm, and we give experimental results on the ATIS corpus of parse trees.

algorithm, convolution kernel, kernel, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Grammatical Bigrams

Paskin, Mark A.

Neural Information Processing SystemsDec-31-2002

Unsupervised learning algorithms have been derived for several statistical modelsof English grammar, but their computational complexity makesapplying them to large data sets intractable. This paper presents a probabilistic model of English grammar that is much simpler than conventional models, but which admits an efficient EMtraining algorithm. The model is based upon grammatical bigrams, i.e., syntactic relationships between pairs of words. We present the results of experiments that quantify the representational adequacyof the grammatical bigram model, its ability to generalize from labelled data, and its ability to induce syntactic structure from large amounts of raw text. 1 Introduction One of the most significant challenges in learning grammars from raw text is keeping thecomputational complexity manageable. For example, the EM algorithm for the unsupervised training of Probabilistic Context-Free Grammars-known as the Inside-Outside algorithm-has been found in practice to be "computationally intractable for realistic problems" [1].

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: