AITopics | Grammars & Parsing

Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced.

relation, wikipedia, wikipedia article, (17 more...)

arXiv.org Artificial Intelligence

0809.4530

Country:

North America > United States > Texas (0.14)
North America > Canada > Ontario > Middlesex County > London (0.14)
Oceania > New Zealand > North Island > Waikato (0.04)
(31 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Media > Film (0.92)
Leisure & Entertainment > Sports (0.92)
(5 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(7 more...)

Sentence Compression as Tree Transduction

Cohn, T. A., Lapata, M.

Journal of Artificial Intelligence Research, Apr-24-2009

This paper presents a tree-to-tree transduction method for sentence compression. Our model is based on synchronous tree substitution grammar, a formalism that allows local distortion of the tree topology and can thus naturally capture structural mismatches. We describe an algorithm for decoding in this framework and show how the model can be trained discriminatively within a large margin framework. Experimental results on sentence compression bring significant improvements over a state-of-the-art model.

compression, derivation, source tree, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2655

AI Access Foundation

10600

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(18 more...)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Identification of Pleonastic It Using the Web

Li, Y., Musilek, P., Reformat, M., Wyard-Scott, L.

Journal of Artificial Intelligence Research, Mar-27-2009

In a significant minority of cases, certain pronouns, especially the pronoun it, can be used without referring to any specific entity. This phenomenon of pleonastic pronoun usage poses serious problems for systems aiming at even a shallow understanding of natural language texts. In this paper, a novel approach is proposed to identify such uses of it: the extrapositional cases are identified using a series of queries against the web, and the cleft cases are identified using a simple set of syntactic rules. The system is evaluated with four sets of news articles containing 679 extrapositional cases as well as 78 cleft constructs. The identification results are comparable to those obtained by human efforts.

category, extraposition, query, (17 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.2622

AI Access Foundation

10593

Country:

Europe > United Kingdom > England > Greater London > London (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
(14 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Banking & Finance (1.00)
Law (0.67)
Government > Regional Government (0.45)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems

Chen, Zhe, Wen, Dunwei

arXiv.org Artificial Intelligence, Mar-10-2009

With the development of Natural Language Processing (NLP), more and more systems want to adopt NLP in the User Interface Module to process user input, in order to communicate with the user in a natural way. However, this raises a speed problem: if the NLP module cannot process sentences within a tolerable time delay, users will never use the system. As a result, systems with strict processing-time requirements, such as dialogue systems, web search systems, automatic customer service systems, and especially real-time systems, have to abandon the NLP module in order to get a faster system response. This paper aims to solve the speed problem. First, the construction of a syntactic parser based on corpus machine learning and a statistical model is introduced, and then a speed problem analysis is performed on the parser and its algorithms. Based on the analysis, two accelerating methods, Compressed POS Set and Syntactic Patterns Pruning, are proposed, which can effectively improve the time efficiency of parsing in the NLP module. To evaluate different parameters in the accelerating algorithms, two new factors, PT and RT, are introduced and explained in detail. Experiments are also completed to prove and test these methods, which will surely contribute to the application of NLP.

artificial intelligence, natural language, parser, (18 more...)

arXiv.org Artificial Intelligence

0903.0174

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

A Bayesian LDA-based model for semi-supervised part-of-speech tagging

Toutanova, Kristina, Johnson, Mark

Neural Information Processing Systems, Dec-31-2008

We present a novel Bayesian model for semi-supervised part-of-speech tagging. Our model extends the Latent Dirichlet Allocation model and incorporates the intuition that words' distributions over tags, p(t|w), are sparse. In addition we introduce a model for determining the set of possible tags of a word which captures important dependencies in the ambiguity classes of words. Our model outperforms the best previously proposed model for this task on a standard dataset.

ambiguity class, ambiguity class model, word type, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Discriminative Log-Linear Grammars with Latent Variables

Petrov, Slav, Klein, Dan

Neural Information Processing Systems, Dec-31-2008

We demonstrate that log-linear grammars with latent variables can be practically trained using discriminative methods. Central to efficient discriminative training is a hierarchical pruning procedure which allows feature expectations to be efficiently approximated in a gradient-based procedure.

discriminative grammar, grammar, subcategory, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Rapid Inference on a Novel AND/OR graph for Object Detection, Segmentation and Parsing

Chen, Yuanhao, Zhu, Long, Lin, Chenxi, Zhang, Hongjiang, Yuille, Alan L.

Neural Information Processing Systems, Dec-31-2008

In this paper we formulate a novel AND/OR graph representation capable of describing the different configurations of deformable articulated objects such as horses. The representation makes use of the summarization principle so that lower level nodes in the graph only pass on summary statistics to the higher level nodes. The probability distributions are invariant to position, orientation, and scale. We develop a novel inference algorithm that combines a bottom-up process for proposing configurations for horses with a top-down process for refining and validating these proposals. The strategy of surround suppression is applied to ensure that the inference time is polynomial in the size of input data. The algorithm was applied to the tasks of detecting, segmenting and parsing horses. We demonstrate that the algorithm is fast and comparable with state-of-the-art approaches.

configuration, node, proposal, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

