
Learning in the Lexical-Grammatical Interface

AAAI Conferences

Children readily discover word boundaries and, in tandem, use those words to build higher-level structures. Current research nevertheless treats lexical acquisition and grammar induction as two distinct tasks, and doing so has led to unreasonable assumptions: existing work in grammar induction presupposes a perfectly segmented, noise-free lexicon, while lexical learning approaches largely ignore how the lexicon is used. This paper combines both tasks in a novel framework for bootstrapping lexical acquisition and grammar induction.


Meaning to Learn: Bootstrapping Semantics to Infer Syntax

AAAI Conferences

Context-free grammars cannot be identified in the limit from positive examples (Gold 1967), yet natural language grammars are more powerful than context-free grammars and humans learn them with remarkable ease from positive examples (Marcus 1993). Identifiability results for formal languages ignore a potentially powerful source of information available to learners of natural languages, namely, meanings. This paper explores the learnability of syntax (i.e.


On the Relationship Between Lexical Semantics and Syntax for the Inference of Context-Free Grammars

AAAI Conferences

Context-free grammars cannot be identified in the limit from positive examples (Gold 1967), yet natural language grammars are more powerful than context-free grammars and humans learn them with remarkable ease from positive examples (Marcus 1993). Identifiability results for formal languages ignore a potentially powerful source of information available to learners of natural languages, namely, meanings. This paper explores the learnability of syntax (i.e.


Grammatical Semantics and Multilinguality: What Stands Behind the Lexicon? (SS93-02-008)

AAAI Conferences

Position paper: 'Grammatical semantics and multilinguality: what stands behind the lexicon?' John A. Bateman, Project KOMET, GMD/IPSI, Darmstadt, Germany. There has in recent years been a steady increase in the role given to the lexicon in computational linguistics. Accordingly, there are now also many efforts to uncover appropriate organizations of lexical information, including proposals for taxonomies of semantic organizational primitives/features, 'ontology' design, etc. This very necessary activity seems to me, however, to be partly compromised by a second trend also resulting from the attention given to the lexicon: the move to lexicalize grammars so that the 'grammatical' component becomes minimal and grammatical properties are 'projected' from those of their lexical components. Reducing the role of grammatical considerations in this way removes a strong source of information about useful lexical organization.


Identifying Hierarchical Structure in Sequences: A linear-time algorithm

Journal of Artificial Intelligence Research

SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence, which offers insights into its lexical structure. The algorithm is driven by two constraints that reduce the size of the grammar and produce structure as a by-product. SEQUITUR breaks new ground by operating incrementally. Moreover, the method's simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 50,000 symbols per second and has been applied to an extensive range of real-world sequences.
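
The abstract above describes how SEQUITUR builds a hierarchy by turning repeated adjacent pairs of symbols (digrams) into rules. The Python sketch below illustrates that idea in a deliberately simplified, offline form: it repeatedly replaces any digram that occurs more than once with a fresh nonterminal until no digram repeats. It is not the SEQUITUR algorithm itself: the real method works incrementally, also enforces a rule-utility constraint (rules used only once are removed), and runs in linear time, whereas this sketch is quadratic. The function name induce_grammar and the rule names R0, R1, ... are illustrative choices, not part of the original work.

    # Simplified, offline illustration of digram replacement (not the
    # incremental, linear-time SEQUITUR algorithm described above).
    def induce_grammar(sequence):
        rules = {}            # nonterminal -> pair of symbols it expands to
        seq = list(sequence)  # working form of the top-level rule
        next_id = 0
        while True:
            # Count every adjacent pair (digram) in the current sequence.
            counts = {}
            for i in range(len(seq) - 1):
                pair = (seq[i], seq[i + 1])
                counts[pair] = counts.get(pair, 0) + 1
            # Stop once no digram occurs more than once.
            repeated = next((p for p, c in counts.items() if c > 1), None)
            if repeated is None:
                break
            # Create a rule for the repeated digram and rewrite the sequence
            # left to right, skipping overlapping occurrences.
            nonterminal = "R%d" % next_id
            next_id += 1
            rules[nonterminal] = repeated
            new_seq, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and (seq[i], seq[i + 1]) == repeated:
                    new_seq.append(nonterminal)
                    i += 2
                else:
                    new_seq.append(seq[i])
                    i += 1
            seq = new_seq
        return seq, rules

    # Example: "abcabcabc" yields a top-level rule plus nested rules such as
    # R0 -> a b, R1 -> R0 c, R2 -> R1 R1, exposing the repeated "abc" phrase.
    top, rules = induce_grammar("abcabcabc")
    print("S ->", " ".join(top))
    for lhs, rhs in rules.items():
        print(lhs, "->", " ".join(rhs))

Note that this sketch can leave rules that end up being used only once (e.g. on an input like "aaa"); SEQUITUR's rule-utility constraint is precisely what removes such rules in the full algorithm.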