
EL: A formal, yet natural, comprehensive knowledge representation

Hwang, C.H. | Schubert, L. K.

Classics

We describe a comprehensive framework for narrative understanding based on Episodic Logic (EL). This situational logic was developed and implemented as a semantic representation and commonsense knowledge representation that would serve the full range of interpretive and inferential needs of general NLU. The most distinctive feature of EL is its natural language-like expressiveness. It allows for generalized quantifiers, lambda abstraction, sentence and predicate modifiers, sentence and predicate reification, intensional predicates (corresponding to wanting, believing, making, etc.), unreliable generalizations, and perhaps most importantly, explicit situational variables (denoting episodes, events, states of affairs, etc.) linked to arbitrary formulas that describe them. These allow episodes to be explicitly related in terms of part-whole, temporal and causal relations. Episodic logical form is easily computed from surface syntax and lends itself to effective inference.
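To make the episodic variables concrete, here is a rough illustration of an episodic logical form for "John kissed Mary", written in the style of the authors' published examples. The connective `**` (linking a formula to the episode it characterizes) and the utterance-time constant `Now1` reflect my reading of that notation, not quotations from this abstract:

```
(∃ e1: [e1 before Now1]       ; there is a past episode e1
   [[John kiss Mary] ** e1])  ; characterized by John's kissing Mary
```

The explicit episode variable `e1` is what lets further formulas relate this event to others temporally or causally.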




Automatically constructing a dictionary for information extraction tasks

Riloff, E.

Classics

Knowledge-based natural language processing systems have achieved good success with certain tasks but they are often criticized because they depend on a domain-specific dictionary that requires a great deal of manual knowledge engineering. This knowledge engineering bottleneck makes knowledge-based NLP systems impractical for real-world applications because they cannot be easily scaled up or ported to new domains. In response to this problem, we developed a system called AutoSlog that automatically builds a domain-specific dictionary of concepts for extracting information from text. Using AutoSlog, we constructed a dictionary for the domain of terrorist event descriptions in only 5 person-hours. We then compared the AutoSlog dictionary with a handcrafted dictionary that was built by two highly skilled graduate students and required approximately 1500 person-hours of effort. We evaluated the two dictionaries using two blind test sets of 100 texts each. Overall, the AutoSlog dictionary achieved 98% of the performance of the handcrafted dictionary. On the first test set, the AutoSlog dictionary obtained 96.3% of the performance of the handcrafted dictionary. On the second test set, the overall scores were virtually indistinguishable with the AutoSlog dictionary achieving 99.7% of the performance of the handcrafted dictionary.
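The core idea can be sketched in a few lines. This is a toy simplification of my own, not AutoSlog's actual heuristics: given a sentence and a targeted noun phrase, try a small set of syntactic patterns and emit a trigger pattern for the dictionary, with `<x>` marking the extraction slot.

```python
import re

def propose_entry(sentence, target):
    """Propose a concept-node trigger pattern for the target noun phrase,
    or None if no heuristic applies.  <x> marks the extraction slot."""
    # Heuristic 1: target is the subject of a passive verb,
    # e.g. "the embassy was bombed" -> "<x> was bombed"
    m = re.search(re.escape(target) + r"\s+(was|were)\s+(\w+ed)\b", sentence)
    if m:
        return f"<x> {m.group(1)} {m.group(2)}"
    # Heuristic 2: target is the direct object of an active verb,
    # e.g. "attacked the village" -> "attacked <x>"
    m = re.search(r"(\w+ed)\s+" + re.escape(target), sentence)
    if m:
        return f"{m.group(1)} <x>"
    return None

print(propose_entry("the embassy was bombed by terrorists", "the embassy"))
# -> "<x> was bombed"
```

A real system would use a parser rather than regexes, but the point stands: each annotated example costs seconds of pattern matching instead of hand-crafted engineering.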


Parsing English with a link grammar

Sleator, D. | Temperley, D.

Classics




Statistical Language Learning

Charniak, E.

Classics

Eugene Charniak breaks new ground in artificial intelligence research by presenting statistical language processing from an artificial intelligence point of view in a text for researchers and scientists with a traditional computer science background. New, exacting empirical methods are needed to break the deadlock in such areas of artificial intelligence as robotics, knowledge representation, machine learning, machine translation, and natural language processing (NLP). It is time, Charniak observes, to switch paradigms. This text introduces statistical language processing techniques: word tagging, parsing with probabilistic context-free grammars, grammar induction, syntactic disambiguation, semantic word classes, and word-sense disambiguation, along with the underlying mathematics and chapter exercises. Charniak points out that as a method of attacking NLP problems, the statistical approach has several advantages. It is grounded in real text and therefore promises to produce usable results, and it offers an obvious way to approach learning: "one simply gathers statistics."
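The "one simply gathers statistics" idea is concrete enough to sketch (a toy example of my own, not taken from the book): a unigram tagger counts word/tag pairs in a tagged corpus and assigns each word its most frequent tag.

```python
from collections import Counter, defaultdict

# Toy tagged "corpus"; in practice this would come from a large
# hand-tagged text such as a treebank.
training = [("the", "DET"), ("dog", "N"), ("barks", "V"),
            ("the", "DET"), ("can", "N"), ("can", "V"), ("can", "N")]

# Gather statistics: count how often each word carries each tag.
counts = defaultdict(Counter)
for word, t in training:
    counts[word][t] += 1

def tag(word):
    """Assign the word its most frequent training tag."""
    if word in counts:
        return counts[word].most_common(1)[0][0]
    return "N"  # arbitrary fallback for unseen words

print([tag(w) for w in ["the", "can", "barks"]])  # ['DET', 'N', 'V']
```

Even this crude model tags ambiguous words like "can" with their most likely part of speech; the book's HMM taggers refine it by also conditioning on the neighboring tags.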



Finite-state approximation of phrase structure grammars

Pereira, F. | Wright, R. N.

Classics

Phrase-structure grammars are an effective representation for important syntactic and semantic aspects of natural languages, but are computationally too demanding for use as language models in real-time speech recognition. An algorithm is described that computes finite-state approximations for context-free grammars and equivalent augmented phrase-structure grammar formalisms. The approximation is exact for certain context-free grammars generating regular languages, including all left-linear and right-linear context-free grammars. The algorithm has been used to construct finite-state language models for limited-domain speech recognition tasks.
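The exactness claim for right-linear grammars has a simple intuition: a right-linear CFG is a finite automaton in disguise. A sketch of that correspondence (my own construction, not the paper's algorithm): read each rule A -> a B as an NFA transition from state A to state B on terminal a, and each plain rule A -> a as a transition into a final state F.

```python
# Right-linear rules encoded as NFA transitions: (state, terminal) -> next states.
# "F" is the final state; a plain rule A -> a becomes a transition A --a--> F.
rules = {
    ("S", "a"): {"S", "B"},   # S -> a S | a B
    ("B", "b"): {"B", "F"},   # B -> b B | b
}

def accepts(string, start="S"):
    """Simulate the NFA; this grammar generates a^m b^n with m, n >= 1."""
    states = {start}
    for ch in string:
        states = set().union(*(rules.get((q, ch), set()) for q in states))
    return "F" in states

print(accepts("aabb"), accepts("ba"))  # True False
```

For grammars that are not right-linear (e.g. with center embedding), no such exact encoding exists, which is where the paper's approximation comes in.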


Can logic programming execute as fast as imperative programming?

Van Roy, P. L.

Classics

The output is assembly code for the Berkeley Abstract Machine (BAM). Directives take effect starting from the next predicate read from the input. Clauses need not be contiguous in the input stream; however, the whole stream is read before compilation starts. This manual is organized into ten sections.