Classics


The Computational Linguistics of Biological Sequences

Classics (Collection 2)

Shortly after Watson and Crick's discovery of the structure of DNA, and at about the same time that the genetic code and the essential facts of gene expression were being elucidated, the field of linguistics was being similarly revolutionized by the work of Noam Chomsky [Chomsky, 1955, 1957, 1959, 1963, 1965]. Observing that a seemingly infinite variety of language was available to individual human beings based on clearly finite resources and experience, he proposed a formal representation of the rules or syntax of language, called generative grammar, that could provide finite--indeed, concise--characterizations of such infinite languages. Just as the breakthroughs in molecular biology in that era served to anchor genetic concepts in physical structures and opened up entirely novel experimental paradigms, so did Chomsky's insight serve to energize the field of linguistics, with putative correlates of cognitive processes that could for the first time be reasoned about 48 ARTIFICIAL INTELLIGENCE & MOLECULAR BIOLOGY While Chomsky and his followers built extensively upon this foundation in the field of linguistics, generative grammars were also soon integrated into the framework of the theory of computation, and in addition now form the basis for efforts of computational linguists to automate the processing and understanding of human language. Since it is quite commonly asserted that DNA is a richly-expressive language for specifying the structures and processes of life, also with the potential for a seemingly infinite variety, it is surprising that relatively little has been done to apply to biological sequences the extensive results and methods developed over the intervening decades in the field of formal language theory. While such an approach has been proposed [Brendel and Busse, 1984], most investigations along these lines have used grammar formalisms as tools for what are essentially information-theoretic studies [Ebeling and Jimenez-Montano, 1980; Jimenez-Montano, 1984], or have involved statistical analyses at the level of vocabularies (reflecting a more traditional notion of comparative linguistics) [Brendel et al., 1986; Pevzner et al., 1989a,b; Pietrokovski et al., 1990].


And-or graphs, theorem-proving graphs, and bi-directional search

Classics

And-or graphs and theorem-proving graphs determine the same kind of search space and differ only in the direction of search: from axioms to goals, in the case of theorem-proving graphs, and in the opposite direction, from goals to axioms, in the case of and-or graphs. We investigate the construction of a single general algorithm which covers unidirectional search both for and-or graphs and for theorem-proving graphs, bidirectional search for path-finding problems and search for a simplest solution as well as search for any solution. Indeed, many different search spaces can represent the same original problem, as in the case of resolution systems where different refinements of the resolution rule determine different search spaces for a single problem of demonstrating the unsatisfiability of a given set of clauses. In the tree representation of theorem-proving graphs (used in Machine Intelligence 5 (Kowalski 1969)), identical problems Tare generated at different times, working forwards from the initial set of axioms {D, E, F, G}.





QA4: A procedural calculus for intuitive reasoning

Classics

Abstract: This report presents a language, called QA4, designed to facilitate the construction of problem-solving systems used for robot planning, theorem proving, and automatic program synthesis and verification. Thus it provides many useful programming aids. More importantly, however, it provides a semantic framework for common sense reasoning about these problem domains. The interpreter for the language is extraordinarily general, and is therefore an adaptable tool for developing the specialized techniques of intuitive, symbolic reasoning used by the intelligent systems.


Learning and executing generalized robot plans

Classics

"In this paper we describe some major new additions to the STRIPS robot problem-solving system. The first addition is a process for generalizing a plan produced by STRIPS so that problem-specific constants appearing in the plan are replaced by problem-independent parameters.The generalized plan, stored in a convenient format called a triangle table, has two important functions. The more obvious function is as a single macro action that can be used by STRIPS—either in whole or in part—during the solution of a subsequent problem. Perhaps less obviously, the generalized plan also plays a central part in the process that monitors the real-world execution of a plan, and allows the robot to react "intelligently" to unexpected consequences of actions.We conclude with a discussion of experiments with the system on several example problems."Artificial Intelligence 3:251-288


Generating Semantic Descriptions from Drawings of Scenes with Shadows

Classics

The research reported here concerns the principles used to automatically generate three-dimensional representations from line drawings of scenes. The computer programs involved look at scenes which consist of polyhedra and which may contain shadows and various kinds of coincidentally aligned scene features. Each generated description includes information about edge shape (convex, concave, occluding, shadow, etc.), about the type of illumination for each region (illuminated, projected shadow, or oriented away from the light source), and about the spacial orientation of regions. The methods used are based on the labeling schemes of Huffman and Clowes; this research provides a considerable extension to their work and also gives theoretical explanations to the heuristic scene analysis work of Guzman, Winston, and others. A condensed version appears in Patrick Winston (ed.), The Psychology of Computer Vision, pp. 19{91, New York: McGraw-Hill, 1975.Direct link to PDF.MIT AI Lab Technical Report No AITR-271, November 1


Understanding natural language

Classics

This paper describes a computer system for understanding English. It is based on the belief that in modeling language understanding, we must deal in an integrated way with all of the aspects of language--syntax, semantics, and inference. It enters into a dialog with a person, responding to English sentences with actions and English replies, asking for clarification when its heuristic programs cannot understand a sentence through the use of syntactic, semantic, contextual, and physical knowledge. By developing special procedural representations for syntax, semantics, and inference, we gain flexibility and power.


The sharing of structure in theorem-proving programs

Classics

We introduce the concept of the value of an expression in a binding environment which we use to standardize clauses apart and share the structure of parents in representing the resolvent. Lists provide the most obvious and natural representation of literals because lists perfectly reflect function nesting structure. Therefore, we introduce the concepts of an expression, a binding environment, and the value of an expression in a binding environment. If VALI and VA L2 have no common instance, then the call will return false.