Grammars & Parsing
The Computational Linguistics of Biological Sequences
Shortly after Watson and Crick's discovery of the structure of DNA, and at about the same time that the genetic code and the essential facts of gene expression were being elucidated, the field of linguistics was being similarly revolutionized by the work of Noam Chomsky [Chomsky, 1955, 1957, 1959, 1963, 1965]. Observing that a seemingly infinite variety of language was available to individual human beings based on clearly finite resources and experience, he proposed a formal representation of the rules or syntax of language, called generative grammar, that could provide finite--indeed, concise--characterizations of such infinite languages. Just as the breakthroughs in molecular biology in that era served to anchor genetic concepts in physical structures and opened up entirely novel experimental paradigms, so did Chomsky's insight serve to energize the field of linguistics, with putative correlates of cognitive processes that could for the first time be reasoned about ...
INFERENTIAL MEMORY AS THE BASIS OF MACHINES WHICH UNDERSTAND NATURAL LANGUAGE
Participants in the search for intelligent machines frequently disagree on a basic question of strategy in their quest. On the one hand there are those who believe that the major obstacles can be overcome by reliance on the computer's infallible memory, electronic speed, and arithmetic capabilities ... This report takes the position that immediate, practical applications can derive from the former approach, but the major problems will be ... To mention a single example, the implementation of information retrieval techniques on present-day computers would be a large step forward, even though the techniques thus far considered have largely been conceptually trivial. Luhn (1958) has used a straightforward statistical procedure to extract key sentences from scientific articles, thus yielding useful abstracts of a sort. For even an unintelligent human does more than count frequencies or search for key words. The human displays intelligent features which are generally summed up by saying that he ...
The five-year ARPA-funded speech project that began at that time made understanding, rather than recognition, the primary research goal. It was felt that a system's ability to respond intelligently to speech was a more meaningful criterion for the evaluation of speech systems. In addition, it was believed that the speech signal was an impoverished source of information, and knowledge of the context of an utterance was essential for its successful recognition and interpretation. Speech-recognition systems based on dynamic programming, pattern-matching techniques have been developed for utterances that consist solely of isolated words chosen from a small vocabulary, and to a lesser extent, the same techniques have been extended to connected sequences of words (Rabiner and Levinson, 1981).

The transduction from speech to meaning must be mediated by a variety of components that utilize diverse knowledge sources (KSs) because the speech signal encodes, in a highly compressed and integrated fashion, many different types of information relevant to the recovery of meaning. This knowledge-based approach contrasts with that taken in whole-word template-matching systems; variability in the pronunciation of words in connected speech is no longer seen as a hindrance to pattern matching but rather as an important source of information, e.g., concerning the location of word boundaries (Church, 1983) or of contextually important (stressed) information in the utterance. Figure 1 illustrates one possi...
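The dynamic-programming template matching mentioned above for isolated words can be illustrated with a minimal sketch. It uses dynamic time warping over scalar feature sequences, which is an assumption made for brevity: real systems compare spectral feature vectors, and the names `dtw_distance`, `recognize`, and the toy templates are hypothetical, not taken from the systems cited.

```python
def dtw_distance(a, b):
    """Dynamic-time-warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: stretch a, stretch b, or advance both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the vocabulary word whose stored template is closest."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Toy whole-word templates (made-up values for illustration only).
TEMPLATES = {"yes": [1, 3, 4, 3, 1], "no": [5, 5, 2, 2]}
```

Calling `recognize([1, 1, 3, 4, 4, 3, 1], TEMPLATES)` selects `"yes"`, because the warping absorbs the slower pronunciation. A matcher of this kind treats such variability purely as something to be normalized away, which is the contrast the knowledge-based approach draws.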
Step 6 is a goal-assertion resolution that functions similarly to the goal-goal resolution above. The final synthesized program is:

f(x) = if x = NIL then 0 else car(x) + f(cdr(x)).

... the input, another algorithm might result. Thus one could break a into a[1], ..., a[length(a)/2] and a[length(a)/2 + 1], ..., a[length(a)] and find an algorithm that recursively calls f on both the first and second halves of its ...
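A minimal Python sketch of the two program shapes discussed here may help. It assumes the operators lost from the excerpt are `=` and `+`, so that f sums the elements of a list; that reading is an inference from the base case 0, not something the excerpt states. The name `f_halves` is hypothetical and stands for the alternative that recurses on both halves of the input.

```python
def f(x):
    """Linear recursion: f(x) = if x = NIL then 0 else car(x) + f(cdr(x))."""
    if not x:                  # x = NIL (empty list)
        return 0
    return x[0] + f(x[1:])     # car(x) + f(cdr(x)), with + assumed


def f_halves(a):
    """Alternative shape: split a in the middle and recurse on both halves."""
    if not a:
        return 0
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    # a[:mid] plays the role of a[1..length(a)/2],
    # a[mid:] the role of a[length(a)/2 + 1..length(a)].
    return f_halves(a[:mid]) + f_halves(a[mid:])
```

Both functions compute the same result on any list of numbers; they differ only in how the input is decomposed, which is the point of the remark that a different decomposition yields a different algorithm.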
Modeling a paranoid mind
Our descriptive vocabulary may still contain proper names as modifiers but the explanatory vocabulary now involves the impersonal qualities of an ...

In this article I propose to describe an area of artificial intelligence (AI) research that I and several colleagues have been engaged in for a number of years.
Strategies for Understanding Structured English
Psychological work on memory, in particular by Bartlett (1932), has led to the conclusion that people faced with a new situation use large amounts of highly structured knowledge acquired from previous experience. Bartlett used the word schema to refer to this phenomenon. Minsky (1975), in his famous paper, proposed the notion of a frame as a fundamental structure used in natural language understanding, as well as in scene analysis. I will use the former term in the rest of this chapter, in spite of its general connotation. The main thesis defended by Bartlett was that the phenomena of memorization and remembering are both constructive and selective. The hypothesis has more recently been revived by psychologists working on discourse structure (Collins, 1978; Bransford and Franks, 1971; Kintsch, 1976). Various experiments performed on subjects who were told stories and then asked to describe what they remembered showed that people not only forget facts but add some. Moreover, they are unable to distinguish between what they have actually heard and what they have inferred. People hearing a story make assumptions, which they might revise or refine as more information comes in, either confirmatory or contradictory. Making such assumptions entails building (or retrieving) models of the expected text contents. A corollary of this process is that if the story adequately fits the model people have in mind, the story will be understood more easily.

This chapter is based on a technical memo (HPP-79-25) from the Heuristic Programming Project, Department of Computer Science, Stanford University.