Industry
Combining Neural Networks and Context-Driven Search for Online, Printed Handwriting Recognition in the NEWTON
Yaeger, Larry S., Webb, Brandyn J., Lyon, Richard F.
While online handwriting recognition is an area of long-standing and ongoing research, the recent emergence of portable, pen-based computers has focused urgent attention on usable, practical solutions. We discuss a combination and improvement of classical methods to produce robust recognition of hand-printed English text for a recognizer shipping in new models of Apple Computer's NEWTON MESSAGEPAD and EMATE. Combining an artificial neural network (ANN) as a character classifier with a context-driven search over segmentation and word-recognition hypotheses provides an effective recognition system. Long-standing issues relative to training, generalization, segmentation, models of context, probabilistic formalisms, and so on, need to be resolved, however, to achieve excellent performance. We present a number of recent innovations in the application of ANNs as character classifiers for word recognition, including integrated multiple representations, normalized output error, negative training, stroke warping, frequency balancing, error emphasis, and quantized weights. User adaptation and extension to cursive recognition pose continuing challenges.
What Are Intelligence? And Why? 1996 AAAI Presidential Address
This article, derived from the 1996 Association for the Advancement of Artificial Intelligence Presidential Address, explores the notion of intelligence from a variety of perspectives and finds that it "are" many things. It has, for example, been interpreted in a variety of ways even within our own field, ranging from the logical view (intelligence as part of mathematical logic) to the psychological view (intelligence as an empirical phenomenon of the natural world) to a variety of others. One goal of this article is to go back to basics, reviewing the things that we, individually and collectively, have taken as given, in part because we have taken multiple different and sometimes inconsistent things for granted. I believe it will prove useful to expose the tacit assumptions, models, and metaphors that we carry around as a way of understanding both what we're about and why we sometimes seem to be at odds with one another. Intelligence are also many things in the sense that it is a product of evolution. Our physical bodies are in many ways overdetermined, unnecessarily complex, and inefficiently designed, that is, the predictable product of the blind search that is evolution. What's manifestly true of our anatomy is also likely true of our cognitive architecture. Natural intelligence is unlikely to be limited by principles of parsimony and is likely to be overdetermined, unnecessarily complex, and inefficiently designed. In this sense, intelligence are many things because it is composed of the many elements that have been thrown together over evolutionary timescales. I suggest that in the face of that, searching for minimalism and elegance may be a diversion, for it simply may not be there. Somewhat more crudely put: The human mind is a 400,000-year-old legacy application -- and you expected to find structured programming? I end with a number of speculations, suggesting that there are some niches in the design space of intelligences that are currently underexplored.
One example is the view that thinking is in part visual, and hence it might prove useful to develop representations and reasoning mechanisms that reason with diagrams (not just about them) and that take seriously their visual nature. I speculate as well that thinking may be a form of reliving, that re-acting out what we have experienced is one powerful way to think about and solve problems in the world. In this view, thinking is not simply the decontextualized manipulation of abstract symbols, powerful though that may be. Instead, some significant part of our thinking may be the reuse or simulation of our experiences in the environment. In keeping with this, I suggest that it may prove useful to marry the concreteness of reasoning in a model with the power that arises from reasoning abstractly.
MITA: An Information-Extraction Approach to the Analysis of Free-Form Text in Life Insurance Applications
Glasgow, Barry, Mandell, Alan, Binney, Dan, Ghemri, Lila, Fisher, David
MetLife processes over 260,000 life insurance applications a year. Underwriting of these applications is labor intensive. Automation is difficult because the applications include many free-form text fields. MetLife's intelligent text analyzer (MITA) uses the information-extraction technique of natural language processing to structure the extensive textual fields on a life insurance application. Knowledge engineering, with the help of underwriters as domain experts, was performed to elicit significant concepts for both medical and occupational textual fields. A corpus of 20,000 life insurance applications provided the syntactical and semantic patterns in which these underwriting concepts occur. These patterns, in conjunction with the concepts, formed the frameworks for information extraction. Extension of the information-extraction work developed by Wendy Lehnert was used to populate these frameworks with classes obtained from the systematized nomenclature of human and veterinary medicine and the Dictionary of Occupational Titles ontologies. These structured frameworks can then be analyzed by conventional knowledge-based systems. MITA is currently processing 20,000 life insurance applications a month. Eighty-nine percent of the textual fields processed by MITA exceed the established confidence-level threshold and are potentially available for further analysis by domain-specific analyzers.
CREWS_NS: Scheduling Train Crews in The Netherlands
Morgado, Ernesto M., Martins, Joao P.
We present a system, CREWS_NS, that is used in the long-term scheduling of drivers and guards for the Dutch Railways. CREWS_NS schedules the work of about 5000 people. CREWS_NS is built on top of CREWS, a scheduling tool for speeding the development of scheduling applications. CREWS heavily relies on the use of AI techniques and has been built as a white-box system, in the sense that the planner can perceive what is going on, can interact with the system by proposing alternatives or querying decisions, and can adapt the behavior of the system to changing circumstances. Scheduling can be done in automatic, semiautomatic, or manual mode. CREWS has mechanisms for dealing with the constant changes that occur in input data, can identify the consequences of the change, and guides the planner in accommodating the changes in the already built schedules (rescheduling).
An Intelligent System for Case Review and Risk Assessment in Social Services
This article reports on the development and implementation of DISXPERT, an intelligent rule-based system tool for referral of social security disability recipients to vocational rehabilitation services. The growing use of paraprofessionals as caseworkers responsible for assessment in the social services area provides fertile domain areas for new and innovative application of intelligent system technology. The main function of DISXPERT is to provide support to paraprofessional caseworkers in reaching unbiased and consistent assessment decisions regarding referral of clients to vocational rehabilitation services. The results after four years of use demonstrate that paraprofessionals using DISXPERT can make assessments in less time and with a level of accuracy superior to the vocational rehabilitation domain professionals using manual methods. This article discusses the problem domain, the design and development of the system, uses of AI technology, payoffs, and deployment and maintenance of the system.
Tractability of Theory Patching
Argamon-Engelson, S., Koppel, M.
In this paper we consider the problem of 'theory patching', in which we are given a domain theory, some of whose components are indicated to be possibly flawed, and a set of labeled training examples for the domain concept. The theory patching problem is to revise only the indicated components of the theory, such that the resulting theory correctly classifies all the training examples. Theory patching is thus a type of theory revision in which revisions are made to individual components of the theory. Our concern in this paper is to determine for which classes of logical domain theories the theory patching problem is tractable. We consider both propositional and first-order domain theories, and show that the theory patching problem is equivalent to that of determining what information contained in a theory is 'stable' regardless of what revisions might be performed to the theory. We show that determining stability is tractable if the input theory satisfies two conditions: that revisions to each theory component have monotonic effects on the classification of examples, and that theory components act independently in the classification of examples in the theory. We also show how the concepts introduced can be used to determine the soundness and completeness of particular theory patching algorithms.
Statistical Mechanics of the Mixture of Experts
Kukjin Kang and Jong-Hoon Oh, Department of Physics, Pohang University of Science and Technology, Hyoja San 31, Pohang, Kyongbuk 790-784, Korea. Email: kkj.jhoh@galaxy.postech.ac.kr
We study the generalization capability of the mixture of experts learning from examples generated by another network with the same architecture. When the number of examples is smaller than a critical value, the network shows a symmetric phase where the role of the experts is not specialized. Upon crossing the critical point, the system undergoes a continuous phase transition to a symmetry-breaking phase where the gating network partitions the input space effectively and each expert is assigned to an appropriate subspace. We also find that the mixture of experts with multiple levels of hierarchy shows multiple phase transitions.
1 Introduction
Recently there has been considerable interest among the neural network community in techniques that integrate the collective predictions of a set of networks [1, 2, 3, 4]. The mixture of experts [1, 2] is a well-known example which implements the philosophy of divide-and-conquer elegantly.
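The roles of the gating network and the experts described above can be illustrated with a minimal sketch. It assumes linear experts and a softmax gate over linear scores, which is one common formulation and not necessarily the architecture analyzed by the authors. The function names and weight layouts (`mixture_of_experts`, `gate_weights`, `expert_weights`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mixture_of_experts(x, gate_weights, expert_weights):
    """Return (output, gate probabilities) for one input vector x.

    Assumed setup (illustrative only): expert j outputs dot(expert_weights[j], x),
    and the gate assigns it probability softmax_j(dot(gate_weights[j], x)).
    """
    gates = softmax([dot(v, x) for v in gate_weights])
    expert_outputs = [dot(w, x) for w in expert_weights]
    output = sum(g * y for g, y in zip(gates, expert_outputs))
    return output, gates


x = [1.0, 0.5]
experts = [[1.0, 0.0], [0.0, 1.0]]

# Unspecialized gate: all gate weights zero, so every expert gets equal weight.
symmetric_out, symmetric_gates = mixture_of_experts(x, [[0.0, 0.0], [0.0, 0.0]], experts)

# Specialized gate: the sign of x[0] decides which expert dominates.
split_out, split_gates = mixture_of_experts(x, [[5.0, 0.0], [-5.0, 0.0]], experts)
```

With all gate weights at zero the experts share every input equally, a toy analogue of the unspecialized symmetric phase. With the second set of gate weights, the gate divides the input space by the sign of the first coordinate and hands each half to one expert.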
GTM: A Principled Alternative to the Self-Organizing Map
Bishop, Christopher M., Svensén, Markus, Williams, Christopher K. I.
The Self-Organizing Map (SOM) algorithm has been extensively studied and has been applied with considerable success to a wide variety of problems. However, the algorithm is derived from heuristic ideas and this leads to a number of significant limitations. In this paper, we consider the problem of modelling the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. We introduce a novel form of latent variable model, which we call the GTM algorithm (for Generative Topographic Mapping), which allows general nonlinear transformations from latent space to data space, and which is trained using the EM (expectation-maximization) algorithm. Our approach overcomes the limitations of the SOM, while introducing no significant disadvantages. We demonstrate the performance of the GTM algorithm on simulated data from flow diagnostics for a multiphase oil pipeline.
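As a hedged illustration of the kind of latent variable model the abstract describes, the density and the E-step of EM can be written as below. None of the symbols appear in the abstract. They are assumptions following the usual presentation of GTM: $K$ latent points $x_k$ on a regular grid, a nonlinear map $y(x; W)$ into the $D$-dimensional data space, and isotropic Gaussian noise with inverse variance $\beta$.

```latex
% Data density: an equally weighted mixture of K Gaussians whose centres
% are the images y(x_k; W) of the latent grid points.
p(t \mid W, \beta) = \frac{1}{K} \sum_{k=1}^{K}
  \left( \frac{\beta}{2\pi} \right)^{D/2}
  \exp\!\left( -\frac{\beta}{2} \, \lVert y(x_k; W) - t \rVert^{2} \right)

% E-step: responsibility of latent point x_k for data point t_n.
R_{kn} = \frac{p(t_n \mid x_k, W, \beta)}
              {\sum_{k'=1}^{K} p(t_n \mid x_{k'}, W, \beta)}
```

Under these assumptions the M-step re-estimates $W$ and $\beta$ from the responsibilities $R_{kn}$.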