Pomerleau, Dean
Zero-shot Learning with Semantic Output Codes
Palatucci, Mark, Pomerleau, Dean, Hinton, Geoffrey E., Mitchell, Tom M.
We consider the problem of zero-shot learning, where the goal is to learn a classifier $f: X \rightarrow Y$ that must predict novel values of $Y$ that were omitted from the training set. To achieve this, we define the notion of a semantic output code classifier (SOC) which utilizes a knowledge base of semantic properties of $Y$ to extrapolate to novel classes. We provide a formalism for this type of classifier and study its theoretical properties in a PAC framework, showing conditions under which the classifier can accurately predict novel classes. As a case study, we build a SOC classifier for a neural decoding task and show that it can often predict words that people are thinking about from functional magnetic resonance images (fMRI) of their neural activity, even without training examples for those words.
Life in the Fast Lane: The Evolution of an Adaptive Vehicle Control System
Jochem, Todd, Pomerleau, Dean
Giving robots the ability to operate in the real world has been, and continues to be, one of the most difficult tasks in AI research. Their research has been focused on using adaptive, vision-based systems to increase the driving performance of the Navlab line of on-road mobile robots. This research has led to the development of a neural network system that can learn to drive on many road types simply by watching a human teacher. This article describes the evolution of this system from a research project in machine learning to a robust driving system capable of executing tactical driving maneuvers such as lane changing and intersection navigation.
Life in the Fast Lane: The Evolution of an Adaptive Vehicle Control System
Jochem, Todd, Pomerleau, Dean
Giving robots the ability to operate in the real world has been, and continues to be, one of the most difficult tasks in AI research. Since 1987, researchers at Carnegie Mellon University have been investigating one such task. Their research has been focused on using adaptive, vision-based systems to increase the driving performance of the Navlab line of on-road mobile robots. This research has led to the development of a neural network system that can learn to drive on many road types simply by watching a human teacher. This article describes the evolution of this system from a research project in machine learning to a robust driving system capable of executing tactical driving maneuvers such as lane changing and intersection navigation.
A Connectionist Technique for Accelerated Textual Input: Letting a Network Do the Typing
Pomerleau, Dean
Each year people spend a huge amount oftime typing. The text people type typically contains a tremendous amount of redundancy due to predictable word usage patterns and the text's structure. This paper describes a neural network system call AutoTypist that monitors a person's typing and predicts what will be entered next. AutoTypist displays the most likely subsequent word to the typist, who can accept it with a single keystroke, instead of typing it in its entirety. The multi-layer perceptron at the heart of Auto'JYpist adapts its predictions of likely subsequent text to the user's word usage pattern, and to the characteristics of the text currently being typed. Increases in typing speed of 2-3% when typing English prose and 10-20% when typing C code have been demonstrated using the system, suggesting a potential time savings of more than 20 hours per user per year. In addition to increasing typing speed, AutoTypist reduces the number of keystrokes a user must type by a similar amount (2-3% for English, 10-20% for computer programs). This keystroke savings has the potential to significantly reduce the frequency and severity of repeated stress injuries caused by typing, which are the most common injury suffered in today's office environment.
A Connectionist Technique for Accelerated Textual Input: Letting a Network Do the Typing
Pomerleau, Dean
Each year people spend a huge amount of time typing. The text people type typically contains a tremendous amount of redundancy due to predictable word usage patterns and the text's structure. This paper describes a neural network system call AutoTypist that monitors a person's typing and predicts what will be entered next. AutoTypist displays the most likely subsequent word to the typist, who can accept it with a single keystroke, instead of typing it in its entirety. The multi-layer perceptron at the heart of Auto'JYpist adapts its predictions of likely subsequent text to the user's word usage pattern, and to the characteristics of the text currently being typed. Increases in typing speed of 2-3% when typing English prose and 10-20% when typing C code have been demonstrated using the system, suggesting a potential time savings of more than 20 hours per user per year. In addition to increasing typing speed, AutoTypist reduces the number of keystrokes a user must type by a similar amount (2-3% for English, 10-20% for computer programs). This keystroke savings has the potential to significantly reduce the frequency and severity of repeated stress injuries caused by typing, which are the most common injury suffered in today's office environment.
Non-Intrusive Gaze Tracking Using Artificial Neural Networks
Baluja, Shumeet, Pomerleau, Dean
We have developed an artificial neural network based gaze tracking system which can be customized to individual users. Unlike other gaze trackers, which normally require the user to wear cumbersome headgear, or to use a chin rest to ensure head immobility, our system is entirely non-intrusive. Currently, the best intrusive gaze tracking systems are accurate to approximately 0.75degrees. In our experiments, we have been able to achieve an accuracy of 1.5 degrees, while allowing head mobility. In this paper we present an empirical analysis of the performance of a large number of artificial neuralnetwork architectures for this task.
Non-Intrusive Gaze Tracking Using Artificial Neural Networks
Baluja, Shumeet, Pomerleau, Dean
We have developed an artificial neural network based gaze tracking system which can be customized to individual users. Unlike other gaze trackers, which normally require the user to wear cumbersome headgear, or to use a chin rest to ensure head immobility, our system is entirely non-intrusive. Currently, the best intrusive gaze tracking systems are accurate to approximately 0.75 degrees. In our experiments, we have been able to achieve an accuracy of 1.5 degrees, while allowing head mobility. In this paper we present an empirical analysis of the performance of a large number of artificial neural network architectures for this task.
Neural Network Perception for Mobile Robot Guidance
Pomerleau, Dean
Vision based mobile robot guidance has proven difficult for classical machine vision methods because of the diversity and real time constraints inherent in the task. This thesis describes a connectionist system called ALVINN (Autonomous Land Vehicle In a Neural Network) that overcomes these difficulties. ALVINN learns to guide mobile robots using the back-propagation training algorithm. Because of its ability to learn from example, ALVINN can adapt to new situations and therefore cope with the diversity of the autonomous navigation task. But real world problems like vision based mobile robot guidance presents a different set of challenges for the connectionist paradigm.
Rapidly Adapting Artificial Neural Networks for Autonomous Navigation
Pomerleau, Dean
Dean A. Pomerleau School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 Abstract The ALVINN (Autonomous Land Vehicle In a Neural Network) project addresses the problem of training artificial neural networks in real time to perform difficult perception tasks. ALVINN,is a back-propagation network that uses inputs from a video camera and an imaging laser rangefinder to drive the CMU Navlab, a modified Chevy van. This paper describes training techniques which allow ALVINN to learn in under 5 minutes to autonomously control the Navlab by watching a human driver's response to new situations. Using these techniques, ALVINN has been trained to drive in a variety of circumstances including single-lane paved and unpaved roads, multilane lined and unlined roads, and obstacle-ridden on-and off-road environments, at speeds of up to 20 miles per hour. 1 INTRODUCTION Previous trainable connectionist perception systems have often ignored important aspects of the form and content of available sensor data. Because of the assumed impracticality of training networks to perform realistic high level perception tasks, connectionist researchers have frequently restricted their task domains to either toy problems (e.g. the TC identification problem [11] [6]) or fixed low level operations (e.g.
Rapidly Adapting Artificial Neural Networks for Autonomous Navigation
Pomerleau, Dean
Dean A. Pomerleau School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 Abstract The ALVINN (Autonomous Land Vehicle In a Neural Network) project addresses the problem of training artificial neural networks in real time to perform difficult perception tasks. ALVINN,is a back-propagation network that uses inputs from a video camera and an imaging laser rangefinder to drive the CMU Navlab, a modified Chevy van. This paper describes training techniques which allow ALVINN to learn in under 5 minutes to autonomously control the Navlab by watching a human driver's response to new situations. Using these techniques, ALVINN has been trained to drive in a variety of circumstances including single-lane paved and unpaved roads, multilane lined and unlined roads, and obstacle-ridden on-and off-road environments, at speeds of up to 20 miles per hour. 1 INTRODUCTION Previous trainable connectionist perception systems have often ignored important aspects of the form and content of available sensor data. Because of the assumed impracticality of training networks to perform realistic high level perception tasks, connectionist researchers have frequently restricted their task domains to either toy problems (e.g. the TC identification problem [11] [6]) or fixed low level operations (e.g.