Principles of Risk Minimization for Learning Theory
Learning is posed as a problem of function estimation, for which two principles of solution are considered: empirical risk minimization and structural risk minimization. These two principles are applied to two different statements of the function estimation problem: global and local. Systematic improvements in prediction power are illustrated in application to zip-code recognition.
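The two principles named in this abstract can be written compactly. The block below is a sketch in the standard notation of statistical learning theory, not notation taken from the abstract itself: the loss $L$, the parameterized function $f(x, \alpha)$, the sample size $\ell$, the nested sets $S_k$ with capacities $h_k$, and the confidence term $\Omega$ are all assumed symbols, and the exact form of $\Omega$ is left unspecified.

```latex
% Expected risk (unknown, since the distribution P(x, y) is unknown)
R(\alpha) = \int L\bigl(y, f(x, \alpha)\bigr)\, dP(x, y)

% Empirical risk minimization: minimize the average loss on the sample
R_{\mathrm{emp}}(\alpha) = \frac{1}{\ell} \sum_{i=1}^{\ell} L\bigl(y_i, f(x_i, \alpha)\bigr)

% Structural risk minimization: impose a nested structure of function sets
% with non-decreasing capacity, then trade empirical risk against capacity
S_1 \subset S_2 \subset \cdots \subset S_n, \qquad h_1 \le h_2 \le \cdots \le h_n

\min_{k,\; \alpha \in S_k} \Bigl[ R_{\mathrm{emp}}(\alpha) + \Omega(h_k, \ell) \Bigr]
```

Empirical risk minimization fixes one set of functions in advance; structural risk minimization additionally chooses which element of the structure to search in.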
A Segment-Based Automatic Language Identification System
Muthusamy, Yeshwant K., Cole, Ronald A.
Automatic language identification is the rapid automatic determination of the language being spoken, by any speaker, saying anything. Despite several important applications of automatic language identification, this area has suffered from a lack of basic research and the absence of a standardized, public-domain database of languages. It is well known that languages have characteristic sound patterns. Languages have been described subjectively as "singsong", "rhythmic", "guttural", "nasal", etc. The key to solving the problem of automatic language identification is the detection and exploitation of such differences between languages. We assume that each language in the world has a unique acoustic structure, and that this structure can be defined in terms of phonetic and prosodic features of speech.
Some Approximation Properties of Projection Pursuit Learning Networks
Zhao, Ying, Atkeson, Christopher G.
Ying Zhao and Christopher G. Atkeson, The Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139
Abstract: This paper addresses an important question in machine learning: what kind of network architectures work better on what kind of problems? A projection pursuit learning network has a structure very similar to that of a one-hidden-layer sigmoidal neural network. A general method based on a continuous version of projection pursuit regression is developed to show that projection pursuit regression works better on angular smooth functions than on Laplacian smooth functions. There exists a ridge function approximation scheme that avoids the curse of dimensionality when approximating functions in L^2(φ_d).
1 INTRODUCTION
Projection pursuit is a nonparametric statistical technique to find "interesting" low-dimensional projections of high-dimensional data sets. It has been used for nonparametric fitting and other data-analytic purposes (Friedman and Stuetzle, 1981; Huber, 1985).
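The structural similarity mentioned in the abstract is easiest to see when the two model forms are written side by side. The following is a sketch using the usual textbook forms; the symbols $m$, $a_j$, $g_j$, $\beta_j$, $b_j$, and $\sigma$ are assumed here and are not taken from the excerpt.

```latex
% Projection pursuit regression: a sum of m ridge functions, where each
% univariate function g_j is itself estimated from the data
f(x) \approx \sum_{j=1}^{m} g_j\bigl(a_j^{\top} x\bigr)

% One-hidden-layer sigmoidal network: the same projections a_j^T x,
% but every hidden unit applies the same fixed sigmoid \sigma
f(x) \approx \sum_{j=1}^{m} \beta_j\, \sigma\bigl(a_j^{\top} x + b_j\bigr)
```

Both models project the input onto directions $a_j$ and sum univariate functions of the projections; they differ in whether those univariate functions are learned or fixed.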
Information Measure Based Skeletonisation
Ramachandran, Sowmya, Pratt, Lorien Y.
Automatic determination of proper neural network topology by trimming oversized networks is an important area of study, which has previously been addressed using a variety of techniques. In this paper, we present Information Measure Based Skeletonisation (IMBS), a new approach to this problem where superfluous hidden units are removed based on their information measure (IM). This measure, borrowed from decision tree induction techniques, reflects the degree to which the hyperplane formed by a hidden unit discriminates between training data classes. We show the results of applying IMBS to three classification tasks and demonstrate that it removes a substantial number of hidden units without significantly affecting network performance.
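To make the idea of scoring a hidden unit by how well its hyperplane separates classes concrete, here is a minimal sketch. The abstract does not give the exact definition of the information measure, so this sketch assumes it is the entropy-based information gain used in decision tree induction; the function names, the threshold parameter, and the representation of a hidden unit as a (weights, bias) pair are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_measure(weights, bias, inputs, labels):
    """Information gain of the split induced by the hyperplane w.x + b = 0.

    ASSUMPTION: the paper's IM is taken to be decision-tree information gain.
    """
    above, below = [], []
    for x, y in zip(inputs, labels):
        activation = sum(w * xi for w, xi in zip(weights, x)) + bias
        (above if activation > 0 else below).append(y)
    n = len(labels)
    remainder = (len(above) / n) * entropy(above) + (len(below) / n) * entropy(below)
    return entropy(labels) - remainder


def units_to_keep(hidden_units, inputs, labels, threshold):
    """Indices of hidden units whose score exceeds a (hypothetical) threshold."""
    return [
        i
        for i, (weights, bias) in enumerate(hidden_units)
        if information_measure(weights, bias, inputs, labels) > threshold
    ]


# Toy data: one input feature, two classes separable at x = 1.5.
inputs = [(0.0,), (1.0,), (2.0,), (3.0,)]
labels = [0, 0, 1, 1]
hidden_units = [((1.0,), -1.5), ((0.0,), 1.0)]  # second unit ignores its input
kept = units_to_keep(hidden_units, inputs, labels, threshold=0.1)
```

In this toy case the first unit separates the classes perfectly, while the second puts every example on the same side of its hyperplane and so carries no information about the class.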
Data Analysis using G/SPLINES
G/SPLINES is an algorithm for building functional models of data. It uses genetic search to discover combinations of basis functions which are then used to build a least-squares regression model. Because it produces a population of models which evolve over time rather than a single model, it allows analysis not possible with other regression-based approaches.
1 INTRODUCTION
G/SPLINES is a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm (Friedman, 1990) with Holland's Genetic Algorithm (Holland, 1975). G/SPLINES has advantages over MARS in that it requires fewer least-squares computations, is easily extendable to non-spline basis functions, may discover models inaccessible to local-variable selection algorithms, and allows significantly larger problems to be considered. These issues are discussed in (Rogers, 1991). This paper begins with a discussion of linear regression models, followed by a description of the G/SPLINES algorithm, and finishes with a series of experiments illustrating its performance, robustness, and analysis capabilities.
Reverse TDNN: An Architecture For Trajectory Generation
Trajectory generation finds interesting applications in the fields of robotics, automation, filtering, and time series prediction. Neural networks, with their ability to learn from examples, were proposed very early on for solving nonlinear control problems adaptively. Several neural net architectures have been proposed for trajectory generation, most notably recurrent networks, either with discrete time and external loops (Jordan, 1986), or with continuous time (Pearlmutter, 1988). Aside from being recurrent, these networks are not specifically tailored for trajectory generation. It has been shown that specific architectures, such as the Time Delay Neural Networks (Lang and Hinton, 1988), or convolutional networks in general, are better than fully connected networks at recognizing time sequences such as speech (Waibel et al., 1989), or pen trajectories (Guyon et al., 1991). We show that special architectures can also be devised for trajectory generation, with dramatic performance improvement.
Fast, Robust Adaptive Control by Learning only Forward Models
A large class of motor control tasks requires that on each cycle the controller is told its current state and must choose an action to achieve a specified, state-dependent goal behaviour. This paper argues that the optimization of learning rate (the number of experimental control decisions before adequate performance is obtained) and robustness is of prime importance, if necessary at the expense of computation per control cycle and memory requirement. This is motivated by the observation that a robot which requires two thousand learning steps to achieve adequate performance, or a robot which occasionally gets stuck while learning, will always be undesirable, whereas moderate computational expense can be accommodated by increasingly powerful computer hardware. It is not unreasonable to assume the existence of inexpensive 100 Mflop controllers within a few years, and so even processes with control cycles in the low tens of milliseconds will have millions of machine instructions in which to make their decisions. This paper outlines a learning control scheme which aims to make effective use of such computational power.
1 MEMORY BASED LEARNING
Memory-based learning is an approach applicable to both classification and function learning in which all experiences presented to the learning box are explicitly remembered. The memory, Mem, is a set of input-output pairs, Mem = {(x_1, y_1), (x_2, y_2), ..., (x_k, y_k)}.
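The idea of explicitly remembering every experience can be sketched in a few lines. The excerpt ends before describing how the memory is queried, so the nearest-neighbour lookup below is an assumption chosen for illustration, not the paper's method; the class name `Memory` and its methods `remember` and `predict` are hypothetical.

```python
import math


class Memory:
    """Stores every (input, output) pair it is shown, with no fitted model."""

    def __init__(self):
        # Mem = [(x_1, y_1), (x_2, y_2), ..., (x_k, y_k)]
        self.pairs = []

    def remember(self, x, y):
        """Add one experience to the memory."""
        self.pairs.append((tuple(x), y))

    def predict(self, query):
        """Return the output stored with the nearest remembered input.

        ASSUMPTION: Euclidean nearest neighbour is used as the query rule.
        """
        if not self.pairs:
            raise ValueError("memory is empty")
        nearest_x, nearest_y = min(
            self.pairs, key=lambda pair: math.dist(pair[0], query)
        )
        return nearest_y


mem = Memory()
mem.remember((0.0, 0.0), 1.0)
mem.remember((1.0, 1.0), 2.0)
near_origin = mem.predict((0.1, 0.0))
near_corner = mem.predict((0.9, 0.8))
```

All computation happens at query time, which matches the abstract's argument that extra computation per control cycle is an acceptable price for learning from few experiences.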