Perceptrons
Principled Architecture Selection for Neural Networks: Application to Corporate Bond Rating Prediction
The notion of generalization ability can be defined precisely as the prediction risk, the expected performance of an estimator in predicting new observations. In this paper, we propose the prediction risk as a measure of the generalization ability of multi-layer perceptron networks and use it to select an optimal network architecture from a set of possible architectures. We also propose a heuristic search strategy to explore the space of possible architectures. The prediction risk is estimated from the available data; here we estimate it by v-fold cross-validation and by asymptotic approximations such as generalized cross-validation or Akaike's final prediction error. We apply the technique to the problem of predicting corporate bond ratings. This problem is attractive as a case study because the available data are limited and there is no complete a priori model that could be used to impose a structure on the network architecture.
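As a rough illustration of selecting a model by estimated prediction risk, the sketch below compares two candidates by v-fold cross-validation and also gives the standard forms of Akaike's final prediction error and generalized cross-validation. The candidates here are a constant predictor and a straight-line fit, used as simple stand-ins for competing network architectures; the function names (`v_fold_risk`, `fpe`, `gcv`, and the `fit_*`/`predict_*` helpers) and the synthetic data are assumptions made for this example, not taken from the paper.

```python
import random

def v_fold_risk(xs, ys, fit, predict, v=5, seed=0):
    """Estimate prediction risk as mean squared error on held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::v] for k in range(v)]      # v disjoint index sets
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - predict(model, xs[i])) ** 2 for i in fold)
    return total / len(xs)

def fpe(train_mse, n, p):
    """Akaike's final prediction error for n samples and p parameters."""
    return train_mse * (n + p) / (n - p)

def gcv(train_mse, n, p):
    """Generalized cross-validation estimate for n samples and p parameters."""
    return train_mse / (1.0 - p / n) ** 2

# Candidate 1 (stand-in model): constant predictor, one parameter.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model

# Candidate 2 (stand-in model): least-squares line, two parameters.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return (slope, my - slope * mx)

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

# Synthetic data (assumed for illustration): a noisy straight line.
rng = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]

risk_mean = v_fold_risk(xs, ys, fit_mean, predict_mean)
risk_line = v_fold_risk(xs, ys, fit_line, predict_line)
best = min([("mean", risk_mean), ("line", risk_line)], key=lambda t: t[1])[0]
print("selected model:", best)
```

Cross-validation refits each candidate v times, whereas `fpe` and `gcv` need only the training error and the parameter count, which is why asymptotic estimates of this kind are cheaper when each fit is expensive.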
A Network of Localized Linear Discriminants
The localized linear discriminant network (LLDN) has been designed to address classification problems containing relatively closely spaced data from different classes (encounter zones [1], the accuracy problem [2]). Locally trained hyperplane segments are an effective way to define the decision boundaries for these regions [3]. The LLD uses a modified perceptron training algorithm for effective discovery of separating hyperplane/sigmoid units within narrow boundaries. The basic unit of the network is the discriminant receptive field (DRF), which combines the LLD function with Gaussians representing the dispersion of the local training data with respect to the hyperplane. The DRF implements a local distance measure [4] and obtains the benefits of networks of localized units [5]. A constructive algorithm for the two-class case is described which incorporates DRFs into the hidden layer to solve local discrimination problems. The output unit produces a smoothed, piecewise linear decision boundary. Preliminary results indicate that the LLDN can efficiently achieve separation when boundaries are narrow and complex, in cases where both the "standard" multilayer perceptron (MLP) and k-nearest neighbor (KNN) yield high error rates on training data.

1 The LLD Training Algorithm and DRF Generation

The LLD is defined by the hyperplane normal vector V and its "midpoint" M (a translated origin [1] near the center of gravity of the training data in feature space). Incremental corrections to V and M accrue for each training token feature vector Yj in the training set, as illustrated in figure 1 (exaggerated magnitudes).
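The parameterization by a normal vector V and a midpoint M can be sketched roughly as follows. This is not the paper's modified training rule, which the excerpt does not spell out: the update below is an ordinary perceptron-style correction applied token by token, and `localized_output` uses a single isotropic Gaussian centered on M as an assumed simplification of the DRF's Gaussians. All function names and the `rate`, `epochs`, and `sigma` parameters are hypothetical.

```python
import math

def lld_output(x, v, m):
    """Sigmoid of the projection of (x - m) onto the normal vector v."""
    s = sum(vi * (xi - mi) for vi, xi, mi in zip(v, x, m))
    s = max(-60.0, min(60.0, s))               # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-s))

def localized_output(x, v, m, sigma):
    """Assumed simplification: gate the discriminant with an isotropic
    Gaussian centered on the midpoint m (not the paper's exact DRF)."""
    dist2 = sum((xi - mi) ** 2 for xi, mi in zip(x, m))
    return lld_output(x, v, m) * math.exp(-dist2 / (2.0 * sigma ** 2))

def train_lld(tokens, labels, v, m, rate=0.1, epochs=50):
    """Plain perceptron-style corrections to v and m (a stand-in rule).
    Labels are 0 or 1; only misclassified tokens trigger an update."""
    v, m = list(v), list(m)
    for _ in range(epochs):
        for y, target in zip(tokens, labels):
            predicted = 1 if lld_output(y, v, m) >= 0.5 else 0
            err = target - predicted
            if err == 0:
                continue
            diff = [yi - mi for yi, mi in zip(y, m)]
            old_v = list(v)
            # Rotate/scale the normal toward (or away from) the token.
            v = [vi + rate * err * di for vi, di in zip(v, diff)]
            # Shift the midpoint along the old normal to move the boundary.
            m = [mi - rate * err * vi for mi, vi in zip(m, old_v)]
    return v, m

# Two tokens on a line; the initial midpoint puts both on the same side.
tokens = [[0.0, 0.0], [2.0, 0.0]]
labels = [0, 1]
v_fit, m_fit = train_lld(tokens, labels, v=[1.0, 0.0], m=[2.5, 0.0])
print("normal:", v_fit, "midpoint:", m_fit)
```

Because the projection is V·(x − M), moving M translates the hyperplane without changing its orientation, while changing V reorients it about M; this separation is the only aspect of the sketch that follows directly from the description above.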
Neural Network Diagnosis of Avascular Necrosis from Magnetic Resonance Images
Manduca, Armando, Christy, Paul, Ehman, Richard
Armando Manduca, Dept. of Physiology and Biophysics, Mayo Clinic, Rochester, MN 55905
Paul Christy, Dept. of Diagnostic Radiology, Mayo Clinic, Rochester, MN 55905
Richard Ehman, Dept. of Diagnostic Radiology, Mayo Clinic, Rochester, MN 55905

Abstract

Avascular necrosis (AVN) of the femoral head is a common yet potentially serious disorder which can be detected in its very early stages with magnetic resonance imaging. We have developed multi-layer perceptron networks, trained with conjugate gradient optimization, which diagnose AVN from single magnetic resonance images of the femoral head with 100% accuracy on training data and 97% accuracy on test data.

1 INTRODUCTION

Diagnostic radiology may be a very natural field of application for neural networks, since a simple answer is desired from a complex image, and the learning process that human experts undergo is to a large extent a supervised learning experience based on looking at large numbers of images with known interpretations. Although many workers have applied neural nets to various types of 1-dimensional medical data (e.g., ECG and EEG waveforms), little work has been done on applying neural nets to diagnosis directly from medical images. We chose the diagnosis of avascular necrosis from magnetic resonance images as an ideal initial problem because the area in question is small and well-defined, its size and shape do not vary greatly between individuals, the condition (if present) is usually visible even at low spatial and gray level resolution on a single image, and real data is readily available.