AITopics

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems

Moody, John E.

We present an analysis of how the generalization performance (expected test set error) relates to the expected training set error for nonlinear learning systems, such as multilayer perceptrons and radial basis functions.

akaike, effective number, peff, (13 more...)

Country:

North America > United States > New York (0.05)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
Europe > Hungary > Budapest > Budapest (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Moody, John, Utans, Joachim

Principled Architecture Selection for Neural Networks: Application to Corporate Bond Rating Prediction

The notion of generalization ability can be defined precisely as the prediction risk, the expected performance of an estimator in predicting new observations. In this paper, we propose the prediction risk as a measure of the generalization ability of multi-layer perceptron networks and use it to select an optimal network architecture from a set of possible architectures. We also propose a heuristic search strategy to explore the space of possible architectures. The prediction risk is estimated from the available data; here we estimate the prediction risk by v-fold cross-validation and by asymptotic approximations of generalized cross-validation or Akaike's final prediction error. We apply the technique to the problem of predicting corporate bond ratings. This problem is very attractive as a case study, since it is characterized by the limited availability of the data and by the lack of a complete a priori model which could be used to impose a structure to the network architecture.
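The selection procedure described in this abstract (estimate each candidate's prediction risk by v-fold cross-validation, then keep the candidate with the lowest estimate) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes squared error as the loss, and two toy model families stand in for candidate network architectures. The names `fit_constant`, `fit_line`, `cv_prediction_risk`, and `select_architecture` are hypothetical.

```python
import random


def fit_constant(xs, ys):
    """Hypothetical 1-parameter 'architecture': always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Hypothetical 2-parameter 'architecture': ordinary least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def cv_prediction_risk(fit, xs, ys, v=5, seed=0):
    """Estimate prediction risk (mean squared error on held-out data) by v-fold CV."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::v] for k in range(v)]  # v disjoint folds covering all indices
    squared_error = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared_error += sum((ys[i] - model(xs[i])) ** 2 for i in held_out)
    return squared_error / len(xs)


def select_architecture(candidates, xs, ys, v=5):
    """Return the candidate name with the lowest estimated risk, plus all estimates."""
    risks = {name: cv_prediction_risk(fit, xs, ys, v) for name, fit in candidates.items()}
    return min(risks, key=risks.get), risks


# Toy data (assumed for illustration): a noisy linear relationship.
rng = random.Random(1)
xs = [i / 10 for i in range(40)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]

best, risks = select_architecture({"constant": fit_constant, "line": fit_line}, xs, ys)
print(best, risks)
```

With scarce data, as in the bond rating problem, cross-validation reuses every observation for both fitting and validation, which is why it is a natural estimator here. The heuristic search over architectures mentioned in the abstract is not modeled in this sketch.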

architecture, input variable, principled architecture selection, (12 more...)

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Connecticut > New Haven County > New Haven (0.05)

Industry: Banking & Finance > Credit (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

A Network of Localized Linear Discriminants

Glassman, Martin S.

The localized linear discriminant network (LLDN) has been designed to address classification problems containing relatively closely spaced data from different classes (encounter zones [1], the accuracy problem [2]). Locally trained hyperplane segments are an effective way to define the decision boundaries for these regions [3]. The LLD uses a modified perceptron training algorithm for effective discovery of separating hyperplane/sigmoid units within narrow boundaries. The basic unit of the network is the discriminant receptive field (DRF) which combines the LLD function with Gaussians representing the dispersion of the local training data with respect to the hyperplane. The DRF implements a local distance measure [4], and obtains the benefits of networks of localized units [5]. A constructive algorithm for the two-class case is described which incorporates DRFs into the hidden layer to solve local discrimination problems. The output unit produces a smoothed, piecewise linear decision boundary. Preliminary results indicate the ability of the LLDN to efficiently achieve separation when boundaries are narrow and complex, in cases where both the "standard" multilayer perceptron (MLP) and k-nearest neighbor (KNN) yield high error rates on training data. 1 The LLD Training Algorithm and DRF Generation The LLD is defined by the hyperplane normal vector V and its "midpoint" M (a translated origin [1] near the center of gravity of the training data in feature space).

boundary, dispersion, drf, (14 more...)

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

A Network of Localized Linear Discriminants

Glassman, Martin S.

The localized linear discriminant network (LLDN) has been designed to address classification problems containing relatively closely spaced data from different classes (encounter zones [1], the accuracy problem [2]). Locally trained hyperplane segments are an effective way to define the decision boundaries for these regions [3]. The LLD uses a modified perceptron training algorithm for effective discovery of separating hyperplane/sigmoid units within narrow boundaries. The basic unit of the network is the discriminant receptive field (DRF) which combines the LLD function with Gaussians representing the dispersion of the local training data with respect to the hyperplane. The DRF implements a local distance measure [4], and obtains the benefits of networks of localized units [5]. A constructive algorithm for the two-class case is described which incorporates DRFs into the hidden layer to solve local discrimination problems. The output unit produces a smoothed, piecewise linear decision boundary. Preliminary results indicate the ability of the LLDN to efficiently achieve separation when boundaries are narrow and complex, in cases where both the "standard" multilayer perceptron (MLP) and k-nearest neighbor (KNN) yield high error rates on training data. 1 The LLD Training Algorithm and DRF Generation The LLD is defined by the hyperplane normal vector V and its "midpoint" M (a translated origin [1] near the center of gravity of the training data in feature space). Incremental corrections to V and M accrue for each training token feature vector Yj in the training set, as illustrated in figure 1 (exaggerated magnitudes).

artificial intelligence, drf, machine learning, (16 more...)

Country: North America > United States (0.14)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Moody, John, Utans, Joachim

Principled Architecture Selection for Neural Networks: Application to Corporate Bond Rating Prediction

The notion of generalization ability can be defined precisely as the prediction risk, the expected performance of an estimator in predicting new observations. In this paper, we propose the prediction risk as a measure of the generalization ability of multi-layer perceptron networks and use it to select an optimal network architecture from a set of possible architectures. We also propose a heuristic search strategy to explore the space of possible architectures. The prediction risk is estimated from the available data; here we estimate the prediction risk by v-fold cross-validation and by asymptotic approximations of generalized cross-validation or Akaike's final prediction error. We apply the technique to the problem of predicting corporate bond ratings. This problem is very attractive as a case study, since it is characterized by the limited availability of the data and by the lack of a complete a priori model which could be used to impose a structure to the network architecture.

architecture, artificial intelligence, machine learning, (14 more...)

Country: North America > United States > New York (0.14)

Industry: Banking & Finance > Credit (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems

Moody, John E.

We present an analysis of how the generalization performance (expected test set error) relates to the expected training set error for nonlinear learning systems, such as multilayer perceptrons and radial basis functions.

artificial intelligence, effective number, machine learning, (15 more...)

Country: North America > United States (0.29)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Manduca, Armando, Christy, Paul, Ehman, Richard

Neural Network Diagnosis of Avascular Necrosis from Magnetic Resonance Images

Armando Manduca, Dept. of Physiology and Biophysics, Mayo Clinic, Rochester, MN 55905; Paul Christy, Dept. of Diagnostic Radiology, Mayo Clinic, Rochester, MN 55905; Richard Ehman, Dept. of Diagnostic Radiology, Mayo Clinic, Rochester, MN 55905. Abstract: Avascular necrosis (AVN) of the femoral head is a common yet potentially serious disorder which can be detected in its very early stages with magnetic resonance imaging. We have developed multi-layer perceptron networks, trained with conjugate gradient optimization, which diagnose AVN from single magnetic resonance images of the femoral head with 100% accuracy on training data and 97% accuracy on test data. 1 INTRODUCTION Diagnostic radiology may be a very natural field of application for neural networks, since a simple answer is desired from a complex image, and the learning process that human experts undergo is to a large extent a supervised learning experience based on looking at large numbers of images with known interpretations. Although many workers have applied neural nets to various types of 1-dimensional medical data (e.g. ECG and EEG waveforms), little work has been done on applying neural nets to diagnosis directly from medical images. We chose the diagnosis of avascular necrosis from magnetic resonance images as an ideal initial problem, because: the area in question is small and well-defined, its size and shape do not vary greatly between individuals, the condition (if present) is usually visible even at low spatial and gray level resolution on a single image, and real data is readily available.

artificial intelligence, femoral head, machine learning, (13 more...)

Country: North America > United States > Minnesota > Olmsted County > Rochester (0.66)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)