
Collaborating Authors

 Moody, John


Data Visualization and Feature Selection: New Algorithms for Nongaussian Data

Neural Information Processing Systems

Visualization of input data and feature selection are intimately related. A good feature selection algorithm can identify meaningful coordinate projections for low dimensional data visualization. Conversely, a good visualization technique can suggest meaningful features to include in a model. Input variable selection is the most important step in the model selection process. Given a target variable, a set of input variables can be selected as explanatory variables by some prior knowledge.
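As a minimal illustration of dependence-based input selection for nongaussian data (a sketch, not the paper's algorithm; `mutual_information` and `rank_features` are hypothetical helper names), one can rank candidate input variables by their estimated mutual information with the target, using simple histogram binning:

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Estimate I(X; Y) in nats from samples via 2-D histogram binning."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                    # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)          # marginal of X
    py = pxy.sum(axis=0, keepdims=True)          # marginal of Y
    nz = pxy > 0                                 # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def rank_features(X, y, bins=10):
    """Rank columns of X by estimated mutual information with target y."""
    scores = [mutual_information(X[:, j], y, bins) for j in range(X.shape[1])]
    return sorted(range(X.shape[1]), key=lambda j: scores[j], reverse=True)
```

Unlike correlation, the mutual information score makes no Gaussian or linearity assumption, which is the point of moving beyond second-order statistics for nongaussian inputs.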


Towards Faster Stochastic Gradient Search

Neural Information Processing Systems

Stochastic gradient descent is a general algorithm which includes LMS, online backpropagation, and adaptive k-means clustering as special cases.
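To make the "special case" claim concrete, here is a sketch (function name `sgd_lms` is my own) of stochastic gradient descent on squared error for a linear unit; with this choice of model and loss, the per-sample update is exactly the LMS (Widrow-Hoff) rule:

```python
import numpy as np

def sgd_lms(X, y, lr=0.01, epochs=50, seed=0):
    """Stochastic gradient descent on 0.5 * (y_t - w.x_t)**2, one sample at
    a time. For a linear unit this is the LMS rule:
        w <- w + lr * (y_t - w.x_t) * x_t
    Swapping in a multilayer network gives online backpropagation; swapping
    in nearest-centroid assignment gives adaptive k-means."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for t in rng.permutation(len(X)):
            err = y[t] - w @ X[t]
            w += lr * err * X[t]   # negative gradient of the sample loss
    return w
```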


Networks with Learned Unit Response Functions

Neural Information Processing Systems

Feedforward networks composed of units which compute a sigmoidal function of a weighted sum of their inputs have been much investigated. We tested the approximation and estimation capabilities of networks using functions more complex than sigmoids. Three classes of functions were tested: polynomials, rational functions, and flexible Fourier series. Unlike sigmoids, these classes can fit nonmonotonic functions. They were compared on three problems: prediction of Boston housing prices, the sunspot count, and robot arm inverse dynamics. The complex units attained clearly superior performance on the robot arm problem, which is a highly nonmonotonic, pure approximation problem. On the noisy and only mildly nonlinear Boston housing and sunspot problems, differences among the complex units were revealed; polynomials did poorly, whereas rationals and flexible Fourier series were comparable to sigmoids.
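One of the abstract's unit classes, the flexible Fourier series, can be sketched as follows (a minimal illustration under my own parameterization, not the paper's exact form): the unit applies a truncated Fourier series, rather than a sigmoid, to the weighted input sum.

```python
import numpy as np

def fourier_unit(x, w, a, b, omega):
    """One hidden unit: a flexible Fourier series of the weighted sum s = w.x,
        g(s) = a[0] + sum_k a[k] * cos(k*omega*s) + b[k] * sin(k*omega*s).
    Unlike a sigmoid, g(s) can be nonmonotonic, which is what lets such units
    fit highly nonmonotonic targets. a has K+1 entries, b has K."""
    s = w @ x
    k = np.arange(1, len(b) + 1)
    return a[0] + a[1:] @ np.cos(k * omega * s) + b @ np.sin(k * omega * s)
```

In a learned network the coefficients `a`, `b` and frequency `omega` would be trained along with the input weights `w`, so each unit's response function is itself adapted to the data.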


Principled Architecture Selection for Neural Networks: Application to Corporate Bond Rating Prediction

Neural Information Processing Systems

The notion of generalization ability can be defined precisely as the prediction risk, the expected performance of an estimator in predicting new observations. In this paper, we propose the prediction risk as a measure of the generalization ability of multi-layer perceptron networks and use it to select an optimal network architecture from a set of possible architectures. We also propose a heuristic search strategy to explore the space of possible architectures. The prediction risk is estimated from the available data; here we estimate the prediction risk by v-fold cross-validation and by asymptotic approximations of generalized cross-validation or Akaike's final prediction error. We apply the technique to the problem of predicting corporate bond ratings. This problem is very attractive as a case study, since it is characterized by the limited availability of the data and by the lack of a complete a priori model which could be used to impose a structure to the network architecture.
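The v-fold cross-validation estimate of prediction risk described above can be sketched like this (a generic illustration with squared-error loss; the function name and the `fit`/`predict` callables are my own, not the paper's interface):

```python
import numpy as np

def cv_prediction_risk(X, y, fit, predict, v=5, seed=0):
    """Estimate prediction risk (expected squared error on new observations)
    by v-fold cross-validation: partition the data into v folds, train on
    v-1 folds, score on the held-out fold, and average the v scores."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), v)
    risks = []
    for i in range(v):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(v) if j != i])
        model = fit(X[train], y[train])
        risks.append(np.mean((predict(model, X[test]) - y[test]) ** 2))
    return float(np.mean(risks))
```

Architecture selection then amounts to computing this estimate for each candidate network and keeping the one with the lowest estimated risk; the asymptotic approximations mentioned in the abstract (generalized cross-validation, final prediction error) avoid the v retrainings at the cost of stronger assumptions.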


Fast Learning in Multi-Resolution Hierarchies

Neural Information Processing Systems

A variety of approaches to adaptive information processing have been developed by workers in disparate disciplines. These include the large body of literature on approximation and interpolation techniques (curve and surface fitting), the linear, real-time adaptive signal processing systems (such as the adaptive linear combiner and the Kalman filter), and most recently, the reincarnation of nonlinear neural network models such as the multilayer perceptron. Each of these methods has its strengths and weaknesses. The curve and surface fitting techniques are excellent for off-line data analysis, but are typically not formulated with real-time applications in mind. The linear techniques of adaptive signal processing and adaptive control are well-characterized, but are limited to applications for which linear descriptions are appropriate. Finally, neural network learning models such as back propagation have proven extremely versatile at learning a wide variety of nonlinear mappings, but tend to be very slow computationally and are not yet well characterized.