Iterative Construction of Sparse Polynomial Approximations
Sanger, Terence D., Sutton, Richard S., Matheus, Christopher J.
We present an iterative algorithm for nonlinear regression based on construction of sparse polynomials. Polynomials are built sequentially from lower to higher order. Selection of new terms is accomplished using a novel look-ahead approach that predicts whether a variable contributes to the remaining error. The algorithm is based on the tree-growing heuristic in LMS Trees, which we have extended to approximation of arbitrary polynomials of the input features. In addition, we provide a new theoretical justification for this heuristic approach.
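A minimal sketch of the kind of greedy, low-to-high-order term selection the abstract describes; the correlation-with-residual score below is our stand-in for the paper's look-ahead criterion, and the function names are ours:

```python
import numpy as np
from itertools import combinations_with_replacement

def build_sparse_poly(X, y, max_order=3, n_terms=8):
    """Greedily build a sparse polynomial from lower to higher order.
    Each step adds the candidate monomial most correlated with the
    current residual, then refits all coefficients by least squares."""
    n, d = X.shape
    # Candidate monomials as tuples of variable indices, low degree first.
    candidates = [idx for order in range(1, max_order + 1)
                  for idx in combinations_with_replacement(range(d), order)]
    Phi = np.ones((n, 1))                        # design matrix: constant term
    terms = []
    coef = np.linalg.lstsq(Phi, y, rcond=None)[0]
    for _ in range(n_terms):
        residual = y - Phi @ coef
        def score(idx):
            t = X[:, list(idx)].prod(axis=1)     # evaluate the monomial
            return abs(np.corrcoef(t, residual)[0, 1])
        best = max((c for c in candidates if c not in terms), key=score)
        terms.append(best)
        Phi = np.column_stack([Phi, X[:, list(best)].prod(axis=1)])
        coef = np.linalg.lstsq(Phi, y, rcond=None)[0]
    return terms, coef
```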
Learning Unambiguous Reduced Sequence Descriptions
Do you want your neural net algorithm to learn sequences? Do not limit yourself to conventional gradient descent (or approximations thereof). Instead, use your sequence learning algorithm (any will do) to implement the following method for history compression. No matter what your final goals are, train a network to predict its next input from the previous ones. Since only unpredictable inputs convey new information, ignore all predictable inputs but let all unexpected inputs (plus information about the time step at which they occurred) become inputs to a higher-level network of the same kind (working on a slower, self-adjusting time scale). Go on building a hierarchy of such networks.
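A rough sketch of the filtering step at one level of such a hierarchy; the `predictor` interface and the tolerance `tol` are assumptions for illustration, not the paper's notation:

```python
import numpy as np

def compress_history(sequence, predictor, tol=0.0):
    """Keep only the unpredictable inputs, each paired with its time
    step.  `predictor(history)` is an assumed interface to any trained
    next-input predictor; the resulting (time, input) pairs would feed
    the next, slower network in the hierarchy."""
    reduced = [(0, sequence[0])]                 # the first input is never predictable
    for t in range(1, len(sequence)):
        prediction = predictor(sequence[:t])     # predict input t from inputs 0..t-1
        if np.max(np.abs(np.asarray(prediction) - np.asarray(sequence[t]))) > tol:
            reduced.append((t, sequence[t]))     # surprising: pass it up the hierarchy
    return reduced
```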
Self-organization in real neurons: Anti-Hebb in 'Channel Space'?
Ion channels are the dynamical systems of the nervous system. Their distribution within the membrane governs not only the communication of information between neurons, but also how that information is integrated within the cell. Here, an argument is presented for an 'anti-Hebbian' rule for changing the distribution of voltage-dependent ion channels in order to flatten voltage curvatures in dendrites. Simulations show that this rule can account for the self-organisation of dynamical receptive field properties such as resonance and direction selectivity. It also creates the conditions for the faithful conduction within the cell of signals to which the cell has been exposed. Various possible cellular implementations of such a learning rule are proposed, including activity-dependent migration of channel proteins in the plane of the membrane.
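One hedged guess at what such a rule might look like in a compartmental simulation; the variables and the exact anti-Hebbian form below are our reading of the abstract, not the paper's equations:

```python
import numpy as np

def anti_hebb_channel_step(g, V, activation, lr=0.01):
    """One illustrative 'anti-Hebbian in channel space' update: lower a
    channel's local density where its activity coincides with large
    voltage curvature along the dendrite, nudging the cell toward flat,
    faithfully conducted voltage profiles.

    g          -- channel density per dendritic compartment
    V          -- membrane voltage per compartment
    activation -- channel open fraction per compartment
    """
    curvature = np.gradient(np.gradient(V))      # discrete d2V/dx2 along the dendrite
    g = g - lr * activation * curvature          # anti-Hebbian: oppose the correlation
    return np.clip(g, 0.0, None)                 # densities cannot go negative
```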
Constrained Optimization Applied to the Parameter Setting Problem for Analog Circuits
Kirk, David, Fleischer, Kurt, Watts, Lloyd, Barr, Alan
We use constrained optimization to select operating parameters for two circuits: a simple 3-transistor square root circuit, and an analog VLSI artificial cochlea. This automated method uses computer-controlled measurement and test equipment to choose chip parameters which minimize the difference between the actual circuit's behavior and a specified goal behavior. Choosing the proper circuit parameters is important to compensate for manufacturing deviations or to adjust circuit performance within a certain range. As biologically motivated analog VLSI circuits become increasingly complex, implying more parameters, setting these parameters by hand will become more cumbersome. Thus an automated parameter setting method can be of great value [Fleischer 90].
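A minimal sketch of the parameter-setting loop, assuming a `measure` callback in place of the computer-controlled test equipment; the derivative-free optimizer and squared-error cost are illustrative choices, not necessarily the paper's formulation:

```python
import numpy as np
from scipy.optimize import minimize

def fit_circuit_parameters(measure, goal, x0, bounds):
    """Pick chip parameters that minimize the squared difference between
    measured and goal behavior, within per-parameter bounds.  A
    derivative-free method suits hardware-in-the-loop measurement,
    where gradients of the chip's response are unavailable."""
    def cost(params):
        response = np.asarray(measure(params))   # drive the chip, record its behavior
        return np.sum((response - goal) ** 2)    # mismatch to the specified goal
    result = minimize(cost, x0, method='Nelder-Mead', bounds=bounds)
    return result.x
```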
Repeat Until Bored: A Pattern Selection Strategy
An alternative to the typical technique of selecting training examples independently from a fixed distribution is formulated and analyzed, in which the current example is presented repeatedly until the error for that item is reduced to some criterion value ε; then another item is randomly selected. The convergence time can be dramatically increased or decreased by this heuristic, depending on the task, and is very sensitive to the value of ε.
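The strategy is simple enough to state in a few lines; `train_step`, `error`, and the criterion value ε (here `epsilon`) are assumed hooks into the learner, for illustration only:

```python
import random

def repeat_until_bored(items, train_step, error, epsilon, n_presentations):
    """'Repeat until bored' pattern selection: keep presenting the
    current item until its error reaches the criterion epsilon, then
    draw a new item uniformly at random."""
    item = random.choice(items)
    for _ in range(n_presentations):
        train_step(item)                         # one learning update on this item
        if error(item) <= epsilon:               # criterion met: we are "bored"
            item = random.choice(items)          # move on to a random new item
```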
HARMONET: A Neural Net for Harmonizing Chorales in the Style of J. S. Bach
Hild, Hermann, Feulner, Johannes, Menzel, Wolfram
After being trained on some dozen Bach chorales using error backpropagation, the system is capable of producing four-part chorales in the style of J. S. Bach, given a one-part melody. Our system solves a musical real-world problem on a performance level appropriate for musical practice. HARMONET's power is based on (a) a new coding scheme capturing musically relevant information and (b) the integration of backpropagation and symbolic algorithms in a hierarchical system, combining the advantages of both.

1 INTRODUCTION

Neural approaches to music processing have been previously proposed (Lischka, 1989) and implemented (Mozer, 1991; Todd, 1989). The promise neural networks offer is that they may shed some light on an aspect of human creativity that doesn't seem to be describable in terms of symbols and rules. Ultimately, what music is (or isn't) lies in the eye (or ear) of the beholder.
Incrementally Learning Time-varying Half-planes
Kuh, Anthony, Petsche, Thomas, Rivest, Ronald L.
For a dichotomy, concept drift means that the classification function changes over time. We want to extend the theoretical analyses of learning to include time-varying concepts; to explore the behavior of current learning algorithms in the face of concept drift; and to devise tracking algorithms to better handle concept drift. In this paper, we briefly describe our theoretical model and then present the results of simulations.
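As a generic illustration of tracking a drifting dichotomy (not the paper's specific algorithm), a perceptron-style tracker applied to a stream of labeled examples whose true separating plane moves over time:

```python
import numpy as np

def track_halfplane(stream, lr=0.1, dim=2):
    """Online tracking of a time-varying half-plane: perceptron-style
    updates on a stream of (x, label) pairs.  The tracker follows the
    drifting plane as long as the drift rate stays small relative to
    the update step."""
    w = np.zeros(dim)
    for x, label in stream:                      # label in {-1, +1}
        if label * np.dot(w, x) <= 0:            # mistake under the current estimate
            w = w + lr * label * np.asarray(x)   # rotate the plane toward the example
        yield w.copy()                           # running estimate of the drifting plane
```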
Human and Machine 'Quick Modeling'
Bernasconi, Jakob, Gustafson, Karl
We present here an interesting experiment in 'quick modeling' by humans, performed independently on small samples, in several languages and on two continents, over the last three years. Comparisons to decision tree procedures and neural net processing are given. From these, we conjecture that human reasoning is better represented by the latter, but is substantially different from both. Implications for the 'strong convergence hypothesis' between neural networks and machine learning are discussed, now expanded to include human reasoning comparisons.

1 INTRODUCTION

Until recently the fields of symbolic and connectionist learning evolved separately. Suddenly, in the last two years, a significant number of papers comparing the two methodologies have appeared. A beginning synthesis of these two fields was forged at the NIPS '90 Workshop #5 last year (Pratt and Norton, 1990), where one may find a good bibliography of the recent work of Atlas, Dietterich, Omohundro, Sanger, Shavlik, Tsoi, Utgoff and others. It was at that NIPS '90 Workshop that we learned of these studies, most of which concentrate on performance comparisons of decision tree algorithms (such as ID3, CART) and neural net algorithms (such as Perceptrons, Backpropagation). Independently, three years ago, we had looked at Quinlan's ID3 scheme (Quinlan, 1984); disagreeing intuitively, and rather instantly, with the generalization ID3 obtains from a sample of 8 items to a set of 12 items, we subjected this example to a variety of human experiments. We report our findings, as compared to the performance of ID3 and also to various neural net computations.
Node Splitting: A Constructive Algorithm for Feed-Forward Neural Networks
The small network forms an approximate model of a set of training data, and the split creates a larger, more powerful network which is initialised with the approximate solution already found. The insufficiency of the smaller network in modelling the system which generated the data leads to oscillation in those hidden nodes whose weight vectors cover regions in the input space where more detail is required in the model. These nodes are identified and split in two using principal component analysis, allowing the new nodes to cover the two main modes of each oscillating vector. Nodes are selected for splitting using principal component analysis on the oscillating weight vectors, or by examining the Hessian matrix of second derivatives of the network error with respect to the weights.
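A minimal sketch of the splitting step under stated assumptions (raw weight snapshots as the PCA input, an arbitrary offset scale):

```python
import numpy as np

def split_node(weight_history, scale=0.5):
    """Split one oscillating hidden node in two, as the abstract sketches:
    PCA on the node's recent weight vectors finds the main axis of
    oscillation, and the two children are offset from the mean along it
    so each covers one mode of the oscillation."""
    mean_w = weight_history.mean(axis=0)
    deviations = weight_history - mean_w
    _, _, vt = np.linalg.svd(deviations, full_matrices=False)
    direction = vt[0]                            # first principal component
    spread = (deviations @ direction).std()      # oscillation amplitude along it
    offset = scale * spread * direction
    return mean_w + offset, mean_w - offset      # weight vectors for the two children
```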