Boosting with Multi-Way Branching in Decision Trees

Neural Information Processing Systems

It is known that decision tree learning can be viewed as a form of boosting. However, existing boosting theorems for decision tree learning allow only binary-branching trees, and the generalization to multi-branching trees is not immediate. Practical decision tree algorithms, such as CART and C4.5, implement a tradeoff between the number of branches and the improvement in tree quality as measured by an index function. Here we give a boosting justification for a particular quantitative tradeoff curve. Our main theorem states, in essence, that if we require an improvement proportional to the log of the number of branches, then top-down greedy construction of decision trees remains an effective boosting algorithm.
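The tradeoff described above can be made concrete as a split-selection rule: accept a k-way split only when its index improvement clears a log(k) bar. The sketch below is illustrative only; the constant c and the exact scoring rule are our assumptions, not the paper's criterion.

    import math

    def choose_split(candidate_splits, index_before, c=0.05):
        # candidate_splits: list of (num_branches, index_after) pairs.
        # Accept a k-way split only if its index improvement is at least
        # c * log(k), mirroring the log-of-branches tradeoff; the constant
        # c is an illustrative assumption.
        best, best_margin = None, 0.0
        for k, index_after in candidate_splits:
            improvement = index_before - index_after
            margin = improvement - c * math.log(k)
            if margin > best_margin:
                best, best_margin = (k, index_after), margin
        return best  # None means no split clears the log-branching bar

    # A 4-way split must buy twice the log-penalty of a 2-way split:
    print(choose_split([(2, 0.40), (4, 0.36)], index_before=0.45))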


Agglomerative Information Bottleneck

Neural Information Processing Systems

This question was recently shown in [9] to be a special case of a much more fundamental problem: what are the features of the variable X that are relevant for the prediction of another, relevance, variable Y?
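The question above has a simple agglomerative reading: start with every value of X in its own cluster and greedily merge the pair whose fusion loses the least information about Y. A minimal sketch, assuming the merge cost is the weighted Jensen-Shannon divergence between the clusters' conditionals p(y|cluster); the details here are our reconstruction, not code from the paper.

    import numpy as np

    def merge_cost(w1, w2, p1, p2):
        # (w1 + w2) * JS(p1, p2): information about Y lost by fusing two clusters.
        pi1, pi2 = w1 / (w1 + w2), w2 / (w1 + w2)
        m = pi1 * p1 + pi2 * p2
        kl = lambda a, b: np.sum(a[a > 0] * np.log(a[a > 0] / b[a > 0]))
        return (w1 + w2) * (pi1 * kl(p1, m) + pi2 * kl(p2, m))

    def aib(p_xy, n_clusters):
        # p_xy: joint distribution over (x, y); rows are merged bottom-up.
        weights = list(p_xy.sum(axis=1))
        conds = [row / row.sum() for row in p_xy]        # p(y | cluster)
        while len(weights) > n_clusters:
            pairs = [(merge_cost(weights[i], weights[j], conds[i], conds[j]), i, j)
                     for i in range(len(weights)) for j in range(i + 1, len(weights))]
            _, i, j = min(pairs)
            w = weights[i] + weights[j]
            conds[i] = (weights[i] * conds[i] + weights[j] * conds[j]) / w
            weights[i] = w
            del weights[j], conds[j]
        return weights, conds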


Perceptual Organization Based on Temporal Dynamics

Neural Information Processing Systems

A figure-ground segregation network is proposed based on a novel boundary-pair representation. Nodes in the network are boundary segments obtained through local grouping. Each node is excitatorily coupled with the neighboring nodes that belong to the same region, and inhibitorily coupled with the corresponding paired node. Gestalt grouping rules are incorporated by modulating connections. The status of a node represents its probability of being figural and is updated according to a differential equation.
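The abstract does not give the differential equation itself, so the Euler-step sketch below is only a hedged stand-in for the described coupling: excitation from same-region neighbors, inhibition from the paired boundary node, and a status kept in [0, 1].

    import numpy as np

    def step(p, excite, pair, dt=0.01):
        # p[i]: probability that boundary segment i is figural.
        # excite[i]: indices of same-region neighbors (excitatory coupling).
        # pair[i]: index of the paired boundary node (inhibitory coupling).
        drive = np.array([p[nbrs].sum() - p[pair[i]] for i, nbrs in enumerate(excite)])
        dp = -p + 1.0 / (1.0 + np.exp(-drive))   # relax toward a sigmoid of the drive
        return np.clip(p + dt * dp, 0.0, 1.0)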


An Environment Model for Nonstationary Reinforcement Learning

Neural Information Processing Systems

Reinforcement learning in nonstationary environments is generally regarded as an important and yet difficult problem. This paper partially addresses the problem by formalizing a subclass of nonstationary environments. The environment model, called a hidden-mode Markov decision process (HM-MDP), assumes that environmental changes are always confined to a small number of hidden modes.
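Structurally, an HM-MDP can be read as a small set of MDPs (one per mode) plus a Markov chain over the modes, with the mode hidden from the agent. A hedged data-structure sketch; the field names and the ordering of mode/state updates are our assumptions, not the paper's notation.

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class HMMDP:
        mode_trans: np.ndarray   # (M, M) mode-to-mode transition probabilities
        trans: np.ndarray        # (M, S, A, S) state transitions, one MDP per mode
        reward: np.ndarray       # (M, S, A) expected immediate reward per mode

        def step(self, mode, state, action, rng):
            # The agent observes the state but never the mode.
            r = self.reward[mode, state, action]
            next_state = rng.choice(self.trans.shape[3], p=self.trans[mode, state, action])
            next_mode = rng.choice(len(self.mode_trans), p=self.mode_trans[mode])
            return next_mode, next_state, r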


Bayesian Averaging is Well-Temperated

Neural Information Processing Systems

Often a learning problem has a natural quantitative measure of generalization. If a loss function is defined, the natural measure is the generalization error, i.e., the expected loss on a random sample independent of the training set. Generalizability is a key topic of learning theory and much progress has been reported. Analytic results for a broad class of machines can be found in the literature [8, 12, 9, 10] describing the asymptotic generalization ability of supervised algorithms that are continuously parameterized. Asymptotic bounds on generalization for general machines have been advocated by Vapnik [11]. Generalization results valid for finite training sets can only be obtained for specific learning machines, see e.g.
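As a small illustration of the definition above, the generalization error can be estimated by Monte Carlo with a test sampler drawn independently of the training set; all names below (predict, loss, sample_test) are hypothetical placeholders, not anything from the paper.

    import numpy as np

    def generalization_error(predict, loss, sample_test, n=10000, seed=0):
        # Expected loss on fresh samples drawn independently of training data.
        rng = np.random.default_rng(seed)
        xs, ys = sample_test(n, rng)
        return float(np.mean(loss(predict(xs), ys)))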


Bifurcation Analysis of a Silicon Neuron

Neural Information Processing Systems

We have developed a VLSI silicon neuron and a corresponding mathematical model that is a two-state-variable system. We describe the circuit implementation and compare the behaviors observed in the silicon neuron and the mathematical model. We also perform bifurcation analysis of the mathematical model by varying the externally applied current and show that the behaviors exhibited by the silicon neuron under corresponding conditions are in good agreement with those predicted by the bifurcation analysis.
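The abstract does not reproduce the circuit equations, so the scan below uses the FitzHugh-Nagumo equations as a generic stand-in for a two-state-variable neuron model; it only illustrates the kind of analysis described, sweeping the applied current and watching oscillations switch on and off.

    import numpy as np

    def simulate(I_ext, T=200.0, dt=0.01):
        # Euler integration of a two-variable neuron model (FitzHugh-Nagumo).
        v, w, vs = -1.0, -0.5, []
        for _ in range(int(T / dt)):
            dv = v - v**3 / 3 - w + I_ext
            dw = 0.08 * (v + 0.7 - 0.8 * w)
            v, w = v + dt * dv, w + dt * dw
            vs.append(v)
        return np.array(vs)

    # Crude bifurcation scan: oscillation amplitude vs. applied current.
    for I in np.linspace(0.0, 1.5, 16):
        tail = simulate(I)[-5000:]       # discard the transient
        print(f"I = {I:.1f}  amplitude = {tail.max() - tail.min():.2f}")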



Population Decoding Based on an Unfaithful Model

Neural Information Processing Systems

We study a population decoding paradigm in which the maximum likelihood inference is based on an unfaithful decoding model (UMLI). This is usually the case for neural population decoding, because the encoding process of the brain is not exactly known, or because a simplified decoding model is preferred to save computational cost. We consider an unfaithful decoding model which neglects the pairwise correlation between neuronal activities, and prove that UMLI is asymptotically efficient when the neuronal correlation is uniform or of limited range. The performance of UMLI is compared with that of the maximum likelihood inference based on a faithful model and with that of the center-of-mass decoding method. It turns out that UMLI has the advantage of remarkably decreasing the computational complexity while maintaining a high level of decoding accuracy. The effect of correlation on the decoding accuracy is also discussed.
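A hedged sketch of what decoding with an unfaithful model can look like in practice: maximum-likelihood decoding that pretends the neurons are independent Poisson units, alongside the center-of-mass baseline. The Gaussian tuning curves and the Poisson noise model are our illustrative choices; the abstract does not fix them.

    import numpy as np

    def tuning(theta, prefs, width=0.3):
        # Gaussian tuning curves over preferred stimuli (illustrative choice).
        return np.exp(-0.5 * ((theta - prefs) / width) ** 2)

    def umli_decode(rates, prefs, grid):
        # ML decoding under an independence assumption: correlations ignored.
        def log_lik(theta):
            f = tuning(theta, prefs)
            return np.sum(rates * np.log(f + 1e-12) - f)   # Poisson log-likelihood
        return grid[int(np.argmax([log_lik(t) for t in grid]))]

    def center_of_mass(rates, prefs):
        # The center-of-mass baseline mentioned in the abstract.
        return np.sum(rates * prefs) / np.sum(rates)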


Local Probability Propagation for Factor Analysis

Neural Information Processing Systems

Ever since Pearl's probability propagation algorithm in graphs with cycles was shown to produce excellent results for error-correcting decoding a few years ago, we have been curious about whether local probability propagation could be used successfully for machine learning. One of the simplest adaptive models is the factor analyzer, which is a two-layer network that models bottom-layer sensory inputs as a linear combination of top-layer factors plus independent Gaussian sensor noise. We show that local probability propagation in the factor analyzer network usually takes just a few iterations to perform accurate inference, even in networks with 320 sensors and 80 factors. We derive an expression for the algorithm's fixed point and show that this fixed point matches the exact solution in a variety of networks, even when the fixed point is unstable.
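The factor-analyzer model and the exact posterior that the propagation's fixed point is checked against are both standard and compact; a minimal sketch follows, with variable names that are ours rather than the paper's.

    import numpy as np

    def fa_sample(Lam, psi, n, rng):
        # Factor analyzer: sensors = Lam @ factors + diagonal Gaussian noise.
        z = rng.standard_normal((n, Lam.shape[1]))
        noise = rng.standard_normal((n, Lam.shape[0])) * np.sqrt(psi)
        return z @ Lam.T + noise

    def fa_posterior_mean(x, Lam, psi):
        # Exact E[factors | sensors]: posterior precision is
        # I + Lam^T diag(psi)^-1 Lam for the linear-Gaussian model above.
        precision = np.eye(Lam.shape[1]) + Lam.T @ (Lam / psi[:, None])
        return np.linalg.solve(precision, Lam.T @ (x / psi))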