Factorial Hidden Markov Models
Ghahramani, Zoubin, Jordan, Michael I.
Due to the simplicity and efficiency of its parameter estimation algorithm, the hidden Markov model (HMM) has emerged as one of the basic statistical tools for modeling discrete time series, finding widespread application in the areas of speech recognition (Rabiner and Juang, 1986) and computational molecular biology (Baldi et al., 1994). An HMM is essentially a mixture model, encoding information about the history of a time series in the value of a single multinomial variable (the hidden state). This multinomial assumption allows an efficient parameter estimation algorithm to be derived (the Baum-Welch algorithm). However, it also severely limits the representational capacity of HMMs.
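As a minimal sketch of how the single multinomial hidden state carries the history of the series, the forward recursion at the heart of Baum-Welch can be written in a few lines; the two-state transition matrix A, emission matrix B, and initial distribution pi below are hypothetical toy values, not parameters from the paper.

    import numpy as np

    # Toy two-state HMM (hypothetical parameters).
    A = np.array([[0.7, 0.3],      # A[i, j]: transition from hidden state i to j
                  [0.4, 0.6]])
    B = np.array([[0.9, 0.1],      # B[i, k]: emission of symbol k in state i
                  [0.2, 0.8]])
    pi = np.array([0.5, 0.5])      # initial distribution over the hidden state

    def likelihood(obs):
        """Forward recursion: P(obs) for a sequence of symbol indices."""
        alpha = pi * B[:, obs[0]]          # alpha_1(i) = pi_i * b_i(o_1)
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]  # alpha_t(j) = sum_i alpha_{t-1}(i) A_ij b_j(o_t)
        return alpha.sum()

    print(likelihood([0, 1, 0]))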
Gaussian Processes for Regression
Williams, Christopher K. I., Rasmussen, Carl Edward
The Bayesian analysis of neural networks is difficult because a simple prior over weights implies a complex prior distribution over functions. In this paper we investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis for fixed values of hyperparameters to be carried out exactly using matrix operations. Two methods, using optimization and averaging (via Hybrid Monte Carlo) over hyperparameters, have been tested on a number of challenging problems and have produced excellent results. 1 INTRODUCTION In the Bayesian approach to neural networks a prior distribution over the weights induces a prior distribution over functions. This prior is combined with a noise model, which specifies the probability of observing the targets t given function values y, to yield a posterior over functions which can then be used for predictions. For neural networks the prior over functions has a complex form which means that implementations must either make approximations (e.g.
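The exact matrix-based prediction for fixed hyperparameters can be sketched as follows; the squared-exponential kernel and the hyperparameter values here are illustrative assumptions, not the paper's settings.

    import numpy as np

    # Illustrative 1-D GP regression with fixed (assumed) hyperparameters.
    def k(a, b, ell=1.0, sf2=1.0):
        """Squared-exponential covariance between 1-D input vectors."""
        return sf2 * np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

    X = np.array([-2.0, -1.0, 0.5, 2.0])   # training inputs
    t = np.sin(X)                           # observed targets
    Xs = np.linspace(-3.0, 3.0, 7)          # test inputs
    sn2 = 0.01                              # noise variance

    K = k(X, X) + sn2 * np.eye(len(X))      # noisy covariance of the targets
    Ks = k(X, Xs)                           # train/test cross-covariance

    mean = Ks.T @ np.linalg.solve(K, t)                # predictive mean
    cov = k(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks)    # predictive covariance
    print(mean)
    print(np.sqrt(np.diag(cov)))            # predictive standard deviations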
Parallel analog VLSI architectures for computation of heading direction and time-to-contact
Indiveri, Giacomo, Kramer, Jörg, Koch, Christof
To exploit their properties at a system level, we developed parallel image processing architectures for applications that rely mostly on the qualitative properties of the optical flow, rather than on the precise values of the velocity vectors. Specifically, we designed two parallel architectures that employ arrays of elementary motion sensors for the computation of heading direction and time-to-contact. The application domain that we took into consideration for the implementation of such architectures is the promising one of vehicle navigation. Having defined the types of images to be analyzed and the types of processing to perform, we were able to use a priori information to integrate selectively the sparse data obtained from the velocity sensors and determine the qualitative properties of the optical flow field of interest.
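As a rough software analogue of the time-to-contact computation (the paper implements it with analog VLSI motion sensors, not software), the sketch below reads the time-to-contact off the divergence of a synthetic expanding flow field; all values are hypothetical.

    import numpy as np

    # Synthetic expanding flow field v(x, y) = (x, y) / tau; its divergence
    # is 2 / tau, so time-to-contact follows from a qualitative property of
    # the flow rather than from precise velocity values.
    tau_true = 3.0
    ys, xs = np.mgrid[-5:6, -5:6].astype(float)
    u, v = xs / tau_true, ys / tau_true     # optical flow components

    div = np.gradient(u, axis=1) + np.gradient(v, axis=0)  # du/dx + dv/dy
    tau_est = 2.0 / div.mean()              # sensor-style aggregate readout
    print(tau_est)                          # recovers tau_true = 3.0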
The Capacity of a Bump
Recently, several researchers have reported encouraging experimental results when using Gaussian or bump-like activation functions in multilayer perceptrons. Networks of this type usually require fewer hidden layers and units and often learn much faster than typical sigmoidal networks. To explain these results we consider a hyper-ridge network, which is a simple perceptron with no hidden units and a ridge activation function. If we are interested in partitioning p points in d dimensions into two classes, then in the limit as d approaches infinity the capacity of a hyper-ridge and a perceptron is identical.
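A hedged sketch of the contrast the abstract draws, using a Gaussian bump over the net input as the ridge-like activation; the paper's precise hyper-ridge definition may differ from this form.

    import numpy as np

    # A sigmoidal unit responds monotonically across a half-space, while a
    # bump unit responds only inside a slab around the hyperplane w.x + b = 0.
    def sigmoid_unit(x, w, b):
        return 1.0 / (1.0 + np.exp(-(w @ x + b)))

    def bump_unit(x, w, b, width=1.0):
        net = w @ x + b
        return np.exp(-(net / width) ** 2)   # Gaussian bump over the net input

    x = np.array([0.3, -0.7])
    w = np.array([1.0, 2.0])
    print(sigmoid_unit(x, w, 0.0), bump_unit(x, w, 0.0))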
Model Matching and SFMD Computation
Rehfuss, Steven, Hammerstrom, Dan W.
In systems that process sensory data there is frequently a model matching stage where class hypotheses are combined to recognize a complex entity. We introduce a new model of parallelism, the Single Function Multiple Data (SFMD) model, appropriate to this stage. SFMD functionality can be added with small hardware expense to certain existing SIMD architectures, and as an incremental addition to the programming model. Adding SFMD to an SIMD machine will not only allow faster model matching, but also increase its flexibility as a general purpose machine and its scope in performing the initial stages of sensory processing. 1 INTRODUCTION In systems that process sensory data there is frequently a post-classification stage where several independent class hypotheses are combined into the recognition of a more complex entity. Examples include matching word models with a string of observation probabilities, and matching visual object models with collections of edges or other features. Current parallel computer architectures for processing sensory data focus on the classification and pre-classification stages (Hammerstrom 1990). This is reasonable, as those stages likely have the largest potential for speedup through parallel execution. Nonetheless, the model-matching stage is also suitable for parallelism, as each model may be matched independently of the others. We introduce a new style of parallelism, Single Function Multiple Data (SFMD), that is suitable for the model-matching stage.
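To make the SIMD/SFMD distinction concrete, here is a toy software sketch (the paper targets hardware processing elements; the per-model match functions below are hypothetical): under SIMD every element runs the same instruction stream over its local data, while under SFMD each element runs its own per-model function uniformly over its data.

    # Each processing element (PE) holds its own local data.
    data_per_pe = [[0.2, 0.9], [0.5, 0.5], [0.8, 0.1]]

    # SIMD: a single function is broadcast to every PE.
    simd_result = [sum(d) for d in data_per_pe]

    # SFMD: each PE applies its own model-matching function (one per model).
    word_model = lambda d: max(d)            # hypothetical match score
    object_model = lambda d: min(d)          # hypothetical match score
    edge_model = lambda d: sum(d) / len(d)   # hypothetical match score
    models = [word_model, object_model, edge_model]

    sfmd_result = [match(d) for match, d in zip(models, data_per_pe)]
    print(simd_result, sfmd_result)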
Learning Sparse Perceptrons
Jackson, Jeffrey C., Craven, Mark
We introduce a new algorithm designed to learn sparse perceptrons over input representations which include high-order features. Our algorithm, which is based on a hypothesis-boosting method, is able to PAC-learn a relatively natural class of target concepts. Moreover, the algorithm appears to work well in practice: on a set of three problem domains, the algorithm produces classifiers that utilize small numbers of features yet exhibit good generalization performance. Perhaps most importantly, our algorithm generates concept descriptions that are easy for humans to understand. However, in many applications, such as those that may involve scientific discovery, it is crucial to be able to explain predictions.
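A hedged illustration of the hypothesis-boosting idea: plain AdaBoost over single-feature weak hypotheses, so the combined classifier is a sparse voting perceptron. This is a sketch of the general technique, not the paper's exact algorithm, and the synthetic data are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(200, 20)) * 2 - 1   # 200 examples, +/-1 features
    y = np.sign(X[:, 3] + X[:, 7] + 0.5 * X[:, 11])  # target depends on 3 features

    w = np.ones(len(y)) / len(y)     # AdaBoost example weights
    vote = []                        # (feature index, sign, vote weight)
    for _ in range(5):
        # weak hypothesis: a single (possibly negated) input feature
        errs = [(w[s * X[:, j] != y].sum(), j, s)
                for j in range(X.shape[1]) for s in (+1, -1)]
        err, j, s = min(errs)
        alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
        vote.append((j, s, alpha))
        w *= np.exp(-alpha * y * s * X[:, j])        # upweight mistakes
        w /= w.sum()

    # The combined classifier is a sparse perceptron over few input features,
    # so the learned concept is easy to read off from the vote list.
    pred = np.sign(sum(a * s * X[:, j] for j, s, a in vote))
    print((pred == y).mean(), sorted({j for j, _, _ in vote}))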