Norepinephrine and Neural Interrupts
Experimental data indicate that norepinephrine is critically involved in aspects of vigilance and attention. Previously, we considered the function of this neuromodulatory system on a time scale of minutes and longer, and suggested that it signals global uncertainty arising from gross changes in environmental contingencies. However, norepinephrine is also known to be activated phasically by familiar stimuli in well-learned tasks. Here, we extend our uncertainty-based treatment of norepinephrine to this phasic mode, proposing that it is involved in the detection of and reaction to state uncertainty within a task. This role of norepinephrine can be understood through the metaphor of neural interrupts.
A Bayes Rule for Density Matrices
The classical Bayes rule computes the posterior model probability from the prior probability and the data likelihood. We generalize this rule to the case when the prior is a density matrix (symmetric positive definite and trace one) and the data likelihood a covariance matrix. The classical Bayes rule is retained as the special case when the matrices are diagonal. In the classical setting, the calculation of the probability of the data is an expected likelihood, where the expectation is over the prior distribution. In the generalized setting, this is replaced by an expected variance calculation, where the variance is computed along the eigenvectors of the prior density matrix and the expectation is over the eigenvalues of the density matrix (which form a probability vector). The variance along any direction is determined by the covariance matrix. Curiously enough, this expected variance calculation is a quantum measurement, where the covariance matrix specifies the instrument and the prior density matrix the mixture state of the particle. We motivate both the classical and the generalized Bayes rule with a minimum relative entropy principle, where the Kullback-Leibler version gives the classical Bayes rule and Umegaki's quantum relative entropy gives the new Bayes rule for density matrices.
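The expected-variance calculation described above amounts to a trace inner product between the prior density matrix and the covariance matrix. A minimal numerical sketch (the matrices rho and sigma below are made-up examples, not taken from the paper):

    import numpy as np

    # Hypothetical 2x2 prior density matrix (symmetric, positive definite, trace one)
    rho = np.array([[0.7, 0.2],
                    [0.2, 0.3]])

    # Hypothetical covariance matrix playing the role of the data likelihood
    sigma = np.array([[1.0, 0.5],
                      [0.5, 2.0]])

    # Expected variance: expectation, over the eigenvalues of rho, of the variance
    # along the corresponding eigenvectors; this equals trace(rho @ sigma).
    eigvals, eigvecs = np.linalg.eigh(rho)
    expected_var = sum(p * (v @ sigma @ v) for p, v in zip(eigvals, eigvecs.T))
    assert np.isclose(expected_var, np.trace(rho @ sigma))

    # Classical special case: when both matrices are diagonal, the trace reduces
    # to an ordinary expected likelihood under the prior probability vector.
    p = np.array([0.6, 0.4])     # prior probability vector
    lik = np.array([1.5, 0.8])   # per-model likelihood values
    print(np.trace(np.diag(p) @ np.diag(lik)), p @ lik)  # identical values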
An exploration-exploitation model based on norepinephrine and dopamine activity
McClure, Samuel M., Gilzenrat, Mark S., Cohen, Jonathan D.
We propose a model by which dopamine (DA) and norepinephrine (NE) combine to alternate behavior between relatively exploratory and exploitative modes. The model is developed for a target detection task for which extant single-neuron recording data are available from locus coeruleus (LC) NE neurons. An exploration-exploitation tradeoff is elicited by regularly switching which of the two stimuli is rewarded. DA functions within the model to change synaptic weights according to a reinforcement learning algorithm. Exploration is mediated by the state of LC firing, with higher tonic and lower phasic activity producing greater response variability. The opposite state of LC function, with a lower baseline firing rate and greater phasic responses, favors exploitative behavior. Changes in LC firing mode result from combined measures of response conflict and reward rate, where response conflict is monitored using models of anterior cingulate cortex (ACC). Increased long-term response conflict and decreased reward rate, which occur following a reward contingency switch, favor the higher tonic state of LC function and NE release.
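A toy sketch of the tonic/phasic idea, not the authors' model: a softmax choice rule whose temperature rises when the recent reward rate falls (standing in for the tonic LC mode), paired with a simple delta-rule update standing in for DA-driven learning. The task setup, temperature function, and learning rate below are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    n_trials, switch_at = 400, 200
    w = np.zeros(2)          # action values, updated by a DA-like delta rule
    alpha = 0.1              # learning rate (illustrative)
    reward_history = []

    def temperature(recent_rewards):
        # Toy stand-in for LC mode: a low recent reward rate -> higher "tonic"
        # level -> higher softmax temperature -> more exploration.
        rate = np.mean(recent_rewards[-20:]) if recent_rewards else 0.5
        return 0.1 + (1.0 - rate)          # hotter when reward rate drops

    for t in range(n_trials):
        rewarded_action = 0 if t < switch_at else 1   # reward contingency switch
        tau = temperature(reward_history)
        probs = np.exp(w / tau) / np.sum(np.exp(w / tau))
        a = rng.choice(2, p=probs)
        r = 1.0 if a == rewarded_action else 0.0
        w[a] += alpha * (r - w[a])                    # reinforcement-learning update
        reward_history.append(r)

    print("final action values:", w)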
Convex Neural Networks
Bengio, Yoshua, Roux, Nicolas L., Vincent, Pascal, Delalleau, Olivier, Marcotte, Patrice
Convexity has recently received a lot of attention in the machine learning community, and the lack of convexity has been seen as a major disadvantage of many learning algorithms, such as multi-layer artificial neural networks. We show that training multi-layer neural networks in which the number of hidden units is learned can be viewed as a convex optimization problem. This problem involves an infinite number of variables, but can be solved by incrementally inserting one hidden unit at a time, each time finding a linear classifier that minimizes a weighted sum of errors.
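A loose sketch of the incremental construction described above, written as a boosting-style greedy procedure: each round reweights the errors, fits a linear-threshold unit to the weighted data, and adds it as a new hidden unit with an output weight. The random-search unit fitting and the exponential weighting are illustrative simplifications, not the paper's algorithm.

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy binary classification data with labels in {-1, +1} (invented).
    X = rng.normal(size=(200, 2))
    y = np.sign(X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=200))

    def fit_linear_unit(X, weights, y, n_candidates=500):
        # Crude stand-in for "find a linear classifier minimizing a weighted
        # sum of errors": sample random hyperplanes, keep the best one.
        best, best_err = None, np.inf
        for _ in range(n_candidates):
            w = rng.normal(size=X.shape[1] + 1)
            pred = np.sign(X @ w[:-1] + w[-1])
            err = np.sum(weights * (pred != y))
            if err < best_err:
                best, best_err = w, err
        return best

    hidden = []                     # incrementally grown hidden layer
    f = np.zeros(len(y))            # current network output on the training set
    for _ in range(10):
        weights = np.exp(-y * f); weights /= weights.sum()   # emphasize mistakes
        u = fit_linear_unit(X, weights, y)
        h = np.sign(X @ u[:-1] + u[-1])                      # new hidden unit's output
        beta = 0.5 * np.log((weights[h == y].sum() + 1e-9) /
                            (weights[h != y].sum() + 1e-9))  # output weight
        hidden.append((u, beta))
        f += beta * h

    print("training accuracy:", np.mean(np.sign(f) == y))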
Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery
Kubica, Jeremy, Masiero, Joseph, Jedicke, Robert, Connolly, Andrew, Moore, Andrew W.
In this paper we consider the problem of finding sets of points that conform to a given underlying model from within a dense, noisy set of observations. This problem is motivated by the task of efficiently linking faint asteroid detections, but is applicable to a range of spatial queries. We survey current tree-based approaches, showing that a tradeoff exists between single-tree and multiple-tree algorithms. We then present a new type of multiple-tree algorithm that uses a variable number of trees to exploit the advantages of both approaches. We empirically show that this algorithm performs well using both simulated and astronomical data.
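For flavor, a minimal single-tree spatial query using scipy.spatial.cKDTree, a standard library structure; this illustrates the kind of query involved, not the variable multiple-tree algorithm of the paper. The detections and matching radius below are invented.

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(2)

    # Invented 2-D "detections" (e.g. sky positions at one epoch).
    detections = rng.uniform(0.0, 10.0, size=(10_000, 2))
    tree = cKDTree(detections)

    # Predicted positions from some hypothetical motion model.
    predicted = np.array([[1.2, 3.4], [7.7, 0.5]])

    # Single-tree query: all detections within a small matching radius of each
    # prediction.  Multiple-tree methods instead search several trees jointly.
    matches = tree.query_ball_point(predicted, r=0.1)
    for p, idx in zip(predicted, matches):
        print(p, "->", len(idx), "candidate detections")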
Data-Driven Online to Batch Conversions
Online learning algorithms are typically fast, memory efficient, and simple to implement. However, many common learning problems fit more naturally in the batch learning setting. The power of online learning algorithms can be exploited in batch settings by using online-to-batch conversion techniques, which build a new batch algorithm from an existing online algorithm. We first give a unified overview of three existing online-to-batch conversion techniques which do not use training data in the conversion process. We then build upon these data-independent conversions to derive and analyze data-driven conversions.
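One standard data-independent conversion is to run the online learner over the training set and output the average of its intermediate hypotheses. A small sketch with a perceptron as the online learner; the data and learner are illustrative, and the paper's data-driven conversions are not shown here.

    import numpy as np

    rng = np.random.default_rng(3)

    # Linearly separable toy data with labels in {-1, +1}.
    X = rng.normal(size=(500, 5))
    w_true = rng.normal(size=5)
    y = np.sign(X @ w_true)

    # Run a perceptron (an online learner) over the batch, one example at a
    # time, and keep every intermediate hypothesis.
    w = np.zeros(5)
    hypotheses = []
    for x_t, y_t in zip(X, y):
        if y_t * (w @ x_t) <= 0:
            w = w + y_t * x_t          # online update on a mistake
        hypotheses.append(w.copy())

    # Data-independent conversion: output the average of the online
    # hypotheses rather than just the last one.
    w_batch = np.mean(hypotheses, axis=0)
    print("training accuracy of averaged hypothesis:",
          np.mean(np.sign(X @ w_batch) == y))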
Inference with Minimal Communication: a Decision-Theoretic Variational Approach
Kreidl, O. P., Willsky, Alan S.
Given a directed graphical model with binary-valued hidden nodes and real-valued noisy observations, consider deciding upon the maximum a-posteriori (MAP) or the maximum posterior-marginal (MPM) assignment under the restriction that each node broadcasts only a single one-bit message to its children. We present a variational formulation, viewing the processing rules local to all nodes as degrees-of-freedom, that minimizes the loss in expected (MAP or MPM) performance subject to such online communication constraints. The approach leads to a novel message-passing algorithm to be executed offline, or before observations are realized, which mitigates the performance loss by iteratively coupling all rules in a manner implicitly driven by global statistics. We also provide (i) illustrative examples, (ii) assumptions that guarantee convergence and efficiency, and (iii) connections to active research areas.
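A toy illustration of the single-bit communication constraint on a two-node chain, using naive threshold rules at each node rather than the offline-optimized rules the paper derives; the noise model, prior, and flip probability below are invented.

    import numpy as np

    rng = np.random.default_rng(4)

    # Two-node chain: hidden parent x1 and child x2 in {0, 1}, real-valued
    # noisy observations, and a single one-bit message from parent to child.
    prior_x1 = 0.5
    p_flip = 0.2                  # P(x2 != x1)
    sigma = 1.0                   # observation noise std

    def likelihood(y, x):
        # Gaussian observation y ~ N(x, sigma^2)
        return np.exp(-0.5 * ((y - x) / sigma) ** 2)

    def run_trial():
        x1 = rng.integers(2)
        x2 = x1 if rng.random() > p_flip else 1 - x1
        y1 = x1 + sigma * rng.normal()
        y2 = x2 + sigma * rng.normal()

        # Parent rule: broadcast one bit, here a simple posterior threshold.
        post_x1 = likelihood(y1, 1) * prior_x1 / (
            likelihood(y1, 1) * prior_x1 + likelihood(y1, 0) * (1 - prior_x1))
        bit = int(post_x1 > 0.5)

        # Child rule: fuse its own observation with the bit, naively treating
        # the bit as if it were the parent's true state (a simplification).
        prior_x2 = (1 - p_flip) if bit == 1 else p_flip
        post_x2 = likelihood(y2, 1) * prior_x2 / (
            likelihood(y2, 1) * prior_x2 + likelihood(y2, 0) * (1 - prior_x2))
        return (int(post_x1 > 0.5) == x1) + (int(post_x2 > 0.5) == x2)

    correct = sum(run_trial() for _ in range(5000))
    print("fraction of node decisions correct:", correct / 10000)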