Theoretical Analysis of Learning with Reward-Modulated Spike-Timing-Dependent Plasticity

Neural Information Processing Systems

Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how local learning rules at single synapses support behaviorally relevant adaptive changes in complex networks of spiking neurons. However, the potential and limitations of this learning rule could so far only be tested through computer simulations. This article provides tools for an analytic treatment of reward-modulated STDP, which allow us to predict under which conditions reward-modulated STDP will be able to achieve a desired learning effect. In particular, we can produce in this way a theoretical explanation and a computer model for a fundamental experimental finding on biofeedback in monkeys (reported in [1]).
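
As a rough illustration of how such a rule is typically structured, the sketch below gates STDP-style weight updates by a reward signal through an exponentially decaying eligibility trace. All parameter names and values are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

# Minimal sketch of reward-modulated STDP for one synapse. An eligibility
# trace accumulates raw STDP updates; weight changes occur only when a
# reward signal arrives. All constants are assumed for illustration.

A_PLUS, A_MINUS = 0.01, 0.012   # STDP amplitudes (assumed values)
TAU_STDP = 20.0                 # STDP time constant (ms)
TAU_E = 500.0                   # eligibility-trace time constant (ms)
DT = 1.0                        # simulation step (ms)

def stdp_kernel(delta_t):
    """Raw STDP weight change for a pre/post spike-time difference."""
    if delta_t > 0:   # pre before post: potentiation
        return A_PLUS * np.exp(-delta_t / TAU_STDP)
    else:             # post before pre: depression
        return -A_MINUS * np.exp(delta_t / TAU_STDP)

def step(w, trace, delta_t, reward, lr=0.1):
    """One step: decay the trace, add any new spike pairing, and apply
    the reward-gated weight update."""
    trace *= np.exp(-DT / TAU_E)          # eligibility trace decays
    if delta_t is not None:               # a pre/post pairing occurred
        trace += stdp_kernel(delta_t)
    w += lr * reward * trace              # reward gates the plasticity
    return w, trace
```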


Robust Regression with Twinned Gaussian Processes

Neural Information Processing Systems

We propose a Gaussian process (GP) framework for robust inference in which a GP prior on the mixing weights of a two-component noise model augments the standard process over latent function values. This approach is a generalization of the mixture likelihood used in traditional robust GP regression, and a specialization of the GP mixture models suggested by Tresp (2000) and Rasmussen and Ghahramani (2002). The value of this restriction is in its tractable expectation propagation updates, which allow for faster inference and model selection, and better convergence than the standard mixture. An additional benefit over the latter method lies in our ability to incorporate knowledge of the noise domain to influence predictions, and to recover, along with the predictive distribution, information about the outlier distribution via the gating process. The model has asymptotic complexity equal to that of conventional robust methods, but yields more confident predictions on benchmark problems than classical heavy-tailed models, and exhibits improved stability for data with clustered corruptions, on which the latter fail altogether. We show further how our approach can be used without adjustment for more smoothly heteroscedastic data, and suggest how it could be extended to more general noise models. We also address similarities with the work of Goldberg et al. (1998), and the more recent contributions of Tresp, and Rasmussen and Ghahramani.
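
The generative structure can be sketched in a few lines: one GP draws the latent function, and a second GP drives a gate that mixes a narrow and a wide noise component. This toy sample (assumed RBF kernel, assumed noise scales) only illustrates the model; the paper's EP inference is not reproduced here.

```python
import numpy as np

# Toy generative sketch of a two-component noise model with a GP-gated
# mixing weight. Kernel choice and noise levels are assumptions.

def rbf_kernel(x, lengthscale=1.0, variance=1.0):
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0, 5, 50)
K = rbf_kernel(x) + 1e-8 * np.eye(len(x))

f = rng.multivariate_normal(np.zeros(len(x)), K)   # latent function GP
g = rng.multivariate_normal(np.zeros(len(x)), K)   # gating GP
p_outlier = 1.0 / (1.0 + np.exp(-g))               # squashed gate

is_outlier = rng.random(len(x)) < p_outlier
noise = np.where(is_outlier,
                 rng.normal(0.0, 3.0, len(x)),     # wide "outlier" noise
                 rng.normal(0.0, 0.1, len(x)))     # regular noise
y = f + noise
```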


Stable Dual Dynamic Programming

Neural Information Processing Systems

Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions instead of value functions. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.
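
To make the dual view concrete, the sketch below evaluates a fixed policy on a toy two-state chain by iterating its stationary distribution rather than a value function; each iterate remains a proper probability vector, which is the kind of stability the dual representation affords. The transition matrix and rewards are invented for illustration.

```python
import numpy as np

# Illustrative sketch of the "dual" view: evaluate a fixed policy by
# maintaining its stationary state distribution instead of a value
# function. P and r are toy assumptions, not from the paper.

def stationary_distribution(P, n_iter=1000):
    """Power iteration on the chain; every iterate stays a valid
    probability distribution over states."""
    d = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n_iter):
        d = d @ P
        d /= d.sum()   # renormalize to guard against numerical drift
    return d

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])    # toy 2-state chain under a fixed policy
r = np.array([1.0, 0.0])      # toy per-state rewards

d = stationary_distribution(P)
print("average reward:", d @ r)
```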


Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains

Neural Information Processing Systems

Simulated annealing is a popular method for approaching the solution of a global optimization problem. Existing results on its performance apply to discrete combinatorial optimization, where the optimization variables can assume only a finite set of possible values. We introduce a new general formulation of simulated annealing which allows one to guarantee finite-time performance in the optimization of functions of continuous variables. The results hold universally for any optimization problem on a bounded domain and establish a connection between simulated annealing and up-to-date theory of convergence of Markov chain Monte Carlo methods on continuous domains. Optimization is the general problem of finding a value of a vector of variables θ that maximizes (or minimizes) some scalar criterion U(θ).
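
For readers unfamiliar with the continuous setting, here is a bare-bones annealer on a bounded domain: Metropolis proposals accepted with temperature-scaled probability under a logarithmic cooling schedule. The objective, schedule, and step size are assumptions for illustration; the paper's contribution is the finite-time guarantee, not this loop.

```python
import numpy as np

# Minimal continuous-domain simulated annealing sketch (maximization).
# Objective, cooling schedule, and proposal scale are assumed.

def U(theta):
    return -np.sum(theta ** 2)   # toy criterion to maximize

def anneal(theta0, n_steps=5000, step=0.1, t0=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    theta, best = theta0.copy(), theta0.copy()
    for k in range(1, n_steps + 1):
        temp = t0 / np.log(k + 1)                 # logarithmic cooling
        prop = theta + step * rng.normal(size=theta.shape)
        prop = np.clip(prop, -1.0, 1.0)           # stay in bounded domain
        # Metropolis acceptance: always take improvements, sometimes not
        if np.log(rng.random()) < (U(prop) - U(theta)) / temp:
            theta = prop
        if U(theta) > U(best):
            best = theta.copy()
    return best
```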


On Ranking in Survival Analysis: Bounds on the Concordance Index

Neural Information Processing Systems

In this paper, we show that classical survival analysis involving censored data can naturally be cast as a ranking problem. The concordance index (CI), which quantifies the quality of rankings, is the standard performance measure for model \emph{assessment} in survival analysis. In contrast, the standard approach to \emph{learning} the popular proportional hazard (PH) model is based on Cox's partial likelihood. We devise two bounds on the CI, one of which emerges directly from the properties of PH models, and optimize them \emph{directly}. Our experimental results suggest that both methods perform about equally well, with our new approach giving slightly better results than Cox's method. We also explain why a method designed to maximize Cox's partial likelihood also ends up (approximately) maximizing the CI.
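
The CI itself is simple to state in code: over all comparable pairs (the earlier observed time must be an actual event, not a censoring), count how often the model ranks the higher-risk subject first. The sketch below follows this standard definition, with ties credited half a point.

```python
# Sketch of the concordance index (CI) for right-censored survival data.
# A pair (i, j) is comparable when the shorter observed time is an
# actual event; the CI is the fraction of comparable pairs the model
# orders correctly (higher risk -> shorter survival).

def concordance_index(times, events, risk):
    """times: observed times; events: 1 if event, 0 if censored;
    risk: model risk scores (higher = expected earlier event)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable
```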


Local Algorithms for Approximate Inference in Minor-Excluded Graphs

Neural Information Processing Systems

We present a new local approximation algorithm for computing the MAP assignment and the log-partition function of an arbitrary exponential family distribution represented by a finite-valued pairwise Markov random field (MRF), say G. Our algorithm is based on decomposing G into appropriately chosen small components, computing estimates locally in each of these components, and then producing a good global solution. We prove that the algorithm can provide an approximate solution within arbitrary accuracy when G excludes some finite-sized graph as its minor and G has bounded degree; all planar graphs with bounded degree are examples of such graphs. The running time of the algorithm is Θ(n) (n is the number of nodes in G), with the constant depending on the accuracy, the degree of the graph, and the size of the graph that is excluded as a minor (a constant for planar graphs). Our algorithm for minor-excluded graphs uses the decomposition scheme of Klein, Plotkin and Rao (1993). In general, our algorithm works with any decomposition scheme and provides a quantifiable approximation guarantee that depends on the decomposition scheme.
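
The decompose-then-solve pattern can be caricatured in a few lines: split the MRF into small components, brute-force the MAP of each, and concatenate the pieces. The sketch below simply drops the edges between components, which is exactly the error the paper's decomposition-based analysis controls; the input format is assumed.

```python
import itertools
import numpy as np

# Heavily simplified sketch of decompose-and-solve-locally for a binary
# pairwise MRF. Cross-component edges are ignored here; the paper's
# Klein-Plotkin-Rao decomposition is what makes such neglect safe.

def local_map(node_pot, edge_pot, components):
    """node_pot[i][s]: node potentials; edge_pot[(i, j)][s, t]: edge
    potentials; components: list of lists of node indices."""
    assignment = {}
    for comp in components:
        best, best_score = None, -np.inf
        # brute-force MAP within the small component
        for labels in itertools.product(range(2), repeat=len(comp)):
            lab = dict(zip(comp, labels))
            score = sum(node_pot[i][lab[i]] for i in comp)
            score += sum(edge_pot[(i, j)][lab[i], lab[j]]
                         for (i, j) in edge_pot
                         if i in lab and j in lab)
            if score > best_score:
                best, best_score = lab, score
        assignment.update(best)
    return assignment
```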


Fitted Q-iteration in continuous action-space MDPs

Neural Information Processing Systems

We consider continuous-state, continuous-action batch reinforcement learning, where the goal is to learn a good policy from a sufficiently rich trajectory generated by another policy. We study a variant of fitted Q-iteration, where the greedy action selection is replaced by searching for a policy in a restricted set of candidate policies by maximizing the average action values. We provide a rigorous theoretical analysis of this algorithm, proving what we believe are the first finite-time bounds for value-function-based algorithms for continuous state- and action-space problems.
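
A minimal sketch of this variant follows: each iteration regresses a Q-function on Bellman targets, but the backup uses the candidate policy with the highest average action value rather than a pointwise greedy max. The regressor, candidate-policy interface, and discount factor are assumed for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Sketch of fitted Q-iteration with policy search over a finite set of
# candidate policies instead of a greedy max over actions.

def fitted_q_iteration(batch, policies, gamma=0.95, n_iter=30):
    """batch: list of (s, a, r, s') with s, a as 1-D arrays;
    policies: list of functions mapping a state to an action."""
    S = np.array([np.concatenate([s, a]) for s, a, _, _ in batch])
    R = np.array([r for _, _, r, _ in batch])
    q = ExtraTreesRegressor(n_estimators=50).fit(S, R)   # Q_1 = r
    for _ in range(n_iter):
        # pick the candidate policy with the best average action value
        def avg_value(pi):
            X = np.array([np.concatenate([s2, pi(s2)])
                          for _, _, _, s2 in batch])
            return q.predict(X).mean()
        pi = max(policies, key=avg_value)
        # regress towards the one-step Bellman backup under pi
        X_next = np.array([np.concatenate([s2, pi(s2)])
                           for _, _, _, s2 in batch])
        y = R + gamma * q.predict(X_next)
        q = ExtraTreesRegressor(n_estimators=50).fit(S, y)
    return q, pi
```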


A Game-Theoretic Approach to Apprenticeship Learning

Neural Information Processing Systems

We study the problem of an apprentice learning to behave in an environment with an unknown reward function by observing the behavior of an expert. We follow on the work of Abbeel and Ng [1], who considered a framework in which the true reward function is assumed to be a linear combination of a set of known and observable features. We give a new algorithm that, like theirs, is guaranteed to learn a policy that is nearly as good as the expert's, given enough examples. However, unlike their algorithm, we show that ours may produce a policy that is substantially better than the expert's. Moreover, our algorithm is computationally faster, is easier to implement, and can be applied even in the absence of an expert. The method is based on a game-theoretic view of the problem, which leads naturally to a direct application of the multiplicative-weights algorithm of Freund and Schapire [2] for playing repeated matrix games. In addition to our formal presentation and analysis of the new algorithm, we sketch how the method can be applied when the transition function itself is unknown, and we provide an experimental demonstration of the algorithm on a toy video-game environment.
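
The game-theoretic reduction has a compact shape: the learner best-responds to the current reward weights, and the adversary shifts weight (via multiplicative updates) onto features where the learner still trails the expert. The sketch below assumes the caller supplies candidate policies and a feature_expectations oracle (e.g., from an MDP solver); all names and the step size are illustrative.

```python
import numpy as np

# Sketch of the game-theoretic reduction with multiplicative weights
# over feature coordinates. feature_expectations and policies are
# placeholders the reader must supply (e.g., via an MDP solver).

def apprenticeship_game(mu_expert, policies, feature_expectations,
                        n_rounds=100):
    """mu_expert: expert feature expectations (k-vector);
    feature_expectations(pi): k-vector for a candidate policy pi."""
    k = len(mu_expert)
    w = np.ones(k) / k
    eta = np.sqrt(np.log(k) / n_rounds)     # standard MW step size
    mixed = []
    for _ in range(n_rounds):
        # learner best-responds to the current reward weights
        pi = max(policies, key=lambda p: w @ feature_expectations(p))
        mixed.append(pi)
        # adversary shifts weight towards features where the learner
        # still falls short of the expert
        loss = feature_expectations(pi) - mu_expert
        w *= np.exp(-eta * loss)
        w /= w.sum()
    return mixed   # the mixture of these policies is the output
```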