Optimal resampling for the noisy OneMax problem
Liu, Jialin, Fairbank, Michael, Pérez-Liébana, Diego, Lucas, Simon M.
The OneMax problem is a standard benchmark optimisation problem for a binary search space. Recent work on applying a Bandit-Based Random Mutation Hill-Climbing (RMHC) algorithm to the noisy OneMax problem showed that it is important to choose a good value for the resampling number, trading off taking more samples to reduce noise against taking fewer samples to reduce the total computational cost. This paper extends that observation by deriving an analytical expression for the running time of the RMHC algorithm with resampling applied to the noisy OneMax problem, and by showing both theoretically and empirically that the optimal resampling number increases with the number of dimensions in the search space.
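As a rough illustration of the resampling mechanism described in this abstract, the sketch below applies Random Mutation Hill-Climbing to a OneMax objective corrupted by additive Gaussian noise, evaluating each candidate a fixed number of times and averaging before the accept/reject decision. The noise model, parameter values and function names are illustrative assumptions, not taken from the paper, and details such as whether the incumbent is re-evaluated may differ from the algorithm analysed there.

```python
import random

def noisy_onemax(bits, noise_std=1.0):
    """OneMax fitness (number of ones) plus additive Gaussian noise (assumed noise model)."""
    return sum(bits) + random.gauss(0.0, noise_std)

def rmhc_with_resampling(n_dims=20, resamples=5, budget=20000, noise_std=1.0):
    """Random Mutation Hill-Climbing with resampling on a noisy OneMax objective.

    Each candidate is evaluated `resamples` times and the mean fitness is used
    for the accept/reject decision; every evaluation counts against the budget.
    """
    evals = 0

    def mean_fitness(bits):
        nonlocal evals
        total = 0.0
        for _ in range(resamples):
            total += noisy_onemax(bits, noise_std)
            evals += 1
        return total / resamples

    current = [random.randint(0, 1) for _ in range(n_dims)]
    current_est = mean_fitness(current)
    while evals < budget and sum(current) < n_dims:
        candidate = list(current)
        flip = random.randrange(n_dims)          # mutate one uniformly chosen bit
        candidate[flip] = 1 - candidate[flip]
        candidate_est = mean_fitness(candidate)
        if candidate_est >= current_est:         # accept ties, as in standard RMHC
            current, current_est = candidate, candidate_est
    return current, evals

if __name__ == "__main__":
    solution, used = rmhc_with_resampling()
    print(sum(solution), "ones found after", used, "fitness evaluations")
```

Increasing `resamples` makes each comparison more reliable but consumes the evaluation budget faster; the paper's analysis concerns how the optimal setting of this number grows with `n_dims`.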
The Local Optimality of Reinforcement Learning by Value Gradients, and its Relationship to Policy Gradient Learning
Fairbank, Michael, Alonso, Eduardo
In this theoretical paper we are concerned with the problem of learning a value function by a smooth general function approximator, to solve a deterministic episodic control problem in a large continuous state space. It is shown that learning the gradient of the value function at every point along a trajectory generated by a greedy policy is a sufficient condition for the trajectory to be locally extremal, and often locally optimal, and we argue that this brings greater efficiency to value-function learning. This contrasts with traditional value-function learning, in which the value function must be learnt over the whole of state space. It is also proven that policy-gradient learning applied to a greedy policy on a value function produces a weight update equivalent to a value-gradient weight update. This provides a surprising connection between these two alternative paradigms of reinforcement learning, and a convergence proof for control problems with a value function represented by a general smooth function approximator.
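The value-gradient condition described above can be connected to a standard recursion. For a deterministic system with dynamics \(s_{t+1}=f(s_t,a_t)\), reward \(r(s_t,a_t)\), discount \(\gamma\) and greedy actions \(a_t=\pi(s_t)\), differentiating the Bellman equation along the trajectory gives the following relation; the notation is generic and this is a sketch of the usual envelope-theorem argument, not necessarily the paper's exact formulation:

\[
G(s_t) \;=\; \frac{\partial r}{\partial s}\bigg|_{(s_t,a_t)} \;+\; \gamma\, G(s_{t+1})\, \frac{\partial f}{\partial s}\bigg|_{(s_t,a_t)},
\qquad G(s_t) := \frac{\partial V(s_t)}{\partial s_t},
\]

where the action-derivative term \(\frac{\partial r}{\partial a}+\gamma\,G(s_{t+1})\,\frac{\partial f}{\partial a}\) has vanished because the greedy action is an interior maximiser. Making the approximator's gradient \(\partial \tilde V/\partial s\) satisfy this relation at every point of a greedy trajectory is, in essence, a Pontryagin-style costate condition, which is the kind of local-extremality property the abstract refers to.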
Reinforcement Learning by Value Gradients
Fairbank, Michael
The concept of the value-gradient is introduced and developed in the context of reinforcement learning. It is shown that, by learning value-gradients, exploration or stochastic behaviour is no longer needed to find locally optimal trajectories. This is the main motivation for using value-gradients, and it is argued that learning value-gradients is the actual objective of any value-function learning algorithm for control problems. It is also argued that learning value-gradients is significantly more efficient than learning the values alone, and this argument is supported by experiments showing efficiency gains of several orders of magnitude in several problem domains. Once value-gradients are introduced into learning, several analyses become possible. For example, a surprising equivalence between a value-gradient learning algorithm and a policy-gradient learning algorithm is proven, and this provides a robust convergence proof for control problems using a value function with a general function approximator.
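To make the value-gradient idea concrete, here is a toy sketch under assumptions that are not in the abstract: a one-dimensional deterministic system, a quadratic value approximator, a greedy policy over a small action grid, and gradient targets obtained from the backward recursion sketched after the previous abstract. It illustrates the general style of algorithm (fitting dV/ds along greedy trajectories) rather than any specific algorithm analysed in the thesis.

```python
import numpy as np

GAMMA = 0.9
ACTIONS = np.linspace(-1.0, 1.0, 21)   # coarse action grid used by the greedy policy

def step(s, a):
    """Toy deterministic dynamics and reward: s' = s + a, r = -(s^2 + a^2)."""
    return s + a, -(s * s + a * a)

def v(s, w):
    """Smooth value approximator V(s; w) = w[0]*s^2 + w[1]*s."""
    return w[0] * s * s + w[1] * s

def v_grad_s(s, w):
    """Model value-gradient dV/ds and its derivative with respect to the weights."""
    return 2.0 * w[0] * s + w[1], np.array([2.0 * s, 1.0])

def greedy_rollout(s0, w, horizon=20):
    """Roll out the greedy policy induced by V(.; w)."""
    traj, s = [], s0
    for _ in range(horizon):
        a = max(ACTIONS, key=lambda a: step(s, a)[1] + GAMMA * v(step(s, a)[0], w))
        s_next, _ = step(s, a)
        traj.append((s, a))
        s = s_next
    return traj

def value_gradient_targets(traj):
    """Backward recursion G_t = dr/ds + gamma * G_{t+1} * df/ds along the trajectory
    (here df/ds = 1 and dr/ds = -2s); the terminal gradient is taken to be zero."""
    G, targets = 0.0, []
    for s, _ in reversed(traj):
        G = -2.0 * s + GAMMA * G
        targets.append(G)
    return list(reversed(targets))

def train(episodes=200, lr=0.01):
    """Fit the approximator's gradient (not its value) to targets along greedy trajectories."""
    w = np.zeros(2)
    for _ in range(episodes):
        traj = greedy_rollout(np.random.uniform(-2.0, 2.0), w)
        for (s, _), G_target in zip(traj, value_gradient_targets(traj)):
            G_model, dG_dw = v_grad_s(s, w)
            w -= lr * (G_model - G_target) * dG_dw
    return w

if __name__ == "__main__":
    print("learned weights:", train())
```

The update targets the derivative of the approximator, dV/ds, rather than V itself; one common intuition for the efficiency gains reported above is that, in higher-dimensional state spaces, each visited state then supplies a whole vector of learning signals rather than a single scalar.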