Lifelong Credit Assignment with the Success-Story Algorithm

Schmidhuber, Juergen (The Swiss AI Lab IDSIA, University of Lugano, and SUPSI)

AAAI Conferences 

Consider an embedded agent with a self-modifying, Turing-equivalent policy that can change only through active self-modifications. How can we make sure that it learns to continually accelerate reward intake? Throughout its life the agent remains ready to undo any self-modification generated at any earlier point of its life, provided the reward per unit time since then has not increased, thus enforcing a lifelong success story of self-modifications, each followed by long-term reward acceleration up to the present time. The stack-based method for enforcing this is called the success-story algorithm. It fully takes into account that early self-modifications set the stage for later ones (learning a learning algorithm), and automatically learns to extend self-evaluations until the collected reward statistics are reliable ... a very simple but general method waiting to be re-discovered! Time permitting, I will also briefly discuss more recent mathematically optimal universal maximizers of lifelong reward, in particular, the fully self-referential Goedel machine.
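The stack-based check described above can be sketched as follows. This is a minimal illustrative reconstruction, not Schmidhuber's original code: the stack entries, the `undo_fn` callbacks, and the function names are assumptions chosen for clarity. Each stack entry records when a self-modification happened, the cumulative reward at that moment, and a way to undo it; the check pops (undoes) entries from the top until each surviving entry has yielded a higher reward per unit time than the entry below it (or than the agent's lifelong average, for the oldest entry).

```python
# Illustrative sketch of the success-story algorithm (SSA) stack check.
# Entry format and undo callbacks are assumptions, not the original code.

def reward_rate(total_reward, reward_then, now, then):
    """Reward per unit time accumulated since a checkpoint."""
    return (total_reward - reward_then) / (now - then)

def ssa_check(stack, total_reward, now):
    """Undo self-modifications that break the lifelong success story.

    `stack` holds (time, reward_at_time, undo_fn) tuples, oldest first.
    The topmost entry survives only if the reward rate since it exceeds
    the rate since the entry below it (or since birth, time 0, for the
    oldest entry); otherwise it is undone and popped, and the check
    repeats on the remaining stack.
    """
    while stack:
        t_top, r_top, undo_fn = stack[-1]
        if len(stack) > 1:
            t_prev, r_prev, _ = stack[-2]
        else:
            t_prev, r_prev = 0, 0.0  # lifelong baseline: the agent's birth
        if reward_rate(total_reward, r_top, now, t_top) > \
           reward_rate(total_reward, r_prev, now, t_prev):
            break        # success story holds for all remaining entries
        undo_fn()        # revert this self-modification
        stack.pop()      # and re-examine the next one down
```

Note that popping one entry can invalidate the one below it, which is why the check loops until the whole remaining stack forms a chain of strictly increasing reward rates, as the abstract's success-story criterion requires.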
