AITopics | visr

Collaborating Authors

visr

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Tackling the Zero-Shot Reinforcement Learning Loss Directly

Ollivier, Yann

arXiv.org Artificial IntelligenceFeb-15-2025

Zero-shot reinforcement learning (RL) methods aim at instantly producing a behavior for an RL task in a given environment, from a description of the reward function. These methods are usually tested by evaluating their average performance on a series of downstream tasks. Yet they cannot be trained directly for that objective, unless the distribution of downstream tasks is known. Existing approaches either use other learning criteria [BBQ+ 18, TRO23, TO21, HDB+ 19], or explicitly set a prior on downstream tasks, such as reward functions given by a random neural network [FPAL24]. Here we prove that the zero-shot RL loss can be optimized directly, for a range of non-informative priors such as white noise rewards, temporally smooth rewards, ``scattered'' sparse rewards, or a combination of those. Thus, it is possible to learn the optimal zero-shot features algorithmically, for a wide mixture of priors. Surprisingly, the white noise prior leads to an objective almost identical to the one in VISR [HDB+19], via a different approach. This shows that some seemingly arbitrary choices in VISR, such as Von Mises--Fisher distributions, do maximize downstream performance. This also suggests more efficient ways to tackle the VISR objective. Finally, we discuss some consequences and limitations of the zero-shot RL objective, such as its tendency to produce narrow optimal features if only using Gaussian dense reward priors.

large language model, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2502.10792

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Unsupervised Discovery of Continuous Skills on a Sphere

Imagawa, Takahisa, Hiraoka, Takuya, Tsuruoka, Yoshimasa

arXiv.org Artificial IntelligenceMay-25-2023

Recently, methods for learning diverse skills to generate various behaviors without external rewards have been actively studied as a form of unsupervised reinforcement learning. However, most of the existing methods learn a finite number of discrete skills, and thus the variety of behaviors that can be exhibited with the learned skills is limited. In this paper, we propose a novel method for learning potentially an infinite number of different skills, which is named discovery of continuous skills on a sphere (DISCS). In DISCS, skills are learned by maximizing mutual information between skills and states, and each skill corresponds to a continuous value on a sphere. Because the representations of skills in DISCS are continuous, infinitely diverse skills could be learned. We examine existing methods and DISCS in the MuJoCo Ant robot control environments and show that DISCS can learn much more diverse skills than the other methods.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2305.14377

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Fast Task Inference with Variational Intrinsic Successor Features

Hansen, Steven, Dabney, Will, Barreto, Andre, Van de Wiele, Tom, Warde-Farley, David, Mnih, Volodymyr

arXiv.org Artificial IntelligenceJun-12-2019

It has been established that diverse behaviors spanning the controllable subspace of an Markov decision process can be trained by rewarding a policy for being distinguishable from other policies \citep{gregor2016variational, eysenbach2018diversity, warde2018unsupervised}. However, one limitation of this formulation is generalizing behaviors beyond the finite set being explicitly learned, as is needed for use on subsequent tasks. Successor features \citep{dayan93improving, barreto2017successor} provide an appealing solution to this generalization problem, but require defining the reward function as linear in some grounded feature space. In this paper, we show that these two techniques can be combined, and that each method solves the other's primary limitation. To do so we introduce Variational Intrinsic Successor FeatuRes (VISR), a novel algorithm which learns controllable features that can be leveraged to provide enhanced generalization and fast task inference through the successor feature framework. We empirically validate VISR on the full Atari suite, in a novel setup wherein the rewards are only exposed briefly after a long unsupervised phase. Achieving human-level performance on 14 games and beating all baselines, we believe VISR represents a step towards agents that rapidly learn from limited feedback.

machine learning, reinforcement learning, visr, (14 more...)

arXiv.org Artificial Intelligence

1906.0503

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback