AITopics | sample inefficiency

Collaborating Authors

sample inefficiency

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Recurrent Hypernetworks are Surprisingly Strong in Meta-RL

Neural Information Processing SystemsDec-26-2025, 17:21:35 GMT

Deep reinforcement learning (RL) is notoriously impractical to deploy due to sample inefficiency. Meta-RL directly addresses this sample inefficiency by learning to perform few-shot learning when a distribution of related tasks is available for meta-training. While many specialized meta-RL methods have been proposed, recent work suggests that end-to-end learning in conjunction with an off-the-shelf sequential model, such as a recurrent network, is a surprisingly strong baseline. However, such claims have been controversial due to limited supporting evidence, particularly in the face of prior work establishing precisely the opposite. In this paper, we conduct an empirical investigation. While we likewise find that a recurrent network can achieve strong performance, we demonstrate that the use of hypernetworks is crucial to maximizing their potential. Surprisingly, when combined with hypernetworks, the recurrent baselines that are far simpler than existing specialized methods actually achieve the strongest performance of all methods evaluated. We provide code at https://github.com/jacooba/hyper.

electronic proceedings, name change, recurrent hypernetwork, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)

Add feedback

Recurrent Hypernetworks are Surprisingly Strong in Meta-RL

Neural Information Processing SystemsJan-19-2025, 21:34:52 GMT

meta-rl, recurrent hypernetwork, sample inefficiency, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.64)

Add feedback

Neural Sequence Model Training via $\alpha$-divergence Minimization

Koyamada, Sotetsu, Kikuchi, Yuta, Kanemura, Atsunori, Maeda, Shin-ichi, Ishii, Shin

arXiv.org Machine LearningJun-30-2017

We propose a new neural sequence model training method in which the objective function is defined by $\alpha$-divergence. We demonstrate that the objective function generalizes the maximum-likelihood (ML)-based and reinforcement learning (RL)-based objective functions as special cases (i.e., ML corresponds to $\alpha \to 0$ and RL to $\alpha \to1$). We also show that the gradient of the objective function can be considered a mixture of ML- and RL-based objective gradients. The experimental results of a machine translation task show that minimizing the objective function with $\alpha > 0$ outperforms $\alpha \to 0$, which corresponds to ML-based methods.

machine learning, natural language, objective function, (13 more...)

arXiv.org Machine Learning

1706.10031

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback