AITopics | plr

Collaborating Authors

plr

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

0e915db6326b6fb6a3c56546980a8c93-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 17:09:39 GMT

aired, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.94)
Leisure & Entertainment > Games (0.93)
Leisure & Entertainment > Sports > Motorsports > Formula One (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

6b7375226d4742ff910618a56ae72b7d-Paper-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 15:13:41 GMT

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

0e915db6326b6fb6a3c56546980a8c93-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 12:16:47 GMT

agent, aired, plr, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.94)
Leisure & Entertainment > Games (0.93)
Leisure & Entertainment > Sports > Motorsports > Formula One (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Replay-Guided Adversarial Environment Design

Neural Information Processing SystemsDec-23-2025, 18:34:30 GMT

Deep reinforcement learning (RL) agents may successfully generalize to new settings if trained on an appropriately diverse set of environment and task configurations. Unsupervised Environment Design (UED) is a promising self-supervised RL paradigm, wherein the free parameters of an underspecified environment are automatically adapted during training to the agent's capabilities, leading to the emergence of diverse training environments. Here, we cast Prioritized Level Replay (PLR), an empirically successful but theoretically unmotivated method that selectively samples randomly-generated training levels, as UED. We argue that by curating completely random levels, PLR, too, can generate novel and complex levels for effective training. This insight reveals a natural class of UED methods we call Dual Curriculum Design (DCD). Crucially, DCD includes both PLR and a popular UED algorithm, PAIRED, as special cases and inherits similar theoretical guarantees. This connection allows us to develop novel theory for PLR, providing a version with a robustness guarantee at Nash equilibria. Furthermore, our theory suggests a highly counterintuitive improvement to PLR: by stopping the agent from updating its policy on uncurated levels (training on less data), we can improve the convergence to Nash equilibria. Indeed, our experiments confirm that our new method, PLR$^{\perp}$, obtains better results on a suite of out-of-distribution, zero-shot transfer tasks, in addition to demonstrating that PLR$^{\perp}$ improves the performance of PAIRED, from which it inherited its theoretical framework.

name change, plr, replay-guided adversarial environment design, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

Add feedback

TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design

Cho, Geonwoo, Im, Jaegyun, Lee, Jihwan, Yi, Hojun, Kim, Sejin, Kim, Sundong

arXiv.org Artificial IntelligenceDec-4-2025

Generalizing deep reinforcement learning agents to unseen environments remains a significant challenge. One promising solution is Unsupervised Environment Design (UED), a co-evolutionary framework in which a teacher adaptively generates tasks with high learning potential, while a student learns a robust policy from this evolving curriculum. Existing UED methods typically measure learning potential via regret, the gap between optimal and current performance, approximated solely by value-function loss. Building on these approaches, we introduce the transition-prediction error as an additional term in our regret approximation. To capture how training on one task affects performance on others, we further propose a lightweight metric called Co-Learnability. By combining these two measures, we present Transition-aware Regret Approximation with Co-learnability for Environment Design (TRACED). Empirical evaluations show that TRACED produces curricula that improve zero-shot generalization over strong baselines across multiple benchmarks. Ablation studies confirm that the transition-prediction error drives rapid complexity ramp-up and that Co-Learnability delivers additional gains when paired with the transition-prediction error. These results demonstrate how refined regret approximation and explicit modeling of task relationships can be leveraged for sample-efficient curriculum design in UED. Project Page: https://geonwoo.me/traced/

machine learning, natural language, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2506.19997

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

6b7375226d4742ff910618a56ae72b7d-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 05:09:08 GMT

Nevertheless, the following questions still remain very relevant: 1. Large LRs are preferred but how large are we talking about? 2. What are the key characteristics of the models trained with different LRs?

accuracy, fine-tuning, regime, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

22fb0cee7e1f3bde58293de743871417-Reviews.html

Neural Information Processing SystemsOct-3-2025, 07:39:07 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The authors consider associative learning in networks of spiking neurons, and argue that a form of STDP with postsynaptic hyper-polarization is equivalent to the perceptron learning algorithm. The basic form of STDP proposed by the authors relies on traces (similarly to Morrison, Diesmann & Gerstner, "Phenomenological models of synaptic plasticity based on spike timing", Biol Cybern, 2008, 98, 459-478, which should have been mentioned here), and allows for both potentiation and depression of the synapse. The authors then introduce the perceptron learning rule (PLR) for binary variables, in a form where the weighted sum of inputs is compared to a threshold in order to determine the update. As is well known, the PLR is a supervised learning algorithm requiring a target to be specified at the post-synaptic site.

learning, modification, spike, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Nevada (0.04)

Genre: Summary/Review (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)

Add feedback

Task Scheduling & Forgetting in Multi-Task Reinforcement Learning

Speckmann, Marc, Eimer, Theresa

arXiv.org Artificial IntelligenceMar-3-2025

Reinforcement learning (RL) agents can forget tasks they have previously been trained on. There is a rich body of work on such forgetting effects in humans. Therefore we look for commonalities in the forgetting behavior of humans and RL agents across tasks and test the viability of forgetting prevention measures from learning theory in RL. W e find that in many cases, RL agents exhibit forgetting curves similar to those of humans. Methods like Leitner or SuperMemo have been shown to be effective at counteracting human forgetting, but we demonstrate they do not transfer as well to RL. W e identify a likely cause: asymmetrical learning and retention patterns between tasks that cannot be captured by retention-based or performance-based curriculum strategies.

curriculum, evaluation reward, scheduling, (13 more...)

arXiv.org Artificial Intelligence

2503.01941

Country: Asia > Middle East > Jordan (0.05)

Genre: Research Report (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Where Do Large Learning Rates Lead Us?

Sadrtdinov, Ildus, Kodryan, Maxim, Pokonechny, Eduard, Lobacheva, Ekaterina, Vetrov, Dmitry

arXiv.org Machine LearningOct-29-2024

It is generally accepted that starting neural networks training with large learning rates (LRs) improves generalization. Following a line of research devoted to understanding this effect, we conduct an empirical study in a controlled setting focusing on two questions: 1) how large an initial LR is required for obtaining optimal quality, and 2) what are the key differences between models trained with different LRs? We discover that only a narrow range of initial LRs slightly above the convergence threshold lead to optimal results after fine-tuning with a small LR or weight averaging. By studying the local geometry of reached minima, we observe that using LRs from this optimal range allows for the optimization to locate a basin that only contains high-quality minima. Additionally, we show that these initial LRs result in a sparse set of learned features, with a clear focus on those most relevant for the task. In contrast, starting training with too small LRs leads to unstable minima and attempts to learn all features simultaneously, resulting in poor generalization. Conversely, using initial LRs that are too large fails to detect a basin with good solutions and extract meaningful patterns from the data.

accuracy, fine-tuning, regime, (14 more...)

arXiv.org Machine Learning

2410.22113

Country: