
Learning to Shape In-distribution Feature Space for Out-of-distribution Detection

Neural Information Processing Systems

Out-of-distribution (OOD) detection is critical for deploying machine learning models in the open world. To design scoring functions that discern OOD data from in-distribution (ID) cases using a pre-trained discriminative model, existing methods tend to make rigorous distributional assumptions, either explicitly or implicitly, because the learned feature space is not known in advance. The mismatch between the learned and assumed distributions motivates us to raise a fundamental yet under-explored question: is it possible to deterministically model the feature distribution while pre-training a discriminative model? This paper gives an affirmative answer to this question by presenting a Distributional Representation Learning (DRL) framework for OOD detection. In particular, DRL explicitly enforces the underlying feature space to conform to a pre-defined mixture distribution, together with an online approximation of normalization constants to enable end-to-end training. Furthermore, we formulate DRL as a provably convergent Expectation-Maximization algorithm to avoid trivial solutions and rearrange the sequential sampling to guide training consistency. Extensive evaluations across mainstream OOD detection benchmarks empirically demonstrate the superiority of the proposed DRL over its advanced counterparts.
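The abstract does not spell out the assumption-based scoring it improves on; a minimal sketch of that style of baseline, assuming class-conditional Gaussian features with a shared covariance (the Mahalanobis score, not this paper's learned distribution):

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Fit one Gaussian per class with a shared covariance -- a common
    simplifying distributional assumption, not DRL's learned mixture."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([features[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(features)
    prec = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return means, prec

def ood_score(x, means, prec):
    """Squared Mahalanobis distance to the nearest class mean;
    higher values indicate more OOD-like inputs."""
    return min((x - mu) @ prec @ (x - mu) for mu in means.values())

# Toy 4-d features: two well-separated classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 4)) + np.repeat([[0.0] * 4, [5.0] * 4], 100, axis=0)
labels = np.repeat([0, 1], 100)
means, prec = fit_class_gaussians(feats, labels)
id_score = ood_score(feats[0], means, prec)        # an ID sample
far_score = ood_score(np.full(4, 20.0), means, prec)  # an OOD-like point
```

When the true feature distribution deviates from this Gaussian assumption, the score degrades, which is exactly the mismatch the paper targets.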


An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Neural Information Processing Systems

Deep reinforcement learning (DRL) algorithms and evolution strategies (ES) have been applied to various tasks with excellent performance. The two have opposite properties: DRL offers good sample efficiency but poor stability, while ES offers the reverse. Recent attempts to combine these algorithms rely entirely on a synchronous update scheme, which is not ideal for maximizing the benefits of the parallelism in ES. To address this challenge, an asynchronous update scheme was introduced, which enables good time efficiency and diverse policy exploration. In this paper, we introduce Asynchronous Evolution Strategy-Reinforcement Learning (AES-RL), which maximizes the parallel efficiency of ES and integrates it with policy gradient methods. Specifically, we propose 1) a novel framework to merge ES and DRL asynchronously and 2) various asynchronous update methods that combine the advantages of asynchronism, ES, and DRL: exploration and time efficiency, stability, and sample efficiency, respectively. The proposed framework and update methods are evaluated on continuous control benchmarks, showing superior performance and time efficiency compared to previous methods.




reviewers, that we will make an implementation of our work available upon publication

Neural Information Processing Systems

We are glad that our reviewers agree on the merits and relevance of our work. R3/R4: Applying Freeze-Thaw BO in the settings considered. See Figure 1 for further illustration of why FT struggles in DRL settings. Fabolas uses a different way of obtaining low-fidelity information. R3: Sec 3.2 and 3.3 should be reversed as Sec 3.2 makes reference to Eq (7).


On The Presence of Double-Descent in Deep Reinforcement Learning

Veselý, Viktor, Todorov, Aleksandar, Sabatelli, Matthias

arXiv.org Machine Learning

The double descent (DD) paradox, where over-parameterized models see generalization improve past the interpolation point, remains largely unexplored in the non-stationary domain of Deep Reinforcement Learning (DRL). We present preliminary evidence that DD exists in model-free DRL, investigating it systematically across varying model capacity using the Actor-Critic framework. We rely on an information-theoretic metric, Policy Entropy, to measure policy uncertainty throughout training. Preliminary results show a clear epoch-wise DD curve; the policy's entrance into the second descent region correlates with a sustained, significant reduction in Policy Entropy. This entropic decay suggests that over-parameterization acts as an implicit regularizer, guiding the policy towards robust, flatter minima in the loss landscape. These findings establish DD as a factor in DRL and provide an information-based mechanism for designing agents that are more general, transferable, and robust.
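The Policy Entropy metric the abstract relies on is standard; a minimal sketch for a categorical policy parameterized by logits (the paper's exact estimator may differ):

```python
import numpy as np

def policy_entropy(logits):
    """Shannon entropy of a categorical policy given its logits.
    A sustained drop over training signals increasing policy certainty."""
    z = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

uniform = policy_entropy(np.zeros(4))                    # maximally uncertain
peaked = policy_entropy(np.array([10.0, 0.0, 0.0, 0.0])) # nearly deterministic
```

The entropy ranges from log(num_actions) for a uniform policy down to 0 for a deterministic one, which is what makes its decay in the second descent region interpretable.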


Improved Robustness of Deep Reinforcement Learning for Control of Time-Varying Systems by Bounded Extremum Seeking

Saxena, Shaifalee, Williams, Alan, Fierro, Rafael, Scheinker, Alexander

arXiv.org Artificial Intelligence

In this paper, we study the use of robust, model-independent bounded extremum seeking (ES) feedback control to improve the robustness of deep reinforcement learning (DRL) controllers for a class of nonlinear time-varying systems. DRL has the potential to learn from large datasets to quickly control or optimize the outputs of many-parameter systems, but its performance degrades catastrophically when the system model changes rapidly over time. Bounded ES can handle time-varying systems with unknown control directions, but its convergence slows as the number of tuned parameters increases and, like all local adaptive methods, it can get stuck in local minima. We demonstrate that together, DRL and bounded ES result in a hybrid controller whose performance exceeds the sum of its parts, with DRL taking advantage of historical data to learn how to quickly drive a many-parameter system to a desired setpoint, while bounded ES ensures robustness to time variations. We present a numerical study of a general time-varying system and a combined ES-DRL controller for automatic tuning of the Low Energy Beam Transport section at the Los Alamos Neutron Science Center linear particle accelerator.
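A minimal sketch of the bounded ES update law the abstract builds on, in the Scheinker-style dithering form where each parameter oscillates at a distinct frequency and the measured cost modulates the phase. The constants and the static quadratic cost are illustrative only; the paper's setting has a time-varying system and different tuning.

```python
import numpy as np

def bounded_es_step(theta, cost, t, dt, omegas, alpha=1.0, k=1.0):
    """One Euler step of bounded extremum seeking:
    theta_dot_i = sqrt(alpha * omega_i) * cos(omega_i * t + k * C(theta)).
    The update magnitude is bounded regardless of the cost value; on
    average the dynamics drift downhill, theta_dot ~ -(k*alpha/2) * grad C."""
    c = cost(theta)
    return theta + dt * np.sqrt(alpha * omegas) * np.cos(omegas * t + k * c)

# Hypothetical cost with a minimum at [1, -1] (a real application would
# only have noisy measurements of C, never its gradient or model).
def cost(theta):
    return np.sum((theta - np.array([1.0, -1.0])) ** 2)

theta = np.zeros(2)
omegas = np.array([60.0, 75.0])    # distinct dither frequencies per parameter
dt = 0.001
for step in range(50000):
    theta = bounded_es_step(theta, cost, step * dt, dt, omegas, alpha=1.0, k=4.0)
```

Because only the measured cost enters the update, no model or control direction is needed, which is what makes the method a natural safeguard for a DRL controller when the underlying system drifts.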



The 1st International Workshop on Disentangled Representation Learning for Controllable Generation (DRL4Real): Methods and Results

Chen, Qiuyu, Jin, Xin, Song, Yue, Liu, Xihui, Yang, Shuai, Yang, Tao, Li, Ziqiang, Huang, Jianguo, Wei, Yuntao, Xie, Ba'ao, Sebe, Nicu, Zeng, Wenjun, Yun, Jooyeol, Abati, Davide, Omran, Mohamed, Choo, Jaegul, Habibian, Amir, Wiggers, Auke, Kobayashi, Masato, Ding, Ning, Tamaki, Toru, Gheisari, Marzieh, Genovesio, Auguste, Chen, Yuheng, Liu, Dingkun, Yang, Xinyao, Xu, Xinping, Chen, Baicheng, Wu, Dongrui, Geng, Junhao, Lv, Lexiang, Lin, Jianxin, Liang, Hanzhe, Zhou, Jie, Chen, Xuanxin, Wang, Jinbao, Gao, Can, Wang, Zhangyi, Li, Zongze, Wen, Bihan, Gao, Yixin, Pan, Xiaohan, Li, Xin, Chen, Zhibo, Peng, Baorui, Chen, Zhongming, Jin, Haoran

arXiv.org Artificial Intelligence

This paper reviews the 1st International Workshop on Disentangled Representation Learning for Controllable Generation (DRL4Real), held in conjunction with ICCV 2025. The workshop aimed to bridge the gap between the theoretical promise of Disentangled Representation Learning (DRL) and its application in realistic scenarios, moving beyond synthetic benchmarks. DRL4Real focused on evaluating DRL methods in practical applications such as controllable generation, exploring advancements in model robustness, interpretability, and generalization. The workshop accepted 9 papers covering a broad range of topics, including the integration of novel inductive biases (e.g., language), the application of diffusion models to DRL, 3D-aware disentanglement, and the expansion of DRL into specialized domains like autonomous driving and EEG analysis. This summary details the workshop's objectives, the themes of the accepted papers, and provides an overview of the methodologies proposed by the authors.