AITopics | drl agent

Collaborating Authors

drl agent

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

c4bf73386022473a652a18941e9ea6f8-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 23:51:49 GMT

machine learning, natural language, reinforcement learning, (22 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
(2 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Leisure & Entertainment > Sports (1.00)
Leisure & Entertainment > Games > Computer Games (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(3 more...)

Add feedback

a914ecef9c12ffdb9bede64bb703d877-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 17:56:05 GMT

drl agent, pruning, sparsity ratio, (15 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

a914ecef9c12ffdb9bede64bb703d877-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-9-2026, 17:55:55 GMT

iteration, reviewer, sparsity, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Add feedback

Online Learning-based Adaptive Beam Switching for 6G Networks: Enhancing Efficiency and Resilience

Natanzi, Seyed Bagher Hashemi, Zhu, Zhicong, Tang, Bo

arXiv.org Artificial IntelligenceDec-4-2025

Adaptive beam switching is essential for mission-critical military and commercial 6G networks but faces major challenges from high carrier frequencies, user mobility, and frequent blockages. While existing machine learning (ML) solutions often focus on maximizing instantaneous throughput, this can lead to unstable policies with high signaling overhead. This paper presents an online Deep Reinforcement Learning (DRL) framework designed to learn an operationally stable policy. By equipping the DRL agent with an enhanced state representation that includes blockage history, and a stability-centric reward function, we enable it to prioritize long-term link quality over transient gains. Validated in a challenging 100-user scenario using the Sionna library, our agent achieves throughput comparable to a reactive Multi-Armed Bandit (MAB) baseline. Specifically, our proposed framework improves link stability by approximately 43% compared to a vanilla DRL approach, achieving operational reliability competitive with MAB while maintaining high data rates. This work demonstrates that by reframing the optimization goal towards operational stability, DRL can deliver efficient, reliable, and real-time beam management solutions for next-generation mission-critical networks.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2505.08032

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Transformer-Guided Deep Reinforcement Learning for Optimal Takeoff Trajectory Design of an eVTOL Drone

Roberts, Nathan M. II, Du, Xiaosong

arXiv.org Artificial IntelligenceNov-20-2025

The rapid advancement of electric vertical take-off and landing (eVTOL) aircraft offers a promising opportunity to alleviate urban traffic congestion. Thus, developing optimal takeoff trajectories for minimum energy consumption becomes essential for broader eVTOL aircraft applications. Conventional optimal control methods (such as dynamic programming and linear quadratic regulator) provide highly efficient and well-established solutions but are limited by problem dimensionality and complexity. Deep reinforcement learning (DRL) emerges as a special type of artificial intelligence tackling complex, nonlinear systems; however, the training difficulty is a key bottleneck that limits DRL applications. To address these challenges, we propose the transformer-guided DRL to alleviate the training difficulty by exploring a realistic state space at each time step using a transformer. The proposed transformer-guided DRL was demonstrated on an optimal takeoff trajectory design of an eVTOL drone for minimal energy consumption while meeting takeoff conditions (i.e., minimum vertical displacement and minimum horizontal velocity) by varying control variables (i.e., power and wing angle to the vertical). Results presented that the transformer-guided DRL agent learned to take off with $4.57\times10^6$ time steps, representing 25% of the $19.79\times10^6$ time steps needed by a vanilla DRL agent. In addition, the transformer-guided DRL achieved 97.2% accuracy on the optimal energy consumption compared against the simulation-based optimal reference while the vanilla DRL achieved 96.3% accuracy. Therefore, the proposed transformer-guided DRL outperformed vanilla DRL in terms of both training efficiency as well as optimal design verification.

machine learning, reinforcement learning, trajectory, (16 more...)

arXiv.org Artificial Intelligence

2511.14887

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Transportation > Air (1.00)
Energy (1.00)
Aerospace & Defense (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Explainable RL Policies by Distilling to Locally-Specialized Linear Policies with Voronoi State Partitioning

Deproost, Senne, Steckelmacher, Dennis, Nowé, Ann

arXiv.org Artificial IntelligenceNov-18-2025

Deep Reinforcement Learning is one of the state-of-the-art methods for producing near-optimal system controllers. However, deep RL algorithms train a deep neural network, that lacks transparency, which poses challenges when the controller has to meet regulations, or foster trust. To alleviate this, one could transfer the learned behaviour into a model that is human-readable by design using knowledge distilla- tion. Often this is done with a single model which mimics the original model on average but could struggle in more dynamic situations. A key challenge is that this simpler model should have the right balance be- tween flexibility and complexity or right balance between balance bias and accuracy. We propose a new model-agnostic method to divide the state space into regions where a simplified, human-understandable model can operate in. In this paper, we use Voronoi partitioning to find regions where linear models can achieve similar performance to the original con- troller. We evaluate our approach on a gridworld environment and a classic control task. We observe that our proposed distillation to locally- specialized linear models produces policies that are explainable and show that the distillation matches or even slightly outperforms the black-box policy they are distilled from.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2511.13322

Country:

Europe (1.00)
North America > United States > California (0.46)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm

Pednekar, Amrapali, Garrido-Pérez, Álvaro, Khaluf, Yara, Simoens, Pieter

arXiv.org Artificial IntelligenceNov-4-2025

This study explores the interference in temporal processing within a dual-task paradigm from an artificial intelligence (AI) perspective. In this context, the dual-task setup is implemented as a simplified version of the Overcooked environment with two variations, single task (T) and dual task (T+N). Both variations involve an embedded time production task, but the dual task (T+N) additionally involves a concurrent number comparison task. Two deep reinforcement learning (DRL) agents were separately trained for each of these tasks. These agents exhibited emergent behavior consistent with human timing research. Specifically, the dual task (T+N) agent exhibited significant overproduction of time relative to its single task (T) counterpart. This result was consistent across four target durations. Preliminary analysis of neural dynamics in the agents' LSTM layers did not reveal any clear evidence of a dedicated or intrinsic timer. Hence, further investigation is needed to better understand the underlying time-keeping mechanisms of the agents and to provide insights into the observed behavioral patterns. This study is a small step towards exploring parallels between emergent DRL behavior and behavior observed in biological systems in order to facilitate a better understanding of both.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2511.01415

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Using Kolmogorov-Smirnov Distance for Measuring Distribution Shift in Machine Learning

Tonguz, Ozan K., Taschin, Federico

arXiv.org Artificial IntelligenceOct-21-2025

One of the major problems in Machine Learning (ML) and Artificial Intelligence (AI) is the fact that the probability distribution of the test data in the real world could deviate substantially from the probability distribution of the training data set. When this happens, the predictions of an ML system or an AI agent could involve large errors which is very troublesome and undesirable. While this is a well-known hard problem plaguing the AI and ML systems' accuracy and reliability, in certain applications such errors could be critical for safety and reliability of AI and ML systems. One approach to deal with this problem is to monitor and measure the deviation in the probability distribution of the test data in real time and to compensate for this deviation. In this paper, we propose and explore the use of Kolmogorov-Smirnov (KS) Test for measuring the distribution shift and we show how the KS distance can be used to quantify the distribution shift and its impact on an AI agent's performance. Our results suggest that KS distance could be used as a valuable statistical tool for monitoring and measuring the distribution shift. More specifically, it is shown that even a distance of KS=0.02 could lead to about 50\% increase in the travel time at a single intersection using a Reinforcement Learning agent which is quite significant. It is hoped that the use of KS Test and KS distance in AI-based smart transportation could be an important step forward for gauging the performance degradation of an AI agent in real time and this, in turn, could help the AI agent to cope with the distribution shift in a more informed manner.

ks distance, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2510.15996

Country: North America > United States > Utah (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

The Pursuit of Diversity: Multi-Objective Testing of Deep Reinforcement Learning Agents

Bartlett, Antony, Liem, Cynthia, Panichella, Annibale

arXiv.org Artificial IntelligenceOct-17-2025

Testing deep reinforcement learning (DRL) agents in safety-critical domains requires discovering diverse failure scenarios. Existing tools such as INDAGO rely on single-objective optimization focused solely on maximizing failure counts, but this does not ensure discovered scenarios are diverse or reveal distinct error types. We introduce INDAGO-Nexus, a multi-objective search approach that jointly optimizes for failure likelihood and test scenario diversity using multi-objective evolutionary algorithms with multiple diversity metrics and Pareto front selection strategies. We evaluated INDAGO-Nexus on three DRL agents: humanoid walker, self-driving car, and parking agent. On average, INDAGO-Nexus discovers up to 83% and 40% more unique failures (test effectiveness) than INDAGO in the SDC and Parking scenarios, respectively, while reducing time-to-failure by up to 67% across all agents.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2510.14727

Country: North America > United States (0.14)

Genre:

Research Report > Experimental Study (0.70)
Research Report > New Finding (0.69)

Industry: Transportation > Ground > Road (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Filters

Collaborating Authors

drl agent

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

c4bf73386022473a652a18941e9ea6f8-Paper-Conference.pdf

acab0116c354964a558e65bdd07ff047-Paper.pdf

a914ecef9c12ffdb9bede64bb703d877-Paper.pdf

a914ecef9c12ffdb9bede64bb703d877-AuthorFeedback.pdf

Online Learning-based Adaptive Beam Switching for 6G Networks: Enhancing Efficiency and Resilience

Transformer-Guided Deep Reinforcement Learning for Optimal Takeoff Trajectory Design of an eVTOL Drone

Explainable RL Policies by Distilling to Locally-Specialized Linear Policies with Voronoi State Partitioning

Modulation of temporal decision-making in a deep reinforcement learning agent under the dual-task paradigm

Using Kolmogorov-Smirnov Distance for Measuring Distribution Shift in Machine Learning

The Pursuit of Diversity: Multi-Objective Testing of Deep Reinforcement Learning Agents