AITopics | independent rl

Collaborating Authors

independent rl

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Large Language Model Integration with Reinforcement Learning to Augment Decision-Making in Autonomous Cyber Operations

Tholl, Konur, Rivest, François, Mezouar, Mariam El, Mallah, Ranwa Al

arXiv.org Artificial IntelligenceSep-9-2025

Reinforcement Learning (RL) has shown great potential for autonomous decision-making in the cybersecurity domain, enabling agents to learn through direct environment interaction. However, RL agents in Autonomous Cyber Operations (ACO) typically learn from scratch, requiring them to execute undesirable actions to learn their consequences. In this study, we integrate external knowledge in the form of a Large Language Model (LLM) pretrained on cybersecurity data that our RL agent can directly leverage to make informed decisions. By guiding initial training with an LLM, we improve baseline performance and reduce the need for exploratory actions with obviously negative outcomes. We evaluate our LLM-integrated approach in a simulated cybersecurity environment, and demonstrate that our guided agent achieves over 2x higher rewards during early training and converges to a favorable policy approximately 4,500 episodes faster than the baseline.

large language model, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2509.05311

Country: North America > Canada > Ontario (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.90)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Comparative Evaluation of Teacher-Guided Reinforcement Learning Techniques for Autonomous Cyber Operations

Tholl, Konur, Mezouar, Mariam El, Mallah, Ranwa Al

arXiv.org Artificial IntelligenceAug-21-2025

Autonomous Cyber Operations (ACO) rely on Reinforcement Learning (RL) to train agents to make effective decisions in the cybersecurity domain. However, existing ACO applications require agents to learn from scratch, leading to slow convergence and poor early-stage performance. While teacher-guided techniques have demonstrated promise in other domains, they have not yet been applied to ACO. In this study, we implement four distinct teacher-guided techniques in the simulated CybORG environment and conduct a comparative evaluation. Our results demonstrate that teacher integration can significantly improve training efficiency in terms of early policy performance and convergence speed, highlighting its potential benefits for autonomous cybersecurity.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2508.1434

Country: North America > Canada > Ontario > Kingston (0.29)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.56)
Government > Military > Cyberwarfare (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Reviews: A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

Neural Information Processing SystemsOct-7-2024, 17:32:13 GMT

Summary: "A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning" presents a novel scalable algorithm that is shown to converge to better behaviours in partially-observable Multi-Agent Reinforcement Learning scenarios compared to previous methods. The paper begins with describing the problem, mainly that training reinforcement learning agents independently (i.e. each agent ignores the behaviours of the other agents and treats them as part of the environment) results in policies which can significantly overfit to only the agent behaviours observed during training time, failing to generalize when later set against new opponent behaviours. The paper then describes its solution, a generalization of the Double Oracle algorithm. The algorithm works using the following process: first, given a set of initial policies for each player, an empirical payoff tensor is created and from that a meta-strategy is learnt for each player which is the mixture over that initial policy set which achieves the highest value. Then each player i in the game is iterated, and a new policy is trained against policies sampled from the meta-strategies of the other agents not equal to i.

algorithm, multiagent reinforcement learning, unified game-theoretic approach, (11 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Independent RL for Cooperative-Competitive Agents: A Mean-Field Perspective

Zaman, Muhammad Aneeq uz, Koppel, Alec, Laurière, Mathieu, Başar, Tamer

arXiv.org Artificial IntelligenceMar-17-2024

We address in this paper Reinforcement Learning (RL) among agents that are grouped into teams such that there is cooperation within each team but general-sum (non-zero sum) competition across different teams. To develop an RL method that provably achieves a Nash equilibrium, we focus on a linear-quadratic structure. Moreover, to tackle the non-stationarity induced by multi-agent interactions in the finite population setting, we consider the case where the number of agents within each team is infinite, i.e., the mean-field setting. This results in a General-Sum LQ Mean-Field Type Game (GS-MFTGs). We characterize the Nash equilibrium (NE) of the GS-MFTG, under a standard invertibility condition. This MFTG NE is then shown to be $\mathcal{O}(1/M)$-NE for the finite population game where $M$ is a lower bound on the number of agents in each team. These structural results motivate an algorithm called Multi-player Receding-horizon Natural Policy Gradient (MRPG), where each team minimizes its cumulative cost independently in a receding-horizon manner. Despite the non-convexity of the problem, we establish that the resulting algorithm converges to a global NE through a novel problem decomposition into sub-problems using backward recursive discrete-time Hamilton-Jacobi-Isaacs (HJI) equations, in which independent natural policy gradient is shown to exhibit linear convergence under time-independent diagonal dominance. Experiments illuminate the merits of this approach in practice.

cooperative-competitive game, independent rl, necessary condition, (15 more...)

arXiv.org Artificial Intelligence

2403.11345

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Add feedback