AITopics | independent learning

independent learning

A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

Neural Information Processing Systems, Dec-27-2025, 04:13:37 GMT

In this work, we study two-player zero-sum stochastic games and develop a variant of the smoothed best-response learning dynamics that combines independent learning dynamics for matrix games with the minimax value iteration for stochastic games. The resulting learning dynamics are payoff-based, convergent, rational, and symmetric between the two players.
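The matrix-game ingredient mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it omits the minimax value iteration for stochastic games, and the function names, step sizes (`alpha`, `beta`), and temperature `tau` are assumptions chosen for illustration. Each player sees only its own realized payoff, keeps a running estimate of its action values, and moves its policy slowly toward a softmax (smoothed best response) of those estimates.

```python
import math
import random


def softmax(q, tau):
    """Smoothed best response: softmax of q-values with temperature tau."""
    m = max(q)  # subtract the max for numerical stability
    w = [math.exp((x - m) / tau) for x in q]
    s = sum(w)
    return [x / s for x in w]


def run_matrix_game(payoff, steps=5000, tau=0.5, alpha=0.05, beta=0.01, seed=0):
    """Payoff-based smoothed best-response dynamics in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives its
    negation. All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_row, n_col = len(payoff), len(payoff[0])
    pi = [[1.0 / n_row] * n_row, [1.0 / n_col] * n_col]  # current policies
    q = [[0.0] * n_row, [0.0] * n_col]                   # per-player value estimates
    for _ in range(steps):
        a = rng.choices(range(n_row), weights=pi[0])[0]
        b = rng.choices(range(n_col), weights=pi[1])[0]
        r = payoff[a][b]
        for player, action, reward in ((0, a, r), (1, b, -r)):
            # Each player updates only the estimate of the action it just played,
            # using only its own realized payoff.
            q[player][action] += alpha * (reward - q[player][action])
            # The policy tracks the smoothed best response on a slower timescale.
            target = softmax(q[player], tau)
            pi[player] = [(1 - beta) * p + beta * t
                          for p, t in zip(pi[player], target)]
    return pi


if __name__ == "__main__":
    matching_pennies = [[1, -1], [-1, 1]]
    print(run_matrix_game(matching_pennies))
```

Both players run the same update rule, which mirrors the symmetry the abstract describes; neither player ever observes the opponent's action or policy.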

finite-sample analysis, independent learning, payoff-based independent learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.42)

MARL Warehouse Robots

Allman, Price, Thang, Lian, Simmons, Dre, Riaz, Salmon

arXiv.org Artificial Intelligence, Dec-10-2025

Our research investigates the complex task of multiple autonomous agents learning to coordinate and deliver packages in warehouse environments, a problem requiring implicit communication, collision avoidance, and efficient task allocation without centralized control. Traditional warehouse automation relies on centralized planning systems that face scalability limitations; multi-agent reinforcement learning (MARL) offers an alternative through decentralized learned policies, but requires solving the credit assignment problem. We compare MARL algorithms on warehouse coordination: QMIX [Rashid et al., 2018] (value decomposition), IPPO (independent learning), and MASAC (centralized critic). Our study progresses from MPE for validation to RWARE for warehouse evaluation, culminating in Unity 3D deployment where agents demonstrate learned package delivery behavior. QMIX emerged as the best performer after systematic comparison. Our contributions: (1) hyperparameter analysis showing default configurations fail on sparse-reward warehouse tasks, (2) comparative evaluation across algorithms and scales, (3) Unity ML-Agents integration demonstrating sim-to-sim transfer with successful package delivery, and (4) identification of scaling challenges. Full experimental details and results are documented in our Quarto documentation book.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2512.04463

Country: North America > United States (0.15)

Genre: Research Report (0.65)

Industry: Transportation > Freight & Logistics Services (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)

Personalized Collaborative Learning with Affinity-Based Variance Reduction

Zhang, Chenyu, Azizan, Navid

arXiv.org Machine Learning, Oct-21-2025

Multi-agent learning faces a fundamental tension: leveraging distributed collaboration without sacrificing the personalization needed for diverse agents. This tension intensifies when aiming for full personalization while adapting to unknown heterogeneity levels -- gaining collaborative speedup when agents are similar, without performance degradation when they are different. Embracing the challenge, we propose personalized collaborative learning (PCL), a novel framework for heterogeneous agents to collaboratively learn personalized solutions with seamless adaptivity. Through carefully designed bias correction and importance correction mechanisms, our method AffPCL robustly handles both environment and objective heterogeneity. We prove that AffPCL reduces sample complexity over independent learning by a factor of $\max\{n^{-1}, \delta\}$, where $n$ is the number of agents and $\delta \in [0,1]$ measures their heterogeneity. This affinity-based acceleration automatically interpolates between the linear speedup of federated learning in homogeneous settings and the baseline of independent learning, without requiring prior knowledge of the system. Our analysis further reveals that an agent may obtain linear speedup even by collaborating with arbitrarily dissimilar agents, unveiling new insights into personalization and collaboration in the high heterogeneity regime.
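To make the stated factor concrete, the short sketch below simply evaluates $\max\{n^{-1}, \delta\}$ for a few settings. The function name `collaboration_factor` is hypothetical, and the code only restates the arithmetic of the bound quoted in the abstract; it says nothing about constants or how AffPCL achieves it.

```python
def collaboration_factor(n_agents, delta):
    """Evaluate max(1/n, delta), the sample-complexity factor quoted in the abstract.

    A value of 1/n corresponds to linear speedup; a value of 1.0 corresponds
    to the independent-learning baseline.
    """
    if n_agents < 1:
        raise ValueError("n_agents must be at least 1")
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return max(1.0 / n_agents, delta)


if __name__ == "__main__":
    for n, d in [(10, 0.0), (10, 0.05), (10, 0.5), (10, 1.0)]:
        print(f"n={n}, delta={d}: factor={collaboration_factor(n, d)}")
```

With ten agents, the factor stays at 0.1 until the heterogeneity measure exceeds 0.1, after which heterogeneity dominates and the factor rises toward 1.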

agent, artificial intelligence, machine learning, (13 more...)

arXiv.org Machine Learning

2510.16232

Country:

North America > United States (0.28)
Europe (0.27)

Genre: Research Report (0.40)

Industry:

Education (0.68)
Transportation (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Independent Learning in Performative Markov Potential Games

Sahitaj, Rilind, Sasnauskas, Paulius, Yalın, Yiğit, Mandal, Debmalya, Radanović, Goran

arXiv.org Artificial Intelligence, Apr-30-2025

Performative Reinforcement Learning (PRL) refers to a scenario in which the deployed policy changes the reward and transition dynamics of the underlying environment. In this work, we study multi-agent PRL by incorporating performative effects into Markov Potential Games (MPGs). We introduce the notion of a performatively stable equilibrium (PSE) and show that it always exists under a reasonable sensitivity assumption. We then provide convergence results for state-of-the-art algorithms used to solve MPGs. Specifically, we show that independent policy gradient ascent (IPGA) and independent natural policy gradient (INPG) converge to an approximate PSE in the best-iterate sense, with an additional term that accounts for the performative effects. Furthermore, we show that INPG asymptotically converges to a PSE in the last-iterate sense. As the performative effects vanish, we recover the convergence rates from prior work. For a special case of our game, we provide finite-time last-iterate convergence results for a repeated retraining approach, in which agents independently optimize a surrogate objective. We conduct extensive experiments to validate our theoretical findings.

artificial intelligence, machine learning, performative effect, (14 more...)

arXiv.org Artificial Intelligence

2504.20593

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Unlocking Learning Potentials: The Transformative Effect of Generative AI in Education Across Grade Levels

Xie, Meijuan, Luo, Liling

arXiv.org Artificial Intelligence, Mar-15-2025

The advent of generative artificial intelligence (GAI) has brought about a notable surge in the field of education. The use of GAI to support learning is becoming increasingly prevalent among students. However, the manner and extent of its utilisation vary considerably from one individual to another, and research on students' utilisation and perceptions of GAI remains relatively scarce. To gain insight into the issue, this paper proposes a hybrid-survey method to examine the impact of GAI on students across four different grades in six key areas (LIPSAL): learning interest, independent learning, problem solving, self-confidence, appropriate use, and learning enjoyment. Firstly, through a questionnaire, we found that among LIPSAL, GAI has the greatest impact on the concept of appropriate use and the least impact on learning interest and self-confidence. Secondly, a comparison of four grades revealed that the high and low factors of LIPSAL exhibited grade-related variation, and college students exhibited a higher level than high school students across LIPSAL. Thirdly, through interviews, the students demonstrated a comprehensive understanding of the application of GAI. We found that students have a positive attitude towards GAI and are very willing to use it, which is why GAI has grown so rapidly in popularity. They also described the prospects and challenges of using GAI. In the future, as GAI matures technologically, it will have a greater impact on students. These findings may help better understand usage by different students and inform future research in digital education.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.13535

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Setting > K-12 Education > Secondary School (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.85)

Review for NeurIPS paper: Independent Policy Gradient Methods for Competitive Reinforcement Learning

Neural Information Processing Systems, Jan-23-2025, 14:36:19 GMT

Weaknesses: I am not convinced by the main motivation of this paper for decoupled or independent learning. Specifically, from the communication perspective, once agents can also communicate the actions each of them took per round, each agent can simulate any coupled algorithm locally (or only coupled online algorithms, if it has a storage limitation). Since agents have to communicate with the oracle or environment in each round anyway, I don't see why, in practice, communicating the actions during the learning process is that problematic. Second, this paper says that independent learning is important because it allows the algorithm "being versatile, being applicable even in uncertain environments where the type of interaction and number of other agents are not known to the agent." I feel this description does not fit the algorithm studied in this paper and is thus a bit misleading.

agent, competitive reinforcement learning, independent policy gradient method, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Approximate Global Convergence of Independent Learning in Multi-Agent Systems

Jin, Ruiyang, Chen, Zaiwei, Lin, Yiheng, Song, Jie, Wierman, Adam

arXiv.org Artificial Intelligence, May-30-2024

Independent learning (IL), despite being a popular approach in practice to achieve scalability in large-scale multi-agent systems, usually lacks global convergence guarantees. In this paper, we study two representative algorithms, independent $Q$-learning and independent natural actor-critic, within value-based and policy-based frameworks, and provide the first finite-sample analysis for approximate global convergence. The results imply a sample complexity of $\tilde{\mathcal{O}}(\epsilon^{-2})$ up to an error term that captures the dependence among agents and characterizes the fundamental limit of IL in achieving global convergence. To establish the result, we develop a novel approach for analyzing IL by constructing a separable Markov decision process (MDP) for convergence analysis and then bounding the gap due to model difference between the separable MDP and the original one. Moreover, we conduct numerical experiments using a synthetic MDP and an electric vehicle charging example to verify our theoretical findings and to demonstrate the practical applicability of IL.
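For readers unfamiliar with independent $Q$-learning, a minimal tabular sketch follows. It is a generic illustration, not the paper's analyzed variant: the `step` environment hook, the epsilon-greedy exploration, and all hyperparameters are assumptions. The point it shows is that each agent keeps a $Q$-table over its own actions only and treats the other agents as part of the environment.

```python
import random


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on a single agent's own table."""
    td_target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])


def epsilon_greedy(q_row, eps, rng):
    """Pick a random action with probability eps, else a greedy one."""
    if rng.random() < eps:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda i: q_row[i])


def train_independent(step, n_states, n_actions, n_agents=2,
                      iterations=2000, eps=0.1, seed=0):
    """Run independent Q-learning.

    step(state, joint_action) -> (list of per-agent rewards, next_state)
    is a hypothetical environment hook supplied by the caller.
    """
    rng = random.Random(seed)
    # One private table per agent, indexed by [state][own_action].
    tables = [[[0.0] * n_actions for _ in range(n_states)]
              for _ in range(n_agents)]
    s = 0
    for _ in range(iterations):
        joint = [epsilon_greedy(t[s], eps, rng) for t in tables]
        rewards, s_next = step(s, joint)
        # Each agent updates from its own action and reward only.
        for t, a, r in zip(tables, joint, rewards):
            q_update(t, s, a, r, s_next)
        s = s_next
    return tables


if __name__ == "__main__":
    # Toy single-state coordination game: both agents are rewarded for matching.
    def coordination(s, joint):
        r = 1.0 if joint[0] == joint[1] else 0.0
        return [r, r], 0

    print(train_independent(coordination, n_states=1, n_actions=2))
```

Because no agent conditions on the others' actions, the environment each one faces shifts as the others learn, which is the dependence among agents that the abstract's error term is said to capture.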

approximate global convergence, independent learning, multi-agent system, (1 more...)

arXiv.org Artificial Intelligence

2405.19811

Country: Europe > United Kingdom (0.04)

Genre: Research Report (0.69)

Industry:

Transportation > Ground > Road (0.53)
Transportation > Electric Vehicle (0.53)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Independent Learning in Constrained Markov Potential Games

Jordan, Philip, Barakat, Anas, He, Niao

arXiv.org Artificial Intelligence, Feb-27-2024

Constrained Markov games offer a formal mathematical framework for modeling multi-agent reinforcement learning problems where the behavior of the agents is subject to constraints. In this work, we focus on the recently introduced class of constrained Markov Potential Games. While centralized algorithms have been proposed for solving such constrained games, the design of converging independent learning algorithms tailored for the constrained setting remains an open question. We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria: Each agent observes their own actions and rewards, along with a shared state. Inspired by the optimization literature, our algorithm performs proximal-point-like updates augmented with a regularized constraint set. Each proximal step is solved inexactly using a stochastic switching gradient algorithm. Notably, our algorithm can be implemented independently without a centralized coordination mechanism requiring turn-based agent updates. Under some technical constraint qualification conditions, we establish convergence guarantees towards constrained approximate Nash equilibria. We perform simulations to illustrate our results.

algorithm, constraint, inequality, (14 more...)

arXiv.org Artificial Intelligence

2402.17885

Country:

Asia > Middle East > Jordan (0.05)
Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

Dealing With Non-stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning

Nekoei, Hadi, Badrinaaraayanan, Akilesh, Sinha, Amit, Amini, Mohammad, Rajendran, Janarthanan, Mahajan, Aditya, Chandar, Sarath

arXiv.org Artificial Intelligence, Aug-17-2023

Decentralized cooperative multi-agent deep reinforcement learning (MARL) can be a versatile learning framework, particularly in scenarios where centralized training is either not possible or not practical. One of the critical challenges in decentralized deep MARL is the non-stationarity of the learning environment when multiple agents are learning concurrently. A commonly used and efficient scheme for decentralized MARL is independent learning in which agents concurrently update their policies independently of each other. We first show that independent learning does not always converge, while sequential learning where agents update their policies one after another in a sequence is guaranteed to converge to an agent-by-agent optimal solution. In sequential learning, when one agent updates its policy, all other agents' policies are kept fixed, alleviating the challenge of non-stationarity due to simultaneous updates in other agents' policies. However, it can be slow because only one agent is learning at any time. Therefore it might also not always be practical. In this work, we propose a decentralized cooperative MARL algorithm based on multi-timescale learning. In multi-timescale learning, all agents learn simultaneously, but at different learning rates. In our proposed method, when one agent updates its policy, other agents are allowed to update their policies as well, but at a slower rate. This speeds up sequential learning, while also minimizing non-stationarity caused by other agents updating concurrently. Multi-timescale learning outperforms state-of-the-art decentralized learning methods on a set of challenging multi-agent cooperative tasks in the epymarl (Papoudakis et al., 2020) benchmark. This can be seen as a first step towards more general decentralized cooperative deep MARL methods based on multi-timescale learning.
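One way to picture the multi-timescale idea is as a learning-rate schedule in which a single agent learns at a fast rate while the rest continue at a slower one. The sketch below is an assumption-laden illustration: the abstract does not specify how the fast role is assigned, so the round-robin rotation, the `period` parameter, and the rate values are all hypothetical.

```python
def multi_timescale_rates(step, n_agents, fast_lr=1e-3, slow_lr=1e-4, period=100):
    """Return one learning rate per agent for the given training step.

    Assumption (not from the abstract): the fast role rotates round-robin,
    switching to the next agent every `period` steps.
    """
    fast_agent = (step // period) % n_agents
    return [fast_lr if i == fast_agent else slow_lr for i in range(n_agents)]


if __name__ == "__main__":
    for step in (0, 100, 200, 300):
        print(step, multi_timescale_rates(step, n_agents=3))
```

Setting `slow_lr` to zero would recover sequential learning, where only one agent updates at a time, and setting `slow_lr` equal to `fast_lr` would recover plain independent learning; the method described in the abstract sits between those two extremes.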

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2302.02792

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.82)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.84)

Stochastic Market Games

Schmid, Kyrill, Belzner, Lenz, Müller, Robert, Tochtermann, Johannes, Linnhoff-Popien, Claudia

arXiv.org Artificial Intelligence, Jul-19-2022

Some of the most relevant future applications of multi-agent systems, like autonomous driving or factories as a service, display mixed-motive scenarios, where agents might have conflicting goals. In these settings agents are likely to learn undesirable outcomes in terms of cooperation under independent learning, such as overly greedy behavior. Motivated by real-world societies, in this work we propose to utilize market forces to provide incentives for agents to become cooperative. As demonstrated in an iterated version of the Prisoner's Dilemma, the proposed market formulation can change the dynamics of the game to consistently learn cooperative policies. Further, we evaluate our approach in spatially and temporally extended settings for varying numbers of agents. We empirically find that the presence of markets can improve both the overall result and individual agents' returns via their trading activities.
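The Prisoner's Dilemma mentioned above shows why self-interested learners drift toward greedy behavior. The sketch below uses conventional textbook payoff values (5, 3, 1, 0), which are an assumption and not taken from the paper, and it does not model the proposed market mechanism; it only checks that defection is the best response to either opponent action even though mutual cooperation yields a higher joint payoff.

```python
# (row action, column action) -> (row payoff, column payoff)
# "C" = cooperate, "D" = defect. Payoff values are conventional, assumed numbers.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def best_response(opponent_action):
    """Row player's payoff-maximizing action against a fixed opponent action."""
    return max("CD", key=lambda a: PAYOFF[(a, opponent_action)][0])


def joint_payoff(row_action, col_action):
    """Sum of both players' payoffs for a joint action."""
    return sum(PAYOFF[(row_action, col_action)])


if __name__ == "__main__":
    print("Best response to C:", best_response("C"))
    print("Best response to D:", best_response("D"))
    print("Joint payoff (C, C):", joint_payoff("C", "C"))
    print("Joint payoff (D, D):", joint_payoff("D", "D"))
```

Any mechanism that aims to make cooperation learnable, such as the market formulation the abstract proposes, has to change these incentives so that cooperating is no longer dominated.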

agent, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.24963/ijcai.2021/54

2207.07388

Country:

North America > United States > New York (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)

Genre: Research Report (0.40)

Industry:

Banking & Finance > Trading (0.67)
Law (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
