AITopics | Agents

01d78b294d80491fecddea897cf03642-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 07:38:19 GMT

agent, artificial intelligence, machine learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Offline Multi-Agent Reinforcement Learning with Knowledge Distillation

Neural Information Processing Systems, Apr-24-2026, 07:38:15 GMT

We introduce an offline multi-agent reinforcement learning (offline MARL) framework that uses previously collected data without additional online data collection. Our method reformulates offline MARL as a sequence modeling problem and thus builds on the simplicity and scalability of the Transformer architecture. Following the centralized training and decentralized execution paradigm, we first train a teacher policy that has privileged access to every agent's observations, actions, and rewards. After the teacher policy has identified and recombined the "good" behavior in the dataset, we create separate student policies and distill into them not only the teacher policy's features but also its structural relations among different agents' features. We show that our framework significantly improves performance on a range of tasks and outperforms state-of-the-art offline MARL baselines. Furthermore, we demonstrate that the proposed method has a better convergence rate, is more sample efficient, and is more robust to varying demonstration quality than the baselines.
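
As a rough illustration of the two distillation targets named in this abstract, the sketch below combines a per-agent feature-matching term with a term that matches pairwise relations among agents' features. The abstract does not state the loss, so the mean-squared-error form, the cosine-similarity relation matrix, and the names `distillation_loss` and `relation_weight` are assumptions made for illustration, not the paper's implementation.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def cosine(u, v):
    """Cosine similarity, with a small constant to avoid division by zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v + 1e-12)

def relation_matrix(features):
    """Pairwise similarity between every pair of agents' feature vectors."""
    return [[cosine(u, v) for v in features] for u in features]

def distillation_loss(teacher_feats, student_feats, relation_weight=1.0):
    """Feature term plus weighted relation term (assumed form, see lead-in).

    teacher_feats / student_feats: one feature vector per agent.
    """
    n_agents = len(teacher_feats)
    # Feature distillation: each student matches the teacher's feature for its agent.
    feature_term = sum(
        mse(t, s) for t, s in zip(teacher_feats, student_feats)
    ) / n_agents
    # Relational distillation: match the structure among agents' features.
    teacher_rel = relation_matrix(teacher_feats)
    student_rel = relation_matrix(student_feats)
    relation_term = sum(
        mse(rt, rs) for rt, rs in zip(teacher_rel, student_rel)
    ) / n_agents
    return feature_term + relation_weight * relation_term
```

With identical teacher and student features both terms vanish; the relation term penalizes students whose features individually look plausible but no longer differ from one another the way the teacher's do.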

distillation, machine learning, reinforcement learning, (16 more...)

Country: North America (0.28)

Genre: Research Report (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

On Sample Optimality in Personalized Collaborative and Federated Learning

Neural Information Processing Systems, Apr-24-2026, 07:38:01 GMT

In personalized federated learning, each member of a potentially large set of agents aims to train a model minimizing its loss function averaged over its local data distribution. We study this problem through the lens of stochastic optimization, focusing on a scenario with a large number of agents, each of which possesses very few data samples from its local data distribution. Specifically, we prove novel matching lower and upper bounds on the number of samples required from all agents to approximately minimize the generalization error of a fixed agent. We provide strategies matching these lower bounds, based on a gradient filtering approach: given prior knowledge of some notion of distance between local data distributions, agents filter and aggregate stochastic gradients received from other agents in order to achieve an optimal bias-variance trade-off. Finally, we quantify the impact of using rough estimates of the distances between agents' local distributions, based on a very small number of local samples.
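
The gradient filtering idea in this abstract can be sketched as follows: an agent averages only the stochastic gradients sent by agents whose local distribution is within some distance of its own. The hard threshold `radius`, the plain average, and the function name `filtered_gradient` are assumptions for illustration; the abstract does not specify the filtering rule or how the distances are defined.

```python
def filtered_gradient(target, gradients, distances, radius):
    """Average the gradients of agents within `radius` of agent `target`.

    gradients: one gradient vector (list of floats) per agent.
    distances: distances[i][j] is the assumed-known distance between the
               local distributions of agents i and j, with distances[i][i] == 0.
    radius:    non-negative threshold; a larger radius averages more gradients
               (lower variance) but admits more dissimilar agents (higher bias).
    """
    selected = [
        g for j, g in enumerate(gradients) if distances[target][j] <= radius
    ]
    dim = len(selected[0])  # the target's own gradient is always selected
    return [sum(g[k] for g in selected) / len(selected) for k in range(dim)]

# Three agents: agent 1 is close to agent 0, agent 2 is far away.
gradients = [[1.0, 0.0], [3.0, 2.0], [100.0, 100.0]]
distances = [
    [0.0, 0.1, 5.0],
    [0.1, 0.0, 5.0],
    [5.0, 5.0, 0.0],
]
aggregated = filtered_gradient(0, gradients, distances, radius=0.5)
```

Setting `radius=0` reduces to purely local training, while a very large radius reduces to averaging over all agents, which is where the bias-variance trade-off mentioned in the abstract shows up.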

agent, artificial intelligence, machine learning, (15 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)

01cea7793f3c68af2e4989fc66bf8fb0-Paper-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 07:37:58 GMT

artificial intelligence, machine learning, optimization problem, (15 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

An AI agent takes over a store and orders too many candles

The Japan Times, Apr-24-2026, 05:39:00 GMT

Andon Market in San Francisco represents a vision, however flawed, of a future when more sophisticated AI agents take over work traditionally done by humans. In San Francisco's upscale Cow Hollow district, the introduction of a boutique selling coffee table games, tote bags and other household items would be pretty unremarkable. However, Andon Market has one key differentiator: It's run by AI. At this store, an artificial intelligence agent named Luna effectively acts as the chief executive officer of the operation. It decides what products to offer and how much to charge for them.

artificial intelligence, iran war earthquake sanae takaichi, social media, (8 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.46)
Asia > Middle East > Iran (0.42)
Asia > Philippines (0.16)
(4 more...)

Industry:

Government > Military (0.36)
Media > News (0.31)
Leisure & Entertainment (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Communications > Social Media (0.78)

Appendix Gigastep - One Billion Steps per Second Multi-agent Reinforcement Learning

Neural Information Processing Systems, Apr-24-2026, 04:43:02 GMT

In this section, we train policies for different scenarios to validate that the tasks defined in Gigastep can be solved with multi-agent RL algorithms. In particular, we use multi-agent PPO implemented in JAX. In competitive or adversarial MARL, an objective reward measure is not defined, as the collected reward inherently depends on the relative strength of the opposing agent's policy. Therefore, to measure the training progress, we compare the current policy with previous checkpoints of the same policy at earlier training iterations. Specifically, an improving policy should be able to outperform its previous counterparts.
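
The checkpoint-based progress measure described in this excerpt can be sketched as a win-rate table of the current policy against its own earlier checkpoints. Everything named here is hypothetical: `play_match` stands in for running one episode between two policies, and `toy_match` is a toy simulator in which a policy is just a scalar skill value. None of it is the Gigastep API or the authors' evaluation code.

```python
import random

def win_rate(play_match, current, opponent, episodes=200, seed=0):
    """Fraction of episodes in which `current` beats `opponent`."""
    rng = random.Random(seed)
    wins = sum(play_match(current, opponent, rng) for _ in range(episodes))
    return wins / episodes

def evaluate_against_checkpoints(play_match, current, checkpoints, episodes=200):
    """Win rate of the current policy against each earlier checkpoint."""
    return [win_rate(play_match, current, ckpt, episodes) for ckpt in checkpoints]

def toy_match(policy_a, policy_b, rng):
    """Toy stand-in: policies are positive skill values, and policy_a wins
    with probability policy_a / (policy_a + policy_b)."""
    return rng.random() < policy_a / (policy_a + policy_b)

# Current policy (skill 3.0) against checkpoints of increasing strength.
rates = evaluate_against_checkpoints(toy_match, 3.0, [1.0, 2.0, 3.0])
```

An improving policy should show win rates above one half against older checkpoints, with the margin shrinking as the checkpoint gets closer to the current iteration.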

checkpoint, machine learning, reinforcement learning, (16 more...)

Country: North America > United States (0.32)

Technology: