AITopics | Agents

075b2875e2b671ddd74aeec0ac9f0357-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 09:50:00 GMT

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Queue Up Your Regrets: Achieving the Dynamic Capacity Region of Multiplayer Bandits

Neural Information Processing Systems | Apr-24-2026, 09:17:13 GMT

Consider N cooperative agents such that for T turns, each agent n takes an action a_n and receives a stochastic reward r_n(a_1, ..., a_N). Agents cannot observe the actions of other agents and do not know even their own reward function. The agents can communicate with their neighbors on a connected graph G with diameter d(G). We want each agent n to achieve an expected average reward of at least λ_n over time, for a given quality of service (QoS) vector λ. A QoS vector λ is not necessarily achievable.

agent, algorithm, artificial intelligence, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Collaborative Learning via Prediction Consensus

Neural Information Processing Systems | Apr-24-2026, 09:17:01 GMT

We consider a collaborative learning setting where the goal of each agent is to improve their own model by leveraging the expertise of collaborators, in addition to their own training data. To facilitate the exchange of expertise among agents, we propose a distillation-based method leveraging shared unlabeled auxiliary data, which is pseudo-labeled by the collective. Central to our method is a trust weighting scheme that serves to adaptively weigh the influence of each collaborator on the pseudo-labels until a consensus on how to label the auxiliary data is reached. We demonstrate empirically that our collaboration scheme is able to significantly boost individual models' performance in the target domain from which the auxiliary data is sampled. At the same time, it can provably mitigate the negative impact of bad models on the collective. By design, our method adeptly accommodates heterogeneity in model architectures and substantially reduces communication overhead compared to typical collaborative learning methods.
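The trust weighting idea in this abstract can be illustrated with a small sketch. This is not the paper's algorithm: the update rule (trust proportional to the inverse squared distance from the current consensus), the function name `consensus_pseudo_labels`, and the parameters `rounds` and `eps` are all assumptions chosen for illustration.

```python
def consensus_pseudo_labels(predictions, rounds=20, eps=1e-6):
    """Hypothetical trust-weighted consensus on shared unlabeled data.

    predictions[k][i] is agent k's class-probability list for auxiliary sample i.
    Returns (consensus pseudo-labels, trust weights).
    """
    n_agents = len(predictions)
    n_samples = len(predictions[0])
    n_classes = len(predictions[0][0])

    def weighted_average(weights):
        # Consensus = trust-weighted mean of all agents' predictions.
        return [
            [sum(weights[k] * predictions[k][i][c] for k in range(n_agents))
             for c in range(n_classes)]
            for i in range(n_samples)
        ]

    weights = [1.0 / n_agents] * n_agents  # start with uniform trust
    for _ in range(rounds):
        consensus = weighted_average(weights)
        scores = []
        for k in range(n_agents):
            # Assumed rule: agents far from the consensus receive less trust.
            dist = sum(
                (predictions[k][i][c] - consensus[i][c]) ** 2
                for i in range(n_samples)
                for c in range(n_classes)
            ) / n_samples
            scores.append(1.0 / (dist + eps))
        total = sum(scores)
        weights = [s / total for s in scores]
    return weighted_average(weights), weights


# Two agents that roughly agree and one that contradicts them.
example_predictions = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.1, 0.9], [0.9, 0.1]],
]
consensus, weights = consensus_pseudo_labels(example_predictions)
```

In this toy setup the third agent's trust weight shrinks over the rounds, so the pseudo-labels follow the two agreeing agents. Only predictions on the auxiliary data are exchanged, which is consistent with the abstract's claim that model architectures may differ.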

agent, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.90)

Interview with Deepika Vemuri: interpretability and concept-based learning

AIHub | Apr-24-2026, 08:59:33 GMT

The latest interview in our series with the AAAI/SIGAI Doctoral Consortium participants features Deepika Vemuri, who is working on interpretability and concept-based learning. We found out more about the two aspects of concept-based models that she's been researching. Could you tell us a bit about your PhD: where are you studying, and what is the topic of your research? I'm a PhD student from IIT Hyderabad working with Dr Vineeth N Balasubramanian, supported by the PMRF Fellowship. Most current state-of-the-art models are black boxes, which is especially problematic when these models are used in high-stakes applications like criminal justice and healthcare, where people's lives depend on the decisions of these models.

artificial intelligence, learning, machine learning, (13 more...)

AIHub

Country: North America > Canada (0.04)

Genre: Personal > Interview (1.00)

Industry: Leisure & Entertainment > Sports (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)

VLMbench: A Compositional Benchmark for Vision-and-Language Manipulation

Neural Information Processing Systems | Apr-24-2026, 08:55:05 GMT

Benefiting from language flexibility and compositionality, humans naturally intend to use language to command an embodied agent for complex tasks such as navigation and object manipulation. In this work, we aim to fill the blank of the last mile of embodied agents: object manipulation by following human guidance, e.g., "move the red mug next to the box while keeping it upright." To this end, we introduce an Automatic Manipulation Solver (AMSolver) system and build a Vision-and-Language Manipulation benchmark (VLMbench) based on it, containing various language instructions on categorized robotic manipulation tasks. Specifically, modular rule-based task templates are created to automatically generate robot demonstrations with language instructions, consisting of diverse object shapes and appearances, action types, and motion constraints. We also develop a keypoint-based model, 6D-CLIPort, to deal with multi-view observations and language input and to output a sequence of 6 degrees of freedom (DoF) actions. We hope the new simulator and benchmark will facilitate future research on language-guided robotic manipulation.

artificial intelligence, constraint, language instruction, (12 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.54)

References

Neural Information Processing Systems | Apr-24-2026, 08:13:30 GMT

On the construction of almost uniformly convergent random variables with given weakly convergent image laws.

artificial intelligence, machine learning, proceedings, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

01db36a646c07c64dd39a92b4eceb417-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:38:37 GMT

artificial intelligence, machine learning, optimization, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

01d78b294d80491fecddea897cf03642-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 07:38:19 GMT

agent, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Offline Multi-Agent Reinforcement Learning with Knowledge Distillation

Neural Information Processing Systems | Apr-24-2026, 07:38:15 GMT

We introduce an offline multi-agent reinforcement learning (offline MARL) framework that utilizes previously collected data without additional online data collection. Our method reformulates offline MARL as a sequence modeling problem and thus builds on the simplicity and scalability of the Transformer architecture. In the fashion of centralized training and decentralized execution, we propose to first train a teacher policy that has privileged access to every agent's observations, actions, and rewards. After the teacher policy has identified and recombined the "good" behavior in the dataset, we create separate student policies and distill not only the teacher policy's features but also its structural relations among different agents' features to student policies. We show that our framework significantly improves performance on a range of tasks and outperforms state-of-the-art offline MARL baselines. Furthermore, we demonstrate that the proposed method has a better convergence rate, is more sample efficient, and is more robust to various demonstration qualities compared with baselines.
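A rough sketch of what distilling both features and inter-agent relations could look like is given below. The abstract does not specify the loss, so everything here is an assumption: the relation between two agents is taken to be the cosine similarity of their feature vectors, both terms are mean squared errors, and the names `relation_matrix`, `distillation_loss`, and `relation_weight` are hypothetical.

```python
import math


def _cosine(u, v):
    # Cosine similarity between two feature vectors (0.0 if either is all zeros).
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def relation_matrix(features):
    # Assumed "structural relation": pairwise cosine similarity between agents.
    return [[_cosine(u, v) for v in features] for u in features]


def distillation_loss(teacher, student, relation_weight=1.0):
    """teacher[n] and student[n] are feature vectors for agent n (same shape)."""
    n_agents = len(teacher)
    dim = len(teacher[0])
    # Term 1: match each student's features to the teacher's features.
    feature_term = sum(
        (t - s) ** 2
        for t_feat, s_feat in zip(teacher, student)
        for t, s in zip(t_feat, s_feat)
    ) / (n_agents * dim)
    # Term 2: match the pairwise relations among agents' features.
    rel_t = relation_matrix(teacher)
    rel_s = relation_matrix(student)
    relation_term = sum(
        (a - b) ** 2
        for row_t, row_s in zip(rel_t, rel_s)
        for a, b in zip(row_t, row_s)
    ) / (n_agents * n_agents)
    return feature_term + relation_weight * relation_term


teacher_features = [[1.0, 0.0], [0.0, 1.0]]
student_features = [[1.0, 0.0], [1.0, 0.0]]
loss = distillation_loss(teacher_features, student_features)
```

The loss is zero when student features equal teacher features. The relation term also penalizes students whose features are individually plausible but related to one another differently than in the teacher.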

distillation, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

On Sample Optimality in Personalized Collaborative and Federated Learning

Neural Information Processing Systems | Apr-24-2026, 07:38:01 GMT

In personalized federated learning, each member of a potentially large set of agents aims to train a model minimizing its loss function averaged over its local data distribution. We study this problem through the lens of stochastic optimization, focusing on a scenario with a large number of agents that each possess very few data samples from their local data distribution. Specifically, we prove novel matching lower and upper bounds on the number of samples required from all agents to approximately minimize the generalization error of a fixed agent. We provide strategies matching these lower bounds, based on a gradient filtering approach: given prior knowledge on some notion of distance between local data distributions, agents filter and aggregate stochastic gradients received from other agents, in order to achieve an optimal bias-variance trade-off. Finally, we quantify the impact of using rough estimations of the distances between local distributions of agents, based on a very small number of local samples.
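The gradient filtering step described above might be sketched as follows. This is an illustrative simplification, not the paper's estimator: it assumes a hard threshold `radius` on the known distances and a plain average of the surviving gradients, and the name `filtered_gradient` is hypothetical.

```python
def filtered_gradient(gradients, distances, radius):
    """Average the gradients of agents whose distribution is close enough.

    gradients[k] is the stochastic gradient received from agent k, and
    distances[k] is the (assumed known) distance between agent k's local
    distribution and the target agent's. A larger radius averages over more
    agents (lower variance) but admits more dissimilar ones (higher bias).
    """
    kept = [g for g, d in zip(gradients, distances) if d <= radius]
    if not kept:
        raise ValueError("no agent within the filtering radius")
    dim = len(kept[0])
    return [sum(g[j] for g in kept) / len(kept) for j in range(dim)]


received = [[1.0, 2.0], [3.0, 4.0], [100.0, -100.0]]
known_distances = [0.0, 0.5, 5.0]
aggregated = filtered_gradient(received, known_distances, radius=1.0)
```

Here the third agent is filtered out because its distance exceeds the radius, so `aggregated` is the mean of the first two gradients. Choosing the radius is where the bias-variance trade-off mentioned in the abstract enters.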

agent, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology: