AITopics | exploration cost

AgensFlow: A Coordination-Policy Substrate for Multi-Agent Systems

Koenigstein, Nicole

arXiv.org Machine Learning, May-28-2026

Multi-agent systems built on large language models (LLMs) require many coordination choices that are difficult to fix a priori: which skill protocol to invoke, which agent role should perform a subtask, which model to bind to each role, how roles should interact, when to use retrieval or verification, and when to omit a step entirely. These choices interact with task regime and operational constraints, so static pipelines and one-off model comparisons provide only a limited view of the design space. This paper introduces AgensFlow, an open-source framework that treats multi-agent coordination as an online policy-learning problem under partial observability. The framework makes coordination decisions observable and learnable from repeated trajectories, rather than treating skill, role, model, topology, and evaluation choices as fixed pipeline design. AgensFlow is evaluated on two corpora: distributed-systems incident tasks and security-advisory tasks. The evaluation shows three main results: learned routing reaches a higher-quality operating point than a fixed pipeline baseline on coordination-heavy classes; skip:X isolates topology compression as a meaningful part of the substrate; and warm-started policy graphs can reduce exploration cost while preserving plateau quality. Overall, the results support that learned, auditable routing can improve coordination-heavy multi-agent workflows over static wiring.
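The idea of learning coordination choices online from repeated trajectories can be illustrated with a very small sketch. This is not AgensFlow's algorithm or API: it swaps in a plain epsilon-greedy bandit, ignores partial observability, and uses hypothetical names throughout (`RoutingBandit`, the option labels, the `prior` argument). The `prior` argument stands in for a warm start, seeding each option with a pseudo-count and a mean quality so that less exploration is needed.

```python
import random


class RoutingBandit:
    """Epsilon-greedy choice among coordination options (illustrative only)."""

    def __init__(self, options, epsilon=0.1, prior=None, seed=0):
        self.options = list(options)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # prior maps option -> (pseudo_count, mean_quality); hypothetical warm start.
        prior = prior or {}
        self.count = {o: prior.get(o, (0, 0.0))[0] for o in self.options}
        self.mean = {o: prior.get(o, (0, 0.0))[1] for o in self.options}

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best-known option.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.options)
        return max(self.options, key=lambda o: self.mean[o])

    def update(self, option, quality):
        # Incremental running mean of the observed quality for this option.
        self.count[option] += 1
        self.mean[option] += (quality - self.mean[option]) / self.count[option]
```

A caller would alternate `choose()` and `update(option, quality)` once per task, where `quality` is whatever score the evaluation assigns to the finished trajectory.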

artificial intelligence, policy graph, signature, (15 more...)

2605.27466

Genre: Workflow (0.88)

Industry:

Information Technology > Security & Privacy (0.34)
Education (0.34)
Health & Medicine (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Explicit Planning for Efficient Exploration in Reinforcement Learning

Liangpeng Zhang, Ke Tang, Xin Yao

Neural Information Processing Systems, Feb-19-2026, 12:35:05 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Country:

Europe > United Kingdom (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.94)

Industry:

Health & Medicine (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Collaboration (0.65)
Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.31)

5b9bef4eae0f574cedbf9f4bf29d8ae7-Supplemental-Conference.pdf

Neural Information Processing Systems, Oct-9-2025, 15:49:57 GMT

agent, algorithm, identification, (15 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Explicit Planning for Efficient Exploration in Reinforcement Learning

Liangpeng Zhang, Ke Tang, Xin Yao

Neural Information Processing Systems, Oct-9-2025, 13:59:53 GMT

A straightforward example is as follows.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States (0.69)
Asia > China > Guangdong Province (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Including such an analysis

Neural Information Processing Systems, Oct-2-2025, 16:08:45 GMT

This is a clear example of exploration-then-exploitation behaviour with exactly one phase change in the process.

artificial intelligence, reviewer, reward trap, (14 more...)

Technology: Information Technology > Artificial Intelligence (0.31)

5b9bef4eae0f574cedbf9f4bf29d8ae7-Paper-Conference.pdf

Neural Information Processing Systems, Aug-15-2025, 02:20:10 GMT

agent, algorithm, identification, (10 more...)

Country:

Europe > United Kingdom (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.94)

Industry:

Health & Medicine (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.69)
Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.31)

Reviews: Safe Exploration for Interactive Machine Learning

Neural Information Processing Systems, Jan-23-2025, 14:35:58 GMT

This paper considers the safe exploration problem in both (Bayesian, Gaussian Process) optimization and reinforcement learning settings. In this work, as with some previous works, which states are safe is treated as unknown, but it is assumed that safety is determined by a sufficiently smooth constraint function, so that evaluating (exploring) a point may be adequate to ensure that nearby points are also safe on account of smoothness. Perhaps the most significant aspect of this work is the way the problem is formulated. Some previous works allowed unsafe exploration, provided that a near-optimal safe point could be identified; other works treated safe exploration as the sole objective, with finding the optimal point within the safe region as an afterthought. The former model is inappropriate for many reinforcement learning applications in which the learning may happen on-line in a live robotic platform and safety must be ensured during execution; the latter model is simply inefficient, which is in a sense the focus of the evaluation in this work.
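The smoothness argument in the review can be made concrete with a simplified sketch. It is not the paper's method: it replaces the Gaussian Process model with a known Lipschitz constant, works in one dimension, and assumes "safe" means the constraint function g satisfies g(x) >= 0. The function name `certified_safe` and its parameters are hypothetical. If g is L-Lipschitz and g(x) has been observed, then g(c) >= g(x) - L * |c - x|, so any candidate c for which that lower bound is non-negative is safe without being evaluated.

```python
def certified_safe(candidates, observations, lipschitz):
    """Return the candidates guaranteed safe (g >= 0) by earlier evaluations.

    candidates:   iterable of 1-D points not yet evaluated.
    observations: list of (x, g_x) pairs already evaluated.
    lipschitz:    assumed known Lipschitz constant L of the constraint g.
    """
    safe = []
    for c in candidates:
        # A single observation whose lower bound on g(c) is >= 0 is enough.
        if any(g_x - lipschitz * abs(c - x) >= 0 for x, g_x in observations):
            safe.append(c)
    return safe
```

For example, with one observation g(1.0) = 1.0 and L = 2, the certified region is the interval of radius 0.5 around 1.0, so `certified_safe([0.0, 0.5, 1.0, 1.5, 2.0], [(1.0, 1.0)], 2.0)` returns `[0.5, 1.0, 1.5]`. A safe optimizer would then choose its next evaluation only from this set, which grows as new safe points are observed.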

algorithm, interactive machine learning, optimization algorithm, (10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
