AITopics | Agents

Collaborating Authors

Agents

News Overviews Instructional Materials AI-Alerts Classics

An empirical learning-based validation procedure for simulation workflow

Liu, Zhuqing, Lai, Liyuanjun, Zhang, Lin

arXiv.org Machine LearningSep-10-2018

Simulation workflow is a top-level model for the design and control of simulation process. It connects multiple simulation components with time and interaction restrictions to form a complete simulation system. Before the construction and evaluation of the component models, the validation of upper-layer simulation workflow is of the most importance in a simulation system. However, the methods especially for validating simulation workflow is very limit. Many of the existing validation techniques are domain-dependent with cumbersome questionnaire design and expert scoring. Therefore, this paper present an empirical learning-based validation procedure to implement a semi-automated evaluation for simulation workflow. First, representative features of general simulation workflow and their relations with validation indices are proposed. The calculation process of workflow credibility based on Analytic Hierarchy Process (AHP) is then introduced. In order to make full use of the historical data and implement more efficient validation, four learning algorithms, including back propagation neural network (BPNN), extreme learning machine (ELM), evolving new-neuron (eNFN) and fast incremental gaussian mixture model (FIGMN), are introduced for constructing the empirical relation between the workflow credibility and its features. A case study on a landing-process simulation workflow is established to test the feasibility of the proposed procedure. The experimental results also provide some useful overview of the state-of-the-art learning algorithms on the credibility evaluation of simulation models.

artificial intelligence, machine learning, simulation workflow, (15 more...)

arXiv.org Machine Learning

1809.04441

Country: North America > United States (0.28)

Genre: Workflow (1.00)

Industry: Transportation (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines

Schmid, Martin, Burch, Neil, Lanctot, Marc, Moravcik, Matej, Kadlec, Rudolf, Bowling, Michael

arXiv.org Artificial IntelligenceSep-9-2018

Learning strategies for imperfect information games from samples of interaction is a challenging problem. A common method for this setting, Monte Carlo Counterfactual Regret Minimization (MCCFR), can have slow long-term convergence rates due to high variance. In this paper, we introduce a variance reduction technique (VR-MCCFR) that applies to any sampling variant of MCCFR. Using this technique, per-iteration estimated values and updates are reformulated as a function of sampled values and state-action baselines, similar to their use in policy gradient reinforcement learning. The new formulation allows estimates to be bootstrapped from other estimates within the same episode, propagating the benefits of baselines along the sampled trajectory; the estimates remain unbiased even when bootstrapping from other estimates. Finally, we show that given a perfect baseline, the variance of the value estimates can be reduced to zero. Experimental evaluation shows that VR-MCCFR brings an order of magnitude speedup, while the empirical variance decreases by three orders of magnitude. The decreased variance allows for the first time CFR+ to be used with sampling, increasing the speedup to two orders of magnitude.

baseline, counterfactual value, information, (14 more...)

arXiv.org Artificial Intelligence

1809.03057

Country:

North America > United States > Texas (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)

Add feedback

Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games

Farina, Gabriele, Kroer, Christian, Sandholm, Tuomas

arXiv.org Artificial IntelligenceSep-9-2018

Regret minimization is a powerful tool for solving large-scale extensive-form games. State-of-the-art methods rely on minimizing regret locally at each decision point. In this work we derive a new framework for regret minimization on sequential decision problems and extensive-form games with general compact convex sets at each decision point and general convex losses, as opposed to prior work which has been for simplex decision points and linear losses. We call our framework laminar regret decomposition. It generalizes the CFR algorithm to this more general setting. Furthermore, our framework enables a new proof of CFR even in the known setting, which is derived from a perspective of decomposing polytope regret, thereby leading to an arguably simpler interpretation of the algorithm. Our generalization to convex compact sets and convex losses allows us to develop new algorithms for several problems: regularized sequential decision making, regularized Nash equilibria in extensive-form games, and computing approximate extensive-form perfect equilibria. Our generalization also leads to the first regret-minimization algorithm for computing reduced-normal-form quantal response equilibria based on minimizing local regrets. Experiments show that our framework leads to algorithms that scale at a rate comparable to the fastest variants of counterfactual regret minimization for computing Nash equilibrium, and therefore our approach leads to the first algorithm for computing quantal response equilibria in extremely large games. Finally we show that our framework enables a new kind of scalable opponent exploitation approach.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1809.03075

Country: North America > United States > New Jersey (0.28)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

Unity: A General Platform for Intelligent Agents

Juliani, Arthur, Berges, Vincent-Pierre, Vckay, Esh, Gao, Yuan, Henry, Hunter, Mattar, Marwan, Lange, Danny

arXiv.org Machine LearningSep-7-2018

Recent advances in Deep Reinforcement Learning and Robotics have been driven by the presence of increasingly realistic and complex simulation environments. Many of the existing platforms, however, provide either unrealistic visuals, inaccurate physics, low task complexity, or a limited capacity for interaction among artificial agents. Furthermore, many platforms lack the ability to flexibly configure the simulation, hence turning the simulation environment into a black-box from the perspective of the learning system. Here we describe a new open source toolkit for creating and interacting with simulation environments using the Unity platform: Unity ML-Agents Toolkit. By taking advantage of Unity as a simulation platform, the toolkit enables the development of learning environments which are rich in sensory and physical complexity, provide compelling cognitive challenges, and support dynamic multi-agent interaction. We detail the platform design, communication protocol, set of example environments, and variety of training scenarios made possible via the toolkit.

machine learning, platform, reinforcement learning, (17 more...)

arXiv.org Machine Learning

1809.02627

Country:

Europe > Sweden > Skåne County > Malmö (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Summary Description of the A2RD Project

Braga, Juliao, Silva, Joao Nuno, Endo, Patricia Takako, Omar, Nizam

arXiv.org Artificial IntelligenceSep-6-2018

This paper describes the Autonomous Architecture Over Restricted Domains project. It begins with the description of the context upon which the project is focused, and in the sequence describes the project and implementation models. It finish by presenting the environment conceptual model, showing where stand the components, inputs and facilities required to interact among the intelligent agents of the various implementations in their respective and restricted, routing domains (Autonomous Systems) which together make the Internet work.

artificial intelligence, controller, technical report, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.13140/RG.2.2.33386.57281

1808.09293

Country:

Europe > Portugal > Braga > Braga (0.10)
South America > Brazil > São Paulo (0.04)
South America > Brazil > Rio Grande do Norte > Natal (0.04)
(5 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.90)

Add feedback

A tutorial on Particle Swarm Optimization Clustering

Ballardini, Augusto Luis

arXiv.org Artificial IntelligenceSep-6-2018

This paper proposes a tutorial on the Data Clustering technique using the Particle Swarm Optimization approach. Following the work proposed by Merwe et al. [1] here we present an in-deep analysis of the algorithm together with a Matlab implementation and a short tutorial that explains how to modify the proposed implementation and the effect of the parameters of the original algorithm. Moreover, we provide a comparison against the results obtained using the well known K-Means approach. All the source code presented in this paper is publicly available under the GPL-v2 license.

evolutionary algorithm, machine learning, particle, (18 more...)

arXiv.org Artificial Intelligence

1809.01942

Country:

Asia > India > Tamil Nadu > Chennai (0.04)
Africa > South Africa > Gauteng > Pretoria (0.04)

Genre:

Research Report (0.64)
Instructional Material > Course Syllabus & Notes (0.55)

Industry: Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Community Regularization of Visually-Grounded Dialog

Agarwal, Akshat, Gurumurthy, Swaminathan, Sharma, Vasu, Lewis, Mike, Sycara, Katia

arXiv.org Artificial IntelligenceSep-6-2018

The task of conducting visually grounded dialog involves learning goal-oriented cooperative dialog between autonomous agents who exchange information about a scene through several rounds of questions and answers in natural language. We posit that requiring artificial agents to adhere to the rules of human language, while also requiring them to maximize information exchange through dialog is an ill-posed problem. We observe that humans do not stray from a common language because they are social creatures who live in communities, and have to communicate with many people everyday, so it is far easier to stick to a common language even at the cost of some efficiency loss. Using this as inspiration, we propose and evaluate a multi-agent community-based dialog framework where each agent interacts with, and learns from, multiple agents, and show that this community-enforced regularization results in more relevant and coherent dialog (as judged by human evaluators) without sacrificing task performance (as judged by quantitative metrics).

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1808.04359

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback

Planning with Arithmetic and Geometric Attributes

Folqué, David, Sukhbaatar, Sainbayar, Szlam, Arthur, Bruna, Joan

arXiv.org Artificial IntelligenceSep-6-2018

A desirable property of an intelligent agent is its ability to understand its environment to quickly generalize to novel tasks and compose simpler tasks into more complex ones. If the environment has geometric or arithmetic structure, the agent should exploit these for faster generalization. Building on recent work that augments the environment with user-specified attributes, we show that further equipping these attributes with the appropriate geometric and arithmetic structure brings substantial gains in sample complexity.

artificial intelligence, machine learning, transition, (15 more...)

arXiv.org Artificial Intelligence

1809.02031

Country:

North America > United States > New York (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.54)

Add feedback

A Roadmap for the Value-Loading Problem

Hoang, Lê Nguyên

arXiv.org Artificial IntelligenceSep-4-2018

We analyze the value-loading problem. This is the problem of encoding moral values into an AI agent interacting with a complex environment. Like many before, we argue that this is both a major concern and an extremely challenging problem. Solving it will likely require years, if not decades, of multidisciplinary work by teams of top scientists and experts. Given how uncertain the timeline of human-level AI research is, we thus argue that a pragmatic partial solution should be designed as soon as possible. To this end, we propose a preliminary research program. This roadmap identifies several key steps. We hope that this will allow scholars, engineers and decision-makers to better grasp the upcoming difficulties, and to foresee how they can best contribute to the global effort.

artificial intelligence, machine learning, value-loading problem, (15 more...)

arXiv.org Artificial Intelligence

1809.01036

Country:

North America > United States > New York (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Africa > Sudan (0.04)

Genre: Research Report (0.81)

Industry:

Education (0.67)
Law (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.67)
(3 more...)

Add feedback

Changing Observations in Epistemic Temporal Logic

Barrière, Aurèle, Maubert, Bastien, Murano, Aniello, Rubin, Sasha

arXiv.org Artificial IntelligenceSep-3-2018

We study dynamic changes of agents' observational power in logics of knowledge and time. We consider CTL*K, the extension of CTL* with knowledge operators, and enrich it with a new operator that models a change in an agent's way of observing the system. We extend the classic semantics of knowledge for perfect-recall agents to account for changes of observation, and we show that this new operator strictly increases the expressivity of CTL*K. We reduce the model-checking problem for our logic to that for CTL*K, which is known to be decidable. This provides a solution to the model-checking problem for our logic, but its complexity is not optimal. Indeed we provide a direct decision procedure with better complexity.

artificial intelligence, logic & formal reasoning, logic programming, (18 more...)

arXiv.org Artificial Intelligence

1805.06881

Country:

Europe > Italy > Campania > Naples (0.04)
Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.78)

Add feedback