 Agents


Applications of Multi-Agent Reinforcement Learning in Future Internet: A Comprehensive Survey

arXiv.org Artificial Intelligence

The future Internet involves several emerging technologies such as 5G and beyond-5G networks, vehicular networks, unmanned aerial vehicle (UAV) networks, and the Internet of Things (IoT). Moreover, the future Internet is becoming heterogeneous and decentralized, with a large number of network entities involved. Each entity may need to make its own local decisions to improve network performance under dynamic and uncertain network environments. Standard learning algorithms such as single-agent Reinforcement Learning (RL) or Deep Reinforcement Learning (DRL) have recently been used to enable each network entity, acting as an agent, to learn an optimal decision-making policy adaptively by interacting with the unknown environment. However, such algorithms fail to model the cooperation or competition among network entities and simply treat other entities as part of the environment, which may result in the non-stationarity issue. Multi-agent Reinforcement Learning (MARL) allows each network entity to learn its optimal policy by observing not only the environment but also other entities' policies. As a result, MARL can significantly improve the learning efficiency of the network entities, and it has recently been used to solve various issues in emerging networks. In this paper, we therefore review the applications of MARL in emerging networks. In particular, we provide a tutorial on MARL and a comprehensive survey of applications of MARL in the next-generation Internet. We first introduce single-agent RL and MARL. Then, we review a number of applications of MARL to emerging issues in the future Internet. These issues include network access, transmit power control, computation offloading, content caching, packet routing, trajectory design for UAV-aided networks, and network security.
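As a concrete reference point for the single-agent RL mentioned in this abstract, the following is a minimal sketch of a tabular Q-learning update, one standard single-agent algorithm. It is not taken from the survey: the function names `make_q_table` and `q_update`, the state labels, and the values of `alpha` and `gamma` are illustrative assumptions. When several network entities each run such an update independently, every other entity is folded into the environment, which is the source of the non-stationarity the abstract describes.

```python
from collections import defaultdict


def make_q_table(n_actions):
    # Hypothetical helper: unseen states start with all-zero action values.
    return defaultdict(lambda: [0.0] * n_actions)


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(state, action).

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values from the survey.
    """
    # Bootstrapped target: immediate reward plus discounted best next value.
    td_target = reward + gamma * max(q[next_state])
    # Move the current estimate a fraction alpha toward the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


if __name__ == "__main__":
    q = make_q_table(n_actions=2)
    print(q_update(q, "s0", 1, reward=1.0, next_state="s1"))
```

An agent that sees the same transition twice moves its estimate halfway toward the target each time, so with these defaults the value goes from 0.0 to 0.5 and then to 0.75.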


Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee

arXiv.org Artificial Intelligence

The growing literature on Federated Learning (FL) has recently inspired Federated Reinforcement Learning (FRL), which encourages multiple agents to federatively build a better decision-making policy without sharing raw trajectories. Despite its promising applications, existing work on FRL fails to I) provide a theoretical analysis of its convergence, and II) account for random system failures and adversarial attacks. To this end, we propose the first FRL framework whose convergence is guaranteed and which tolerates fewer than half of the participating agents suffering random system failures or acting as adversarial attackers. We prove that the sample efficiency of the proposed framework is guaranteed to improve with the number of agents and is able to account for such potential failures or attacks. All theoretical results are empirically verified on various RL benchmark tasks.
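The abstract does not say how the framework combines the agents' contributions, so the sketch below does not reproduce the paper's method. It swaps in a standard robust aggregator, the coordinate-wise median, to illustrate why a bound of "fewer than half" faulty agents is natural: as long as honest agents form a strict majority, each coordinate of the median stays within the range of honest values. The function name `median_aggregate` and the example vectors are assumptions made for illustration.

```python
from statistics import median


def median_aggregate(updates):
    """Combine per-agent update vectors with a coordinate-wise median.

    This is a generic robust aggregator, used here as a stand-in; it is
    not claimed to be the rule used in the paper.
    """
    if not updates:
        raise ValueError("need at least one update")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same length")
    # Take the median of each coordinate across agents.
    return [median(u[k] for u in updates) for k in range(dim)]


if __name__ == "__main__":
    honest = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9]]
    faulty = [[1000.0, -1000.0], [1000.0, -1000.0]]  # 2 of 5 agents corrupted
    print(median_aggregate(honest + faulty))
```

With three honest and two corrupted agents the aggregate remains inside the honest range in both coordinates, whereas a plain average would be pulled far outside it.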


Iterative Teacher-Aware Learning

arXiv.org Artificial Intelligence

In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency. The teacher adjusts her teaching method for different students, and the student, after becoming familiar with the teacher's instruction mechanism, can infer the teacher's intention and learn faster. Recently, the benefits of integrating this cooperative pedagogy into machine concept learning in discrete spaces have been shown in multiple works. However, how cooperative pedagogy can facilitate machine parameter learning has not been thoroughly studied. In this paper, we propose a gradient-optimization-based teacher-aware learner that can incorporate the teacher's cooperative intention into the likelihood function and learn provably faster than the naive learning algorithms used in previous machine teaching works. We give a theoretical proof that the iterative teacher-aware learning (ITAL) process leads to local and global improvements. We then validate our algorithms with extensive experiments on various tasks, including regression, classification, and inverse reinforcement learning, using synthetic and real data. We also show the advantage of modeling teacher-awareness when agents are learning from human teachers.


Towards Realistic Market Simulations: a Generative Adversarial Networks Approach

arXiv.org Artificial Intelligence

Simulated environments are increasingly used by trading firms and investment banks to evaluate trading strategies before approaching real markets. Backtesting, a widely used approach, consists of simulating experimental strategies while replaying historical market scenarios. Unfortunately, this approach does not capture the market response to the experimental agents' actions. In contrast, multi-agent simulation presents a natural bottom-up approach to emulating agent interaction in financial markets. It allows pools of traders with diverse strategies to be set up to mimic the financial market trader population and to test the performance of new experimental strategies. Since individual agent-level historical data is typically proprietary and not available for public use, it is difficult to calibrate multiple market agents to obtain the realism required for testing trading strategies. To address this challenge, we propose a synthetic market generator based on Conditional Generative Adversarial Networks (CGANs) trained on real aggregate-level historical data. A CGAN-based "world" agent can generate meaningful orders in response to an experimental agent. We integrate our synthetic market generator into ABIDES, an open-source simulator of financial markets. By means of extensive simulations, we show that our proposal outperforms previous work in terms of stylized facts reflecting market responsiveness and realism.


Decomposed Inductive Procedure Learning

arXiv.org Artificial Intelligence

Recent advances in machine learning have made it possible to train artificially intelligent agents that perform with super-human accuracy on a great diversity of complex tasks. However, training these capabilities often requires millions of annotated examples, far more than humans typically need to achieve a passing level of mastery on similar tasks. Thus, while contemporary methods in machine learning can produce agents that exhibit super-human performance, their rate of learning per opportunity in many domains is decidedly lower than that of human learners. In this work we formalize a theory of Decomposed Inductive Procedure Learning (DIPL) that outlines how different forms of inductive symbolic learning can be used in combination to build agents that learn educationally relevant tasks, such as mathematical and scientific procedures, at a rate similar to human learners. We motivate the construction of this theory along Marr's concepts of the computational, algorithmic, and implementation levels of cognitive modeling, and outline at the computational level six learning capacities that must be achieved to accurately model human learning. We demonstrate that agents built along the DIPL theory are amenable to satisfying these capacities, and demonstrate, both empirically and theoretically, that DIPL enables the creation of agents that exhibit human-like learning performance.


Optimal Auction Design for the Gradual Procurement of Strategic Service Provider Agents

arXiv.org Artificial Intelligence

We consider an outsourcing problem where a software agent procures multiple services from providers with uncertain reliabilities to complete a computational task before a strict deadline. The service consumer requires a procurement strategy that achieves the optimal balance between success probability and invocation cost. However, the service providers are self-interested and may misrepresent their private cost information if it benefits them. For such settings, we design a novel procurement auction that provides the consumer with the highest possible revenue, while giving providers sufficient incentives to tell the truth about their costs. This auction creates a contingent plan for gradual service procurement that suggests recruiting a new provider only when the success probability of the already hired providers drops below a time-dependent threshold. To make this auction incentive compatible, we propose a novel weighted threshold payment scheme which pays the minimum among all truthful mechanisms. Using the weighted payment scheme, we also design a low-complexity near-optimal auction that reduces the computational complexity of the optimal mechanism by 99% with only marginal performance loss (less than 1%). We demonstrate the effectiveness and strength of our proposed auctions through both game-theoretic and numerical analysis. The experimental results confirm that the proposed auctions exhibit a 59% improvement in performance over the current state of the art, by increasing success probability up to 79% and reducing invocation cost by up to 11%.


Complete Agent-driven Model-based System Testing for Autonomous Systems

arXiv.org Artificial Intelligence

In this position paper, a novel approach to testing complex autonomous transportation systems (ATS) in the automotive, avionic, and railway domains is described. It is intended to mitigate some of the most critical problems regarding verification and validation (V&V) effort for ATS. V&V is known to become infeasible for complex ATS when only conventional methods are used. The approach advocated here uses complete testing methods on the module level, because these establish formal proofs for the logical correctness of the software. Having established logical correctness, system-level tests are performed in simulated cloud environments and on the target system. To give evidence that 'sufficiently many' system tests have been performed with the target system, a formally justified coverage criterion is introduced. To optimise the execution of very large system test suites, we advocate an online testing approach where multiple tests are executed in parallel, and test steps are identified on-the-fly. The coordination and optimisation of these executions is achieved by an agent-based approach. Each aspect of the testing approach advocated here is either shown to be consistent with existing standards for development and V&V of safety-critical transportation systems, or accompanied by a justification of why it should become acceptable in future revisions of the applicable standards.


Foresight of Graph Reinforcement Learning Latent Permutations Learnt by Gumbel Sinkhorn Network

arXiv.org Artificial Intelligence

Cooperation is vitally important in multi-agent environments, and some reinforcement learning algorithms combined with graph neural networks have therefore been proposed to capture the mutual interplay between agents. However, highly complicated and dynamic multi-agent environments require more ingenious graph neural networks, which can comprehensively represent not only the graph topology but also the evolution of that topology as agents emerge, disappear, and move. To tackle these difficulties, we propose Gumbel Sinkhorn graph attention reinforcement learning, in which a graph attention network represents the underlying graph topology of the multi-agent environment and, by learning latent permutations with the help of a Gumbel Sinkhorn network, adapts better to the dynamic topology of the graph. Empirically, simulation results show that our proposed graph reinforcement learning methodology outperforms existing methods in the PettingZoo multi-agent environment by learning latent permutations.
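The abstract does not describe the network architecture, so the following is only a sketch of the generic Gumbel-Sinkhorn operator that the name refers to, not the paper's model. The operator perturbs a square score matrix with Gumbel noise, divides by a temperature, and then alternates row and column normalization in log space, which pushes the result toward a doubly stochastic matrix (a relaxed permutation). The function name, the parameters `tau`, `n_iters`, and `noise_scale`, and the example scores are illustrative assumptions; only the Python standard library is used.

```python
import math
import random


def _logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gumbel_sinkhorn(log_alpha, tau=1.0, n_iters=20, noise_scale=1.0, rng=None):
    """Return a relaxed permutation matrix for square scores log_alpha.

    tau is the temperature and noise_scale scales the Gumbel noise
    (0.0 disables it). All parameter names are illustrative.
    """
    rng = rng or random.Random(0)
    n = len(log_alpha)

    def gumbel():
        # Standard Gumbel sample; clamp u away from 0 to keep log finite.
        u = max(rng.random(), 1e-12)
        return -math.log(-math.log(u))

    x = [[(log_alpha[i][j] + noise_scale * gumbel()) / tau for j in range(n)]
         for i in range(n)]
    for _ in range(n_iters):
        # Row normalization in log space.
        for i in range(n):
            z = _logsumexp(x[i])
            x[i] = [v - z for v in x[i]]
        # Column normalization in log space.
        for j in range(n):
            z = _logsumexp([x[i][j] for i in range(n)])
            for i in range(n):
                x[i][j] -= z
    return [[math.exp(v) for v in row] for row in x]


if __name__ == "__main__":
    scores = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
    for row in gumbel_sinkhorn(scores, tau=0.5):
        print([round(v, 3) for v in row])
```

Lower temperatures make the output closer to a hard permutation, at the cost of slower convergence of the normalization loop.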


Using Psychological Characteristics of Situations for Social Situation Comprehension in Support Agents

arXiv.org Artificial Intelligence

Support agents that help users in their daily lives need to take into account not only the user's characteristics, but also the social situation of the user. Existing work on including social context uses some type of situation cue as an input to information processing techniques in order to assess the expected behavior of the user. However, research shows that it is important to also determine the meaning of a situation, a step which we refer to as social situation comprehension. We propose using psychological characteristics of situations, which have been proposed in social science for ascribing meaning to situations, as the basis for social situation comprehension. Using data from user studies, we evaluate this proposal from two perspectives. First, from a technical perspective, we show that psychological characteristics of situations can be used as input to predict the priority of social situations, and that psychological characteristics of situations can be predicted from the features of a social situation. Second, we investigate the role of the comprehension step in human-machine meaning making. We show that psychological characteristics can be successfully used as a basis for explanations given to users about the decisions of an agenda management personal assistant agent.


Gapoera: Application Programming Interface for AI Environment of Indonesian Board Game

arXiv.org Artificial Intelligence

The development of computer games has surged in recent years. The ease and speed of internet access today have also influenced the development of computer games, especially those played online. Internet technology has allowed computer games to be played in multiplayer mode. Interaction between players in a computer game can be built in several ways, one of which is by providing balanced opponents. Opponents can be developed using intelligent agents. Meanwhile, research on developing intelligent agents is also growing rapidly. In computer game development, one of the easiest ways to measure the performance of an intelligent agent is to develop a virtual environment that allows the intelligent agent to interact with other players. In this research, we develop an intelligent agent and a virtual environment for board games. To make them easily accessible, the intelligent agent and virtual environment are exposed as an Application Programming Interface (API) service called Gapoera API. The Gapoera API service is expected to help game developers build a game without having to think much about the artificial intelligence that will be embedded in it. The service provides a basic multilevel intelligent agent that lets users play board games commonly played in Indonesia. Although the Gapoera API can be used for various types of games, in this paper we focus on a popular traditional board game in Indonesia, namely Mancala. The test results show that the multilevel agent concept works as expected. In addition, the Gapoera API service has been accessed successfully from several game platforms.