 Agents


Non-Bayesian Social Learning with Uncertain Models

arXiv.org Artificial Intelligence

Non-Bayesian social learning theory provides a framework that models distributed inference for a group of agents interacting over a social network. In this framework, each agent iteratively forms beliefs about an unknown state of the world and communicates them to its neighbors using a learning rule. Existing approaches assume agents have access to precise statistical models (in the form of likelihoods) for the state of the world. However, in many situations, such models must be learned from finite data. We propose a social learning rule that takes into account uncertainty in the statistical models using second-order probabilities. Therefore, beliefs derived from uncertain models are sensitive to the amount of past evidence collected for each hypothesis. We characterize how well the hypotheses can be tested on a social network, as consistent or not with the state of the world. We explicitly show how the generated beliefs depend on the amount of prior evidence. Moreover, as the amount of prior evidence goes to infinity, learning occurs and is consistent with traditional social learning theory.
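For orientation, the kind of learning rule the abstract refers to can be sketched as follows. This is a minimal illustration of a classical log-linear rule with precisely known likelihoods, in which each agent geometrically averages its neighbors' beliefs and then applies a Bayesian-style update with its own observation. It is not the uncertain-model rule proposed in the paper; the function names, the two-agent network, and all numbers are assumptions made for the example.

```python
import math

def log_linear_update(log_beliefs, weights, likelihoods, observations):
    """One round of a log-linear social learning update (illustrative).

    log_beliefs[i][k]    : log of agent i's belief in hypothesis k
    weights[i][j]        : weight agent i places on agent j (rows sum to 1)
    likelihoods[i][k][o] : agent i's assumed P(observation o | hypothesis k)
    observations[i]      : index of agent i's newest observation
    """
    n, m = len(log_beliefs), len(log_beliefs[0])
    updated = []
    for i in range(n):
        scores = []
        for k in range(m):
            # Geometric averaging of beliefs is a weighted sum in log space.
            pooled = sum(weights[i][j] * log_beliefs[j][k] for j in range(n))
            scores.append(pooled + math.log(likelihoods[i][k][observations[i]]))
        # Normalize with a log-sum-exp so beliefs never underflow to zero.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        updated.append([s - log_norm for s in scores])
    return updated

def run_demo(rounds=20):
    """Two agents, two hypotheses, and a fixed stream of observations (all 1)."""
    weights = [[0.5, 0.5], [0.5, 0.5]]
    # Hypothesis 0 says P(obs=1) = 0.8; hypothesis 1 says P(obs=1) = 0.3.
    likelihoods = [
        [[0.2, 0.8], [0.7, 0.3]],
        [[0.2, 0.8], [0.7, 0.3]],
    ]
    log_beliefs = [[math.log(0.5)] * 2 for _ in range(2)]
    for _ in range(rounds):
        log_beliefs = log_linear_update(log_beliefs, weights, likelihoods, [1, 1])
    return [[math.exp(v) for v in row] for row in log_beliefs]
```

With these made-up numbers, repeated observations of 1 push both agents' beliefs toward hypothesis 0, the hypothesis that assigns that observation the higher likelihood.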


Signal Instructed Coordination in Team Competition

arXiv.org Artificial Intelligence

Most existing models of multi-agent reinforcement learning (MARL) adopt a centralized training with decentralized execution framework. We demonstrate that the decentralized execution scheme restricts agents' capacity to find a better joint policy in team competition games, where each team of agents shares common rewards and cooperates to compete against other teams. To resolve this problem, we propose Signal Instructed Coordination (SIC), a novel coordination module that can be integrated with most existing models. SIC casts a common signal sampled from a pre-defined distribution to team members, and adopts an information-theoretic regularization to encourage agents to learn to exploit the instruction of centralized signals. Our experiments show that SIC can consistently improve team performance over well-recognized MARL models on matrix games and predator-prey games.


Velas - Virtual Expanding Learning Autonomous System

#artificialintelligence

Alex Lightman is the first columnist for ICO Crowd magazine, with 35 articles, an Amazon.com He has authored 14 crypto white papers. He has served as an advisor to 20 Blockchain companies and speaks around the world on "solving big problems with Blockchain, AI and IoT", "CryptoHistory 2009-2050", and "Visionary Blockchain Projects". Lightman was the founder and CEO of Token Communities, and became CTO after the acquisition and name change to Sakthi Global. He leads Kingsland's Executive Education program, and the 16-hour, two-day Blockchain program he authored and teaches has received 5-out-of-5-star ratings from 100% of participants.


Evolving Order and Chaos: Comparing Particle Swarm Optimization and Genetic Algorithms for Global Coordination of Cellular Automata

arXiv.org Artificial Intelligence

Anthony D. Rhodes, Portland State University. Abstract: We apply two evolutionary search algorithms, Particle Swarm Optimization (PSO) and Genetic Algorithms (GAs), to the design of Cellular Automata (CA) that can perform computational tasks requiring global coordination. In particular, we compare search efficiency for PSO and GAs applied to both the density classification problem and to the novel generation of "chaotic" CA. Our work furthermore introduces a new variant of PSO, the Binary Global-Local PSO (BGL-PSO). I. Introduction: Cellular Automata. Cellular Automata (CA) are discrete, spatially-extended dynamical systems consisting of cells, each of which contains a finite state machine. Given an initial configuration of cells, CA evolve over time by performing computations according to local rules.
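To make "evolving according to local rules" concrete, here is a minimal sketch of a one-dimensional binary CA. It assumes a neighborhood of radius 1, periodic boundaries, and Wolfram-style rule numbering; these choices and the function names are illustrative and are not taken from the paper, which may use larger neighborhoods.

```python
def ca_step(cells, rule_number):
    """Apply one synchronous update to a 1D binary CA with periodic boundaries.

    cells       : list of 0/1 cell states
    rule_number : integer 0..255 whose bit b gives the next state for the
                  neighborhood (left, center, right) read as the binary number b
    """
    n = len(cells)
    next_cells = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right
        next_cells.append((rule_number >> index) & 1)
    return next_cells

def evolve(cells, rule_number, steps):
    """Return the list of configurations from the initial one through `steps` updates."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(ca_step(history[-1], rule_number))
    return history

# Rule 232 sets each cell to the majority value of its three-cell neighborhood.
print(ca_step([0, 1, 1, 0, 1, 0, 0, 0], 232))  # [0, 1, 1, 1, 0, 0, 0, 0]
```

Rule 232 is only a local majority vote and does not solve density classification, which asks the whole lattice to converge to the globally dominant state. Search methods such as the PSO and GAs named above would explore the space of rule tables for ones that achieve that kind of global coordination.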


DataWorkshop Club Conf 2019 Machine Learning Conference Europe

#artificialintelligence

Recent years have seen a rising interest in developing AI algorithms for real-world big data domains ranging from autonomous cars to personalized assistants. At the core of these algorithms are architectures that combine deep neural networks, for approximating the underlying multidimensional state-spaces, with reinforcement learning, for controlling agents that learn to operate in said state-spaces towards achieving a given objective. The talk will first outline notable past and future efforts in deep reinforcement learning as well as identify fundamental problems that this technology has been struggling to overcome. Towards mitigating these problems (and opening up an alternative path to general artificial intelligence), I will then summarize a brain computing model of intelligence, rooted in the latest findings in neuroscience. The talk will conclude with an overview of the recent research efforts in the field of multi-agent systems, to provide the future teams of humans and agents with the necessary tools that allow them to safely co-exist.


Automatic Financial Trading Agent for Low-risk Portfolio Management using Deep Reinforcement Learning

arXiv.org Artificial Intelligence

The autonomous trading agent is one of the most actively studied areas of artificial intelligence for solving the capital market portfolio management problem. The two primary goals of the portfolio management problem are maximizing profit and restraining risk. However, most approaches to this problem consider only maximizing returns. Therefore, this paper proposes a deep reinforcement learning based trading agent that can manage the portfolio considering not only profit maximization but also risk restraint. We also propose a new target policy to allow the trading agent to learn to prefer low-risk actions. The new target policy can be reflected in the update by adjusting the greediness for the optimal action through a hyperparameter. The performance of the proposed trading agent is verified on data from the cryptocurrency market. The cryptocurrency market is the best test-ground for our trading agents because of the huge amount of data accumulated every minute and its extremely large volatility. As an experimental result, during the test period, our agents achieved a return of 1800% and provided the least risky investment strategy among the existing methods. Another experiment shows that the agent can maintain robust generalized performance even if market volatility is large or the training period is short.


Quantized Fisher Discriminant Analysis

arXiv.org Machine Learning

This paper proposes a new subspace learning method, named Quantized Fisher Discriminant Analysis (QFDA), which makes use of both machine learning and information theory. There is little literature on combining machine learning and information theory, and this paper tries to address this gap. QFDA finds a subspace which discriminates the uniformly quantized images in the Discrete Cosine Transform (DCT) domain at least as well as Fisher Discriminant Analysis (FDA) discriminates non-quantized images, even though the images have been compressed. This lets the user discard the original images and keep the compressed images instead without noticeable loss of classification accuracy. We propose a cost function whose minimization can be interpreted as rate-distortion optimization in information theory. We also propose quantized Fisherfaces for facial analysis in QFDA.


Calibrating Wayfinding Decisions in Pedestrian Simulation Models: The Entropy Map

arXiv.org Artificial Intelligence

This paper presents entropy maps, an approach to describing and visualising uncertainty among alternative potential movement intentions in pedestrian simulation models. In particular, entropy maps use a heatmap approach to show the instantaneous level of randomness in the decisions of a pedestrian agent situated at a specific point of the simulated environment. Experimental results highlighting the relevance of this tool in supporting modelers are provided and discussed. Keywords: Data Visualization · Modelling and Simulation · Stochastic Models. 1 Introduction & Related Works. Computer simulation of complex systems often employs stochastic models: implied randomness is a way to account for aspects that are potentially relevant to the overall phenomenon but cannot be explicitly considered to keep the model and the modelling phase manageable [3]. Pedestrian and crowd behaviour simulation, for instance, requires considering different kinds of decisions, taken at distinct levels of abstraction, employing heterogeneous information and knowledge about the environment, from path planning [7] to the regulation of distance from other pedestrians and obstacles present in the environment [2,8]. Exploring the implications of randomness and of situations of indecision or irresolution when choosing among alternative lines of behaviour, such as the exits from an environment in an emergency situation [10], can be a very significant step, with important implications for overall simulation results. This paper presents an approach to describing and visualising uncertainty among alternative potential movement intentions in pedestrian simulation models. As in the framework of probability theory [12], we use the concept of entropy to provide a measure of uncertainty over the simulated space. The paper, first of all, describes a general decision making model for supporting wayfinding, which comes from previous work by the authors [8,7].
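The idea of mapping decision randomness onto space can be sketched in a few lines. The example assumes the measure is Shannon entropy in bits, computed over the probabilities an agent assigns to its alternative targets at each grid cell. The grid layout, function names, and probabilities are hypothetical and are not taken from the paper.

```python
import math

def decision_entropy(probabilities):
    """Shannon entropy (in bits) of a distribution over alternative choices."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def entropy_map(choice_probabilities):
    """Compute one entropy value per cell of a 2D grid.

    choice_probabilities[row][col] is the list of probabilities with which an
    agent standing in that cell would pick each alternative (e.g. each exit).
    """
    return [[decision_entropy(cell) for cell in row] for row in choice_probabilities]

# Hypothetical 1 x 3 strip of cells with two exits: the agent is certain at
# the ends and maximally undecided in the middle.
grid = [[[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]]
values = entropy_map(grid)  # high values mark locations of indecision
```

Rendering `values` as a heatmap would then highlight the regions of the environment where the simulated pedestrian's choice is closest to random.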


Multi-Objective Multi-Agent Decision Making: A Utility-based Analysis and Survey

arXiv.org Artificial Intelligence

The majority of multi-agent system (MAS) implementations aim to optimise agents' policies with respect to a single objective, despite the fact that many real-world problem domains are inherently multi-objective in nature. Multi-objective multi-agent systems (MOMAS) explicitly consider the possible trade-offs between conflicting objective functions. We argue that, in MOMAS, such compromises should be analysed on the basis of the utility that these compromises have for the users of a system. As is standard in multi-objective optimisation, we model the user utility using utility functions that map value or return vectors to scalar values. This approach naturally leads to two different optimisation criteria: expected scalarised returns (ESR) and scalarised expected returns (SER). We develop a new taxonomy which classifies multi-objective multi-agent decision making settings on the basis of the reward structures and of which utility functions are applied and how. This allows us to offer a structured view of the field, to clearly delineate the current state-of-the-art in multi-objective multi-agent decision making approaches and to identify promising directions for future research. Starting from the execution phase, in which the selected policies are applied and the utility for the users is attained, we analyse which solution concepts apply to the different settings in our taxonomy. Furthermore, we define and discuss these solution concepts under both ESR and SER optimisation criteria. We conclude with a summary of our main findings and a discussion of many promising future research directions in multi-objective multi-agent systems.
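The difference between the two criteria shows up as soon as the utility function is non-linear. The sketch below computes both for a toy outcome distribution: ESR applies the utility to each return vector and then takes the expectation, while SER takes the expectation of the return vector first and then applies the utility. The outcome distribution and the product utility are assumptions chosen for illustration only.

```python
def expected_vector(outcomes):
    """Component-wise expectation of return vectors.

    outcomes is a list of (probability, return_vector) pairs.
    """
    dims = len(outcomes[0][1])
    return tuple(sum(p * r[d] for p, r in outcomes) for d in range(dims))

def esr(outcomes, utility):
    """Expected scalarised returns: E[u(R)]."""
    return sum(p * utility(r) for p, r in outcomes)

def ser(outcomes, utility):
    """Scalarised expected returns: u(E[R])."""
    return utility(expected_vector(outcomes))

# Hypothetical policy: half the time it earns (2, 0), otherwise (0, 2).
outcomes = [(0.5, (2.0, 0.0)), (0.5, (0.0, 2.0))]
product_utility = lambda r: r[0] * r[1]  # rewards balance between objectives

print(esr(outcomes, product_utility))  # 0.0: every single episode is unbalanced
print(ser(outcomes, product_utility))  # 1.0: the average return (1, 1) is balanced
```

Under a linear utility the two values coincide, so the distinction only matters for non-linear utilities.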


A Reinforcement Learning Based Approach for Joint Multi-Agent Decision Making

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) is being increasingly applied to optimize complex functions that may have a stochastic component. Under the umbrella of Multi-Agent RL (MARL), RL is extended to multi-agent systems to find policies that optimize systems requiring agents to coordinate or to compete. A crucial factor in the success of RL is that the optimization problem is represented as the expected sum of rewards, which allows the use of backward induction for the solution. However, many real-world problems require a joint objective that is non-linear, so dynamic programming cannot be applied directly. For example, in a resource allocation problem, one of the objectives is to maximize long-term fairness among the users. This paper addresses and formalizes the problem of joint objective optimization, where not only the sum of rewards of each agent but a function of the sum of rewards of each agent needs to be optimized. The proposed algorithms at the centralized controller aim to learn the policy that dictates the actions for each agent such that the joint objective function based on average per-step rewards of each agent is maximized. We propose both model-based and model-free algorithms, where the model-based algorithm is shown to achieve a $\tilde{O}(\sqrt{\frac{K}{T}})$ regret bound for $K$ agents over a time-horizon $T$, and the model-free algorithm can be implemented using deep neural networks. Further, using fairness in cellular base-station scheduling as an example, the proposed algorithms are shown to significantly outperform the state-of-the-art approaches.
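A small numerical sketch shows why a non-linear joint objective differs from a plain sum of rewards. It assumes proportional fairness (the sum of logarithms of each agent's average per-step reward) as the joint objective. That specific function, the two schedules, and the reward values are illustrative assumptions and are not taken from the paper.

```python
import math

def average_rewards(reward_history):
    """Average per-step reward of each agent.

    reward_history[t][k] is the reward agent k received at step t.
    """
    steps = len(reward_history)
    agents = len(reward_history[0])
    return [sum(step[k] for step in reward_history) / steps for k in range(agents)]

def proportional_fairness(avg_rewards):
    """Joint objective: sum of log average rewards (requires positive averages)."""
    return sum(math.log(r) for r in avg_rewards)

# Two hypothetical schedules for two users over four steps.
favor_one = [[3.0, 0.5]] * 4                  # always favors user 0
alternate = [[2.0, 1.0], [1.0, 2.0]] * 2      # takes turns

for name, schedule in [("favor_one", favor_one), ("alternate", alternate)]:
    avg = average_rewards(schedule)
    print(name, sum(avg), proportional_fairness(avg))
```

The schedule that favors one user has the larger total reward (3.5 versus 3.0) but the lower fairness objective, so maximizing the sum of rewards and maximizing the joint objective pick different schedules. The objective is also a function of the averages over the whole horizon rather than a sum of per-step terms, which is the property the abstract says prevents direct use of dynamic programming.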