Goto

Collaborating Authors

 Agents


Playing the Beer Game Using Reinforcement Learning

@machinelearnbot

The beer game is a widely used in-class game that is played in supply chain management classes to demonstrate a phenomenon known as the bullwhip effect. The game consists of a serial supply chain network with four players--a retailer, a wholesaler, a distributor, and a manufacturer. In each period of the game, the retailer experiences a random demand from customers. Then the four players each decide how much inventory of "beer" to order. The retailer orders from the wholesaler, the wholesaler orders from the distributor, the distributor from the manufacturer, and the manufacturer orders from an external supplier that is not a player in the game.


Convergence analysis of belief propagation for pairwise linear Gaussian models

arXiv.org Machine Learning

Gaussian belief propagation (BP) has been widely used for distributed inference in large-scale networks such as the smart grid, sensor networks, and social networks, where local measurements/observations are scattered over a wide geographical area. One particular case is when two neighboring agents share a common observation. For example, to estimate voltage in the direct current (DC) power flow model, the current measurement over a power line is proportional to the voltage difference between two neighboring buses. When applying the Gaussian BP algorithm to this type of problem, the convergence condition remains an open issue. In this paper, we analyze the convergence properties of Gaussian BP for this pairwise linear Gaussian model. We show analytically that the updating information matrix converges at a geometric rate to a unique positive definite matrix with arbitrary positive semidefinite initial value and further provide the necessary and sufficient convergence condition for the belief mean vector to the optimal estimate.


Model Training with Yufeng Guo

#artificialintelligence

Play in new window Download Multiagent systems involve the interaction of autonomous agents that may be acting independently or in collaboration with each other. Examples of these systems include financial markets, robot soccer matches, and automated warehouses. Today's guest Peter Stone is a professor of computer science who specializies in multiagent systems and robotics.


Gaussian Process Decentralized Data Fusion Meets Transfer Learning in Large-Scale Distributed Cooperative Perception

arXiv.org Machine Learning

This paper presents novel Gaussian process decentralized data fusion algorithms exploiting the notion of agent-centric support sets for distributed cooperative perception of large-scale environmental phenomena. To overcome the limitations of scale in existing works, our proposed algorithms allow every mobile sensing agent to choose a different support set and dynamically switch to another during execution for encapsulating its own data into a local summary that, perhaps surprisingly, can still be assimilated with the other agents' local summaries (i.e., based on their current choices of support sets) into a globally consistent summary to be used for predicting the phenomenon. To achieve this, we propose a novel transfer learning mechanism for a team of agents capable of sharing and transferring information encapsulated in a summary based on a support set to that utilizing a different support set with some loss that can be theoretically bounded and analyzed. To alleviate the issue of information loss accumulating over multiple instances of transfer learning, we propose a new information sharing mechanism to be incorporated into our algorithms in order to achieve memory-efficient lazy transfer learning. Empirical evaluation on real-world datasets show that our algorithms outperform the state-of-the-art methods.


Random gradient extrapolation for distributed and stochastic optimization

arXiv.org Machine Learning

In this paper, we consider a class of finite-sum convex optimization problems defined over a distributed multiagent network with $m$ agents connected to a central server. In particular, the objective function consists of the average of $m$ ($\ge 1$) smooth components associated with each network agent together with a strongly convex term. Our major contribution is to develop a new randomized incremental gradient algorithm, namely random gradient extrapolation method (RGEM), which does not require any exact gradient evaluation even for the initial point, but can achieve the optimal ${\cal O}(\log(1/\epsilon))$ complexity bound in terms of the total number of gradient evaluations of component functions to solve the finite-sum problems. Furthermore, we demonstrate that for stochastic finite-sum optimization problems, RGEM maintains the optimal ${\cal O}(1/\epsilon)$ complexity (up to a certain logarithmic factor) in terms of the number of stochastic gradient computations, but attains an ${\cal O}(\log(1/\epsilon))$ complexity in terms of communication rounds (each round involves only one agent). It is worth noting that the former bound is independent of the number of agents $m$, while the latter one only linearly depends on $m$ or even $\sqrt m$ for ill-conditioned problems. To the best of our knowledge, this is the first time that these complexity bounds have been obtained for distributed and stochastic optimization problems. Moreover, our algorithms were developed based on a novel dual perspective of Nesterov's accelerated gradient method.


"Dave...I can assure you...that it's going to be all right..." -- A definition, case for, and survey of algorithmic assurances in human-autonomy trust relationships

arXiv.org Machine Learning

As technology becomes more advanced, those who design, use and are otherwise affected by it want to know that it will perform correctly, and understand why it does what it does, and how to use it appropriately. In essence they want to be able to trust the systems that are being designed. In this survey we present assurances that are the method by which users can understand how to trust autonomous systems. Trust between humans and autonomy is reviewed, and the implications for the design of assurances are highlighted. A survey of existing research related to assurances is presented. Much of the surveyed research originates from fields such as interpretable, comprehensible, transparent, and explainable machine learning, as well as human-computer interaction, human-robot interaction, and e-commerce. Several key ideas are extracted from this work in order to refine the definition of assurances. The design of assurances is found to be highly dependent not only on the capabilities of the autonomous system, but on the characteristics of the human user, and the appropriate trust-related behaviors. Several directions for future research are identified and discussed.


Multi-Advisor Reinforcement Learning

arXiv.org Artificial Intelligence

We consider tackling a single-agent RL problem by distributing it to $n$ learners. These learners, called advisors, endeavour to solve the problem from a different focus. Their advice, taking the form of action values, is then communicated to an aggregator, which is in control of the system. We show that the local planning method for the advisors is critical and that none of the ones found in the literature is flawless: the egocentric planning overestimates values of states where the other advisors disagree, and the agnostic planning is inefficient around danger zones. We introduce a novel approach called empathic and discuss its theoretical aspects. We empirically examine and validate our theoretical findings on a fruit collection task.


Brennus Analytics: finding the right price ParisTech Entrepreneurs

#artificialintelligence

Setting the price of a product can be a real headache for businesses. It is however a crucial stage which can determine the success or failure of an entire commercial strategy. If the price is too high, customers won't buy the product. Too low, and the obtained margin is too weak to guarantee sufficient revenues. In order to help businesses find the right price, the start-up Brennus Analytics, incubated at ParisTech Entrepreneurs, proposes a software making artificial intelligence technology accessible for businesses.


Has AI changed the SEO industry for better or worse?

#artificialintelligence

With Google turning to artificial intelligence to power its flagship search engine business, has the SEO industry been left in the dust? The old ways of testing and measuring are becoming antiquated, and industry insiders are scrambling to understand something new -- something which is more advanced than their backgrounds typically permit. The fact is, even Google engineers are having a hard time explaining how Google works anymore. With this in mind, is artificial intelligence changing the SEO industry for better or worse? And has Google's once-understood algorithm become a "runaway algorithm?"


Cooperative Group Optimization System

#artificialintelligence

The cooperative group optimization (CGO) system consists of a group of intelligent agents cooperating with their peers in a sharing environment for realizing a common intention of finding high-quality solution(s) based on the landscape representation of an optimization task. CGO has also been applied on numerical optimization problem (NOP) to find solutions in high-dimensional nonlinear continuous space. Some algorithms, including Dissipative Particle Swarm Optimization (DPSO), Differential Evolution (DE), Social Cognitive Optimization (SCO), Genetic Algorithms (GA), and Electromagnetism-like Mechanism (EM) Heuristic, etc, and their hybrids (e.g., DEPSO), could be easily implemented into CGO. Both SCO and DEPSO have been incorporated into the NLPSolver extension of Calc in Apache Office. DEPSO was used for finding narrow admissible k-tuples.