Agents
Artificial Intelligence and Humans need to work together
Artificial intelligence (AI) is becoming more influential in our everyday lives, dictating what news we see in our social media feeds, transforming how we commute to work, and even improving the odds of early disease detection. While the benefits of AI have been covered thoroughly, so have the potential negative consequences. Algorithmic bias, a changing employment landscape, and even the collapse of society at the hands of autonomous systems have been and continue to be debated. These are important issues, but they are missing a key component. The cascading effects of the AI revolution, and how it affects our world, depend largely on how well AI and humans learn to work together.
Resource-bounded Norm Monitoring In Multi-agent Systems
Norms allow system designers to specify the desired behaviour of a sociotechnical system. In this way, norms regulate what the social and technical agents in a sociotechnical system should (not) do. In this context, a vitally important question is the development of mechanisms for monitoring whether these agents comply with norms. Proposals on norm monitoring often assume that monitoring has no costs and/or that monitors have unlimited resources to observe the environment and the actions performed by agents. In this paper, we challenge this assumption and propose the first practical resource-bounded norm monitor. Our monitor is capable of selecting the resources to be deployed and using them to check norm compliance with incomplete information about the actions performed and the state of the world. We formally demonstrate the correctness and soundness of our norm monitor and study its complexity. We also demonstrate in randomised simulations and benchmark experiments that our monitor can select monitored resources effectively and efficiently, detecting more norm violations and fulfilments than other tractable optimization approaches and obtaining only slightly worse results than intractable optimal approaches.
On Consensus-Optimality Trade-offs in Collaborative Deep Learning
Jiang, Zhanhong, Balu, Aditya, Hegde, Chinmay, Sarkar, Soumik
In distributed machine learning, where agents collaboratively learn from diverse private data sets, there is a fundamental tension between consensus and optimality. In this paper, we build on recent algorithmic progress in distributed deep learning to explore various consensus-optimality trade-offs over a fixed communication topology. First, we propose the incremental consensus-based distributed SGD (i-CDSGD) algorithm, which involves multiple consensus steps (where each agent communicates information with its neighbors) within each SGD iteration. Second, we propose the generalized consensus-based distributed SGD (g-CDSGD) algorithm that enables us to navigate the full spectrum from complete consensus (all agents agree) to complete disagreement (each agent converges to individual model parameters). We analytically establish convergence of the proposed algorithms for strongly convex and nonconvex objective functions; we also analyze the momentum variants of the algorithms for the strongly convex case. We support our algorithms via numerical experiments, and demonstrate significant improvements over existing methods for collaborative deep learning.
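The idea of running several consensus steps inside each SGD iteration can be illustrated with a toy sketch. This is not the authors' i-CDSGD implementation: the scalar quadratic objectives, the exact (non-stochastic) gradients, the mixing matrix, and the names `consensus_step` and `run_consensus_sgd` are all assumptions made for illustration.

```python
def consensus_step(params, mixing):
    """One round of neighbor averaging using a row of mixing weights per agent."""
    n = len(params)
    return [sum(mixing[i][j] * params[j] for j in range(n)) for i in range(n)]


def run_consensus_sgd(targets, mixing, tau, lr, iterations):
    """Toy consensus-based gradient descent (hypothetical helper).

    Agent i privately minimizes 0.5 * (x - targets[i]) ** 2.
    Each iteration applies `tau` consensus steps, then a local gradient step.
    Exact gradients are used so the run is deterministic.
    """
    params = [0.0] * len(targets)
    for _ in range(iterations):
        # Local gradients, evaluated at each agent's current parameter.
        grads = [p - t for p, t in zip(params, targets)]
        # Multiple consensus steps within a single iteration.
        mixed = params
        for _ in range(tau):
            mixed = consensus_step(mixed, mixing)
        params = [m - lr * g for m, g in zip(mixed, grads)]
    return params


# Assumed doubly stochastic mixing matrix for three fully connected agents.
mixing = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
targets = [0.0, 3.0, 6.0]

one_step = run_consensus_sgd(targets, mixing, tau=1, lr=0.1, iterations=500)
five_steps = run_consensus_sgd(targets, mixing, tau=5, lr=0.1, iterations=500)
```

In this toy setting, more consensus steps per iteration pull the agents' parameters closer together, while the average across agents still settles at the minimizer of the summed objective. That is one small, concrete view of the consensus-optimality tension the abstract describes.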
Fourier Policy Gradients
Fellows, Matthew, Ciosek, Kamil, Whiteson, Shimon
We propose a new way of deriving policy gradient updates for reinforcement learning. Our technique, based on Fourier analysis, recasts integrals that arise with expected policy gradients as convolutions and turns them into multiplications. The obtained analytical solutions allow us to capture the low variance benefits of EPG in a broad range of settings. For the critic, we treat trigonometric and radial basis functions, two function families with the universal approximation property. The choice of policy can be almost arbitrary, including mixtures or hybrid continuous-discrete probability distributions. Moreover, we derive a general family of sample-based estimators for stochastic policy gradients, which unifies existing results on sample-based approximation. We believe that this technique has the potential to shape the next generation of policy gradient approaches, powered by analytical results.
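The step from convolutions to multiplications rests on the convolution theorem. The sketch below checks a discrete, circular analogue of that theorem numerically; it is not the paper's derivation, and the helper names `dft` and `circular_convolution` and the sample sequences are assumptions for illustration.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_convolution(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]


a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 0.0, -1.0, 2.0]

# Transform of the convolution versus pointwise product of the transforms.
lhs = dft(circular_convolution(a, b))
rhs = [x * y for x, y in zip(dft(a), dft(b))]
max_error = max(abs(l - r) for l, r in zip(lhs, rhs))
```

Here `max_error` stays at floating-point rounding level, which is the discrete counterpart of turning a convolution integral into a product in the Fourier domain.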
A COP Model For Graph-Constrained Coalition Formation
Bistaffa, Filippo, Farinelli, Alessandro
We consider Graph-Constrained Coalition Formation (GCCF), a widely studied subproblem of coalition formation in which the set of valid coalitions is restricted by a graph. We propose COP-GCCF, a novel approach that models GCCF as a COP, and we solve this COP with a highly parallel approach based on Bucket Elimination executed on the GPU, which is able to exploit the high constraint tightness of COP-GCCF. Results show that our approach outperforms state-of-the-art algorithms (i.e., DyCE and IDPG) by at least one order of magnitude on realistic graphs, i.e., a crawl of the Twitter social graph, both in terms of runtime and memory.
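To make the graph restriction concrete, the sketch below assumes the usual reading of GCCF, in which a coalition is valid only if its members induce a connected subgraph. The abstract does not spell this out, so treat it as an assumption; the brute-force enumeration and the names `is_connected` and `feasible_coalitions` are illustrative and unrelated to the COP-GCCF solver.

```python
from itertools import combinations


def is_connected(coalition, edges):
    """Return True if the coalition induces a connected subgraph."""
    members = set(coalition)
    if not members:
        return False
    # Keep only edges with both endpoints inside the coalition.
    adjacency = {agent: set() for agent in members}
    for u, v in edges:
        if u in members and v in members:
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Depth-first search from an arbitrary member.
    start = next(iter(members))
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append(neighbor)
    return seen == members


def feasible_coalitions(agents, edges):
    """Enumerate every valid coalition by brute force (small graphs only)."""
    return [
        set(group)
        for size in range(1, len(agents) + 1)
        for group in combinations(agents, size)
        if is_connected(group, edges)
    ]


# Path graph 1 - 2 - 3: agents 1 and 3 cannot form a coalition without 2.
path_edges = [(1, 2), (2, 3)]
feasible = feasible_coalitions([1, 2, 3], path_edges)
```

Brute-force enumeration grows exponentially with the number of agents, which is why specialized solvers are needed on graphs of realistic size.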
OracleVoice: Top 5 Industry Early Adopters Of Autonomous Systems
Automation has already transformed industries in which complexity and performance demands must meet the challenges of scarcer resources, narrower profit margins, and expanding product volumes. Now the state of the art is beginning to move to autonomous technologies: driverless vehicles, self-tuning databases, adaptive robots, and the like. While automation involves programming a system to perform specific tasks, autonomous systems are programmed to perform automated tasks, accommodate variation, and self-correct or self-learn with little or no human intervention. Which industries are ahead of the autonomous curve? These five industries stand out.
Verification of Distributed Epistemic Gossip Protocols
Apt, Krzysztof R., Wojtczak, Dominik
Gossip protocols aim at arriving, by means of point-to-point or group communications, at a situation in which all the agents know each other's secrets. Distributed epistemic gossip protocols use as guards formulas from a simple epistemic logic and as statements calls between the agents. They are natural examples of knowledge-based programs. We prove here that these protocols are implementable, that their partial correctness is decidable, and that termination and two forms of fair termination of these protocols are decidable as well. To establish these results we show that the definition of semantics and of truth of the underlying logic are decidable.
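A small simulation can make the guard-and-call structure concrete. The sketch below uses a simple "call someone whose secret you do not yet know" guard with a deterministic scheduler; it is a simplified illustration, not the logic or the protocols analysed in the paper, and the function name `run_learn_new_secrets` is hypothetical.

```python
def run_learn_new_secrets(n):
    """Simulate a toy gossip protocol among n agents (hypothetical helper).

    knows[i] is the set of agents whose secrets agent i currently knows.
    Guard: agent i may call agent j only if i does not yet know j's secret.
    Statement: a call makes both agents learn the union of their secrets.
    """
    knows = [{i} for i in range(n)]
    calls = []
    while True:
        enabled = [
            (i, j) for i in range(n) for j in range(n) if j not in knows[i]
        ]
        if not enabled:
            # No guard holds, so every agent knows every secret.
            return knows, calls
        i, j = enabled[0]  # deterministic scheduler, chosen only for this sketch
        merged = knows[i] | knows[j]
        knows[i] = merged
        knows[j] = set(merged)
        calls.append((i, j))


final_knowledge, call_sequence = run_learn_new_secrets(4)
```

Each call teaches the caller at least one new secret, so this toy protocol always terminates, and it can only stop once every agent knows all secrets. Properties like these are what the decidability results in the abstract concern for a much richer class of guards.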
Learning to Play General Video-Games via an Object Embedding Network
Deep reinforcement learning (DRL) has proven to be an effective tool for creating general video-game AI. However, most current DRL video-game agents learn end-to-end from the video output of the game, which is superfluous for many applications and creates a number of additional problems. More importantly, directly working on pixel-based raw video data is substantially distinct from what a human player does. In this paper, we present a novel method which enables DRL agents to learn directly from object information. This is obtained via use of an object embedding network (OEN) that compresses a set of object feature vectors of different lengths into a single fixed-length unified feature vector representing the current game-state and fulfills the DRL simultaneously. We evaluate our OEN-based DRL agent by comparing to several state-of-the-art approaches on a selection of games from the GVG-AI Competition. Experimental results suggest that our object-based DRL agent yields performance comparable to that of those approaches used in our comparative study.
Object-Oriented Dynamics Predictor
Zhu, Guangxiang, Zhang, Chongjie
Generalization has been one of the major challenges for learning dynamics models in model-based reinforcement learning. However, previous work on action-conditioned dynamics prediction focuses on learning the pixel-level motion and thus does not generalize well to novel environments with different object layouts. In this paper, we present a novel object-oriented framework, called object-oriented dynamics predictor (OODP), which decomposes the environment into objects and predicts the dynamics of objects conditioned on both actions and object-to-object relations. It is an end-to-end neural network and can be trained in an unsupervised manner. To enable the generalization ability of dynamics learning, we design a novel CNN-based relation mechanism that is class-specific (rather than object-specific) and exploits the locality principle. Empirical results show that OODP significantly outperforms previous methods in terms of generalization over novel environments with various object layouts. OODP is able to learn from very few environments and accurately predict dynamics in a large number of unseen environments. In addition, OODP learns semantically and visually interpretable dynamics models.
Banking on Bots: How Virtual Agents and Robo-Advisors are Disrupting Financial Services
"To remain competitive, these large banks will have to adapt their traditional services by incorporating more robotics in banking that will attract more tech-savvy customers." The unprecedented popularity of messaging platforms, such as Facebook Messenger, WhatsApp, and WeChat, can be seen across all geographies, demographics, and psychographics. Messaging-based first-line engagements that occur on these platforms, as conducted with chatbots, have become the premier choice for consumers engaging with brands. With the ongoing advancements in artificial intelligence – enabling these virtual agents to better understand and address customer requests – chatbot adoption is quickly growing across multiple industries. According to Gartner, by 2020, 85% of customer interactions will be managed without any human intervention.