Agents
Bribery and Control in Stable Marriage
Boehmer, Niclas | Bredereck, Robert (HU Berlin) | Heeger, Klaus (TU Berlin) | Niedermeier, Rolf (TU Berlin)
We initiate the study of external manipulations in Stable Marriage by considering several manipulative actions as well as several manipulation goals. For instance, one goal is to make sure that a given pair of agents is matched in a stable solution, and this may be achieved by the manipulative action of reordering some agents' preference lists. We present a comprehensive study of the computational complexity of all problems arising in this way. We find several polynomial-time solvable cases as well as NP-hard ones. For the NP-hard cases, focusing on the natural parameter "budget" (that is, the number of manipulative actions one is allowed to perform), we also conduct a parameterized complexity analysis and encounter mostly parameterized hardness results.
Quantum adaptive agents with efficient long-term memories
Elliott, Thomas J., Gu, Mile, Garner, Andrew J. P., Thompson, Jayne
Central to the success of adaptive systems is their ability to interpret signals from their environment and respond accordingly -- they act as agents interacting with their surroundings. Such agents typically perform better when able to execute increasingly complex strategies. This comes with a cost: the more information the agent must recall from its past experiences, the more memory it will need. Here we investigate the power of agents capable of quantum information processing. We uncover the most general form a quantum agent need adopt to maximise memory compression advantages, and provide a systematic means of encoding their memory states. We show these encodings can exhibit extremely favourable scaling advantages relative to memory-minimal classical agents when information must be retained about events increasingly far into the past.
A Simplicial Model for $KB4_n$: Epistemic Logic with Agents that May Die
Goubault, Eric, Ledent, Jérémy, Rajsbaum, Sergio
The standard semantics of multi-agent epistemic logic $S5$ is based on Kripke models whose accessibility relations are reflexive, symmetric and transitive. This one dimensional structure contains implicit higher-dimensional information beyond pairwise interactions, that has been formalized as pure simplicial models in previous work from the authors. Here we extend the theory to encompass all simplicial models - including the ones that are not pure. The corresponding Kripke models are those where the accessibility relation is symmetric and transitive, but might not be reflexive. This yields the epistemic logic $KB4$ which can reason about situations where some of the agents may die.
Embodied AI-Driven Operation of Smart Cities: A Concise Review
Shenavarmasouleh, Farzan, Mohammadi, Farid Ghareh, Amini, M. Hadi, Arabnia, Hamid R.
A smart city can be seen as a framework, comprised of Information and Communication Technologies (ICT). An intelligent network of connected devices that collect data with their sensors and transmit them using cloud technologies in order to communicate with other assets in the ecosystem plays a pivotal role in this framework. Maximizing the quality of life of citizens, making better use of resources, cutting costs, and improving sustainability are the ultimate goals that a smart city is after. Hence, data collected from connected devices will continuously get thoroughly analyzed to gain better insights into the services that are being offered across the city; with this goal in mind that they can be used to make the whole system more efficient. Robots and physical machines are inseparable parts of a smart city. Embodied AI is the field of study that takes a deeper look into these and explores how they can fit into real-world environments. It focuses on learning through interaction with the surrounding environment, as opposed to Internet AI which tries to learn from static datasets. Embodied AI aims to train an agent that can See (Computer Vision), Talk (NLP), Navigate and Interact with its environment (Reinforcement Learning), and Reason (General Intelligence), all at the same time. Autonomous driving cars and personal companions are some of the examples that benefit from Embodied AI nowadays. In this paper, we attempt to do a concise review of this field. We will go through its definitions, its characteristics, and its current achievements along with different algorithms, approaches, and solutions that are being used in different components of it (e.g. Vision, NLP, RL). We will then explore all the available simulators and 3D interactable databases that will make the research in this area feasible. Finally, we will address its challenges and identify its potentials for future research.
Mixed Reality using Illumination-aware Gradient Mixing in Surgical Telepresence: Enhanced Multi-layer Visualization
Puri, Nirakar, Alsadoon, Abeer, Prasad, P. W. C., Alsalami, Nada, Rashid, Tarik A.
Background and aim: Surgical telepresence using augmented perception has been applied, but mixed reality is still being researched and is only theoretical. The aim of this work is to propose a solution to improve the visualization in the final merged video by producing globally consistent videos when the intensity of illumination in the input source and target video varies. Methodology: The proposed system uses an enhanced multi-layer visualization with illumination-aware gradient mixing using Illumination Aware Video Composition algorithm. Particle Swarm Optimization Algorithm is used to find the best sample pair from foreground and background region and image pixel correlation to estimate the alpha matte. Particle Swarm Optimization algorithm helps to get the original colour and depth of the unknown pixel in the unknown region. Result: Our results showed improved accuracy caused by reducing the Mean squared Error for selecting the best sample pair for unknown region in 10 each sample for bowel, jaw and breast. The amount of this reduction is 16.48% from the state of art system. As a result, the visibility accuracy is improved from 89.4 to 97.7% which helped to clear the hand vision even in the difference of light. Conclusion: Illumination effect and alpha pixel correlation improves the visualization accuracy and produces a globally consistent composition results and maintains the temporal coherency when compositing two videos with high and inverse illumination effect. In addition, this paper provides a solution for selecting the best sampling pair for the unknown region to obtain the original colour and depth.
Learning Causal Models of Autonomous Agents using Interventions
Verma, Pulkit, Srivastava, Siddharth
One of the several obstacles in the widespread use of AI systems is the lack of requirements of interpretability that can enable a layperson to ensure the safe and reliable behavior of such systems. We extend the analysis of an agent assessment module that lets an AI system execute high-level instruction sequences in simulators and answer the user queries about its execution of sequences of actions. We show that such a primitive query-response capability is sufficient to efficiently derive a user-interpretable causal model of the system in stationary, fully observable, and deterministic settings. We also introduce dynamic causal decision networks (DCDNs) that capture the causal structure of STRIPS-like domains. A comparative analysis of different classes of queries is also presented in terms of the computational requirements needed to answer them and the efforts required to evaluate their responses to learn the correct model.
A Multi-Task Learning Framework for COVID-19 Monitoring and Prediction of PPE Demand in Community Health Centres
Molokwu, Bonaventure Chidube, Shuvo, Shaon Bhatta, Kobti, Ziad, Snowdon, Anne
Currently, the world seeks to find appropriate mitigation techniques to control and prevent the spread of the new SARS-CoV-2. In our paper herein, we present a peculiar Multi-Task Learning framework that jointly predicts the effect of SARS-CoV-2 as well as Personal-Protective-Equipment consumption in Community Health Centres for a given populace. Predicting the effect of the virus (SARS-CoV-2), via studies and analyses, enables us to understand the nature of SARS-CoV- 2 with reference to factors that promote its growth and spread. Therefore, these foster widespread awareness; and the populace can become more proactive and cautious so as to mitigate the spread of Corona Virus Disease 2019 (COVID- 19). Furthermore, understanding and predicting the demand for Personal Protective Equipment promotes the efficiency and safety of healthcare workers in Community Health Centres. Owing to the novel nature and strains of SARS-CoV-2, relatively few literature and research exist in this regard. These existing literature have attempted to solve the problem statement(s) using either Agent-based Models, Machine Learning Models, or Mathematical Models. In view of this, our work herein adds to existing literature via modeling our problem statements as Multi- Task Learning problems. Results from our research indicate that government actions and human factors are the most significant determinants that influence the spread of SARS-CoV-2.
Settling the Variance of Multi-Agent Policy Gradients
Kuba, Jakub Grudzien, Wen, Muning, Yang, Yaodong, Meng, Linghui, Gu, Shangding, Zhang, Haifeng, Mguni, David Henry, Wang, Jun
Policy gradient (PG) methods are popular reinforcement learning (RL) methods where a baseline is often applied to reduce the variance of gradient estimates. In multi-agent RL (MARL), although the PG theorem can be naturally extended, the effectiveness of multi-agent PG (MAPG) methods degrades as the variance of gradient estimates increases rapidly with the number of agents. In this paper, we offer a rigorous analysis of MAPG methods by, firstly, quantifying the contributions of the number of agents and agents' explorations to the variance of MAPG estimators. Based on this analysis, we derive the optimal baseline (OB) that achieves the minimal variance. In comparison to the OB, we measure the excess variance of existing MARL algorithms such as vanilla MAPG and COMA. Considering using deep neural networks, we also propose a surrogate version of OB, which can be seamlessly plugged into any existing PG methods in MARL. On benchmarks of Multi-Agent MuJoCo and StarCraft challenges, our OB technique effectively stabilises training and improves the performance of multi-agent PPO and COMA algorithms by a significant margin.
On Accelerating Distributed Convex Optimizations
Chakrabarti, Kushal, Gupta, Nirupam, Chopra, Nikhil
This paper studies a distributed multi-agent convex optimization problem. The system comprises multiple agents in this problem, each with a set of local data points and an associated local cost function. The agents are connected to a server, and there is no inter-agent communication. The agents' goal is to learn a parameter vector that optimizes the aggregate of their local costs without revealing their local data points. In principle, the agents can solve this problem by collaborating with the server using the traditional distributed gradient-descent method. However, when the aggregate cost is ill-conditioned, the gradient-descent method (i) requires a large number of iterations to converge, and (ii) is highly unstable against process noise. We propose an iterative pre-conditioning technique to mitigate the deleterious effects of the cost function's conditioning on the convergence rate of distributed gradient-descent. Unlike the conventional pre-conditioning techniques, the pre-conditioner matrix in our proposed technique updates iteratively to facilitate implementation on the distributed network. In the distributed setting, we provably show that the proposed algorithm converges linearly with an improved rate of convergence than the traditional and adaptive gradient-descent methods. Additionally, for the special case when the minimizer of the aggregate cost is unique, our algorithm converges superlinearly. We demonstrate our algorithm's superior performance compared to prominent distributed algorithms for solving real logistic regression problems and emulating neural network training via a noisy quadratic model, thereby signifying the proposed algorithm's efficiency for distributively solving non-convex optimization. Moreover, we empirically show that the proposed algorithm results in faster training without compromising the generalization performance.
On the Opportunities and Risks of Foundation Models
Bommasani, Rishi, Hudson, Drew A., Adeli, Ehsan, Altman, Russ, Arora, Simran, von Arx, Sydney, Bernstein, Michael S., Bohg, Jeannette, Bosselut, Antoine, Brunskill, Emma, Brynjolfsson, Erik, Buch, Shyamal, Card, Dallas, Castellon, Rodrigo, Chatterji, Niladri, Chen, Annie, Creel, Kathleen, Davis, Jared Quincy, Demszky, Dora, Donahue, Chris, Doumbouya, Moussa, Durmus, Esin, Ermon, Stefano, Etchemendy, John, Ethayarajh, Kawin, Fei-Fei, Li, Finn, Chelsea, Gale, Trevor, Gillespie, Lauren, Goel, Karan, Goodman, Noah, Grossman, Shelby, Guha, Neel, Hashimoto, Tatsunori, Henderson, Peter, Hewitt, John, Ho, Daniel E., Hong, Jenny, Hsu, Kyle, Huang, Jing, Icard, Thomas, Jain, Saahil, Jurafsky, Dan, Kalluri, Pratyusha, Karamcheti, Siddharth, Keeling, Geoff, Khani, Fereshte, Khattab, Omar, Kohd, Pang Wei, Krass, Mark, Krishna, Ranjay, Kuditipudi, Rohith, Kumar, Ananya, Ladhak, Faisal, Lee, Mina, Lee, Tony, Leskovec, Jure, Levent, Isabelle, Li, Xiang Lisa, Li, Xuechen, Ma, Tengyu, Malik, Ali, Manning, Christopher D., Mirchandani, Suvir, Mitchell, Eric, Munyikwa, Zanele, Nair, Suraj, Narayan, Avanika, Narayanan, Deepak, Newman, Ben, Nie, Allen, Niebles, Juan Carlos, Nilforoshan, Hamed, Nyarko, Julian, Ogut, Giray, Orr, Laurel, Papadimitriou, Isabel, Park, Joon Sung, Piech, Chris, Portelance, Eva, Potts, Christopher, Raghunathan, Aditi, Reich, Rob, Ren, Hongyu, Rong, Frieda, Roohani, Yusuf, Ruiz, Camilo, Ryan, Jack, Ré, Christopher, Sadigh, Dorsa, Sagawa, Shiori, Santhanam, Keshav, Shih, Andy, Srinivasan, Krishnan, Tamkin, Alex, Taori, Rohan, Thomas, Armin W., Tramèr, Florian, Wang, Rose E., Wang, William, Wu, Bohan, Wu, Jiajun, Wu, Yuhuai, Xie, Sang Michael, Yasunaga, Michihiro, You, Jiaxuan, Zaharia, Matei, Zhang, Michael, Zhang, Tianyi, Zhang, Xikun, Zhang, Yuhui, Zheng, Lucia, Zhou, Kaitlyn, Liang, Percy
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles(e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities,and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.