Agents
Core Challenges in Embodied Vision-Language Planning
Francis, Jonathan (Carnegie Mellon University) | Kitamura, Nariaki (Carnegie Mellon University) | Labelle, Felix (Carnegie Mellon University) | Lu, Xiaopeng (Carnegie Mellon University) | Navarro, Ingrid (Carnegie Mellon University) | Oh, Jean
Recent advances in the areas of multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI. Whereas many approaches and previous survey pursuits have characterised one or two of these dimensions, there has not been a holistic analysis at the center of all three. Moreover, even when combinations of these topics are considered, more focus is placed on describing, e.g., current architectural methods, as opposed to also illustrating high-level challenges and opportunities for the field. In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. We propose a taxonomy to unify these tasks and provide an in-depth analysis and comparison of the new and current algorithmic approaches, metrics, simulated environments, as well as the datasets used for EVLP tasks. Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment.
Implementing Particle Swarm Optimization in Tensorflow
Originally published on Towards AI.
Private and Byzantine-Proof Cooperative Decision-Making
Dubey, Abhimanyu, Pentland, Alex
The cooperative bandit problem is a multi-agent decision problem involving a group of agents that interact simultaneously with a multi-armed bandit, while communicating over a network with delays. The central idea in this problem is to design algorithms that can efficiently leverage communication to obtain improvements over acting in isolation. In this paper, we investigate the stochastic bandit problem under two settings - (a) when the agents wish to make their communication private with respect to the action sequence, and (b) when the agents can be byzantine, i.e., they provide (stochastically) incorrect information. For both these problem settings, we provide upper-confidence bound algorithms that obtain optimal regret while being (a) differentially-private and (b) tolerant to byzantine agents. Our decentralized algorithms require no information about the network of connectivity between agents, making them scalable to large dynamic systems. We test our algorithms on a competitive benchmark of random graphs and demonstrate their superior performance with respect to existing robust algorithms. We hope that our work serves as an important step towards creating distributed decision-making systems that maintain privacy.
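For readers unfamiliar with upper-confidence bound methods, the following is a minimal single-agent UCB1 sketch on Bernoulli arms. It is not the authors' algorithm: it has no communication, privacy mechanism, or byzantine tolerance, and the function name `ucb1`, the arm means, and the exploration constant are assumptions chosen for illustration only.

```python
import math
import random


def ucb1(arm_means, rounds, seed=0):
    """Run plain UCB1 on Bernoulli arms and return how often each arm was pulled.

    arm_means are hypothetical success probabilities, unknown to the learner.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    totals = [0.0] * k    # sum of observed rewards per arm

    for t in range(1, rounds + 1):
        if t <= k:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus confidence bonus.
            arm = max(
                range(k),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward

    return counts


if __name__ == "__main__":
    print(ucb1([0.2, 0.5, 0.8], rounds=2000))
```

The confidence bonus shrinks as an arm is pulled more often, so the learner gradually concentrates on the arm with the highest empirical mean. In the cooperative setting described above, each agent would additionally fold in (possibly delayed, noisy, or corrupted) statistics received from its neighbours.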
Is diversity the key to collaboration? New AI research suggests so
As artificial intelligence gets better at performing tasks once solely in the hands of humans, like driving cars, many see teaming intelligence as a next frontier. In this future, humans and AI are true partners in high-stakes jobs, such as performing complex surgery or defending from missiles. But before teaming intelligence can take off, researchers must overcome a problem that corrodes cooperation: humans often do not like or trust their AI partners. MIT Lincoln Laboratory researchers have found that training an AI model with mathematically "diverse" teammates improves its ability to collaborate with other AI it has never worked with before, in the card game Hanabi. Moreover, both Facebook and Google's DeepMind concurrently published independent work that also infused diversity into training to improve outcomes in human-AI collaborative games.
Understanding Agent Environment in AI - KDnuggets
Before starting the article, it is important to understand what an agent in AI is. An agent is an entity that helps an AI, machine learning, or deep reinforcement learning system make a decision, or that triggers the system to make one. In software terms, it is an entity that can make decisions and change them in response to changes in the environment or to input from the external environment. In simpler words, the more quickly the agent perceives an external change and acts on it, the better the results obtained from the model. Hence the role of the agent is always very important in artificial intelligence, machine learning, and deep learning.
Agent-based model using GPS analysis for infection spread and inhibition mechanism of SARS-CoV-2 in Tokyo
Murakami, Taishu, Sakuragi, Shunsuke, Deguchi, Hiroshi, Nakata, Masaru
Analyzing the SARS-CoV-2 pandemic outbreak based on actual data while reflecting the characteristics of the real city provides beneficial information for taking reasonable infection control measures in the future. We demonstrate agent-based modeling for Tokyo based on GPS information and official national statistics and perform a spatiotemporal analysis of the infection situation in Tokyo. As a result of the simulation during the first wave of SARS-CoV-2 in Tokyo using real GPS data, the infection occurred in the service industry, such as restaurants, in the city center, and then the infected people brought back the virus to the residential area; the infection spread in each area in Tokyo. This phenomenon clarifies that the spread of infection can be curbed by suppressing going out or strengthening infection prevention measures in service facilities. It was shown that pandemic measures in Tokyo could be achieved not only by strong control, such as the lockdown of cities, but also by thorough infection prevention measures in service facilities, which explains the curb phenomena in real Tokyo.
Ordinal Maximin Share Approximation for Goods
Hosseini, Hadi (Pennsylvania State University, Pennsylvania) | Searns, Andrew (Haley Marketing, Williamsville, New York) | Segal-Halevi, Erel
In fair division of indivisible goods, ℓ-out-of-d maximin share (MMS) is the value that an agent can guarantee by partitioning the goods into d bundles and choosing the ℓ least preferred bundles. Most existing works aim to guarantee to all agents a constant fraction of their 1-out-of-n MMS. But this guarantee is sensitive to small perturbation in agents' cardinal valuations. We consider a more robust approximation notion, which depends only on the agents' ordinal rankings of bundles. We prove the existence of ℓ-out-of-⌊(ℓ + 1/2)n⌋ MMS allocations of goods for any integer ℓ ≥ 1, and present a polynomial-time algorithm that finds a 1-out-of-⌈3n/2⌉ MMS allocation when ℓ=1. We further develop an algorithm that provides a weaker ordinal approximation to MMS for any ℓ > 1.
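The definition of the l-out-of-d maximin share can be made concrete with a small brute-force sketch. It assumes a single agent with additive, non-negative valuations (an assumption made here for illustration), enumerates every way of placing the goods into d bundles, and keeps the best total value of the l least valuable bundles. The function name `l_out_of_d_mms` and the example values are hypothetical, and the enumeration is exponential in the number of goods, so this only illustrates the definition, not the paper's polynomial-time algorithm.

```python
from itertools import product


def l_out_of_d_mms(values, l, d):
    """Brute-force the l-out-of-d maximin share of one agent.

    values: the agent's (additive, non-negative) value for each good.
    Returns the best achievable total value of the l least valuable bundles
    over all partitions of the goods into d bundles (empty bundles allowed).
    """
    best = 0
    # Each assignment maps good i to one of the d bundles.
    for assignment in product(range(d), repeat=len(values)):
        bundle_values = [0] * d
        for good_value, bundle in zip(values, assignment):
            bundle_values[bundle] += good_value
        # The agent receives the l least preferred bundles of its own partition.
        guaranteed = sum(sorted(bundle_values)[:l])
        best = max(best, guaranteed)
    return best


if __name__ == "__main__":
    goods = [6, 4, 3, 2, 1]
    print(l_out_of_d_mms(goods, l=1, d=2))  # 8
    print(l_out_of_d_mms(goods, l=1, d=3))  # 5
    print(l_out_of_d_mms(goods, l=2, d=3))  # 10
```

With goods valued 6, 4, 3, 2, 1, the 1-out-of-2 share is 8 (bundles {6, 2} and {4, 3, 1}), the 1-out-of-3 share is 5 (bundles {6}, {4, 1}, {3, 2}), and the 2-out-of-3 share is 10, since the two smaller bundles of that same partition sum to 10.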
New framework for cooperative bots aims to mimic high-performing human teams
A Georgia Institute of Technology research group in the School of Interactive Computing has developed a robotics system for collaborative bots that work independently to achieve a shared goal. The system intelligently increases the information shared among the bots and allows for improved cooperation. The aim is to model high-functioning human teams. It also creates resiliency against bad or unreliable team bots that may hinder the overall programmed goal. "Intuitively, the idea behind our new framework -- InfoPG -- is that a robot agent goes back-and-forth on what it thinks it should do with their teammates, and then the teammates will update on what they think is best to do," said Esmaeil Seraj, Ph.D. student in the CORE Robotics Lab and researcher on the project.
lpSpikeCon: Enabling Low-Precision Spiking Neural Network Processing for Efficient Unsupervised Continual Learning on Autonomous Agents
Putra, Rachmad Vidya Wicaksana, Shafique, Muhammad
Recent advances have shown that SNN-based systems can efficiently perform unsupervised continual learning due to their bio-plausible learning rule, e.g., Spike-Timing-Dependent Plasticity (STDP). Such learning capabilities are especially beneficial for use cases like autonomous agents (e.g., robots and UAVs) that need to continuously adapt to dynamically changing scenarios/environments, where new data gathered directly from the environment may have novel features that should be learned online. Current state-of-the-art works employ high-precision weights (i.e., 32 bit) for both training and inference phases, which pose high memory and energy costs thereby hindering efficient embedded implementations of such systems for battery-driven mobile autonomous systems. On the other hand, precision reduction may jeopardize the quality of unsupervised continual learning due to information loss. Towards this, we propose lpSpikeCon, a novel methodology to enable low-precision SNN processing for efficient unsupervised continual learning on resource-constrained autonomous agents/systems. Our lpSpikeCon methodology employs the following key steps: (1) analyzing the impacts of training the SNN model under unsupervised continual learning settings with reduced weight precision on the inference accuracy; (2) leveraging this study to identify SNN parameters that have a significant impact on the inference accuracy; and (3) developing an algorithm for searching the respective SNN parameter values that improve the quality of unsupervised continual learning. The experimental results show that our lpSpikeCon can reduce weight memory of the SNN model by 8x (i.e., by judiciously employing 4-bit weights) for performing online training with unsupervised continual learning and achieve no accuracy loss in the inference phase, as compared to the baseline model with 32-bit weights across different network sizes.
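The 8x figure follows directly from the bit widths: storing each weight in 4 bits instead of 32 divides weight memory by 32 / 4 = 8. The sketch below illustrates this arithmetic together with a plain uniform quantizer. It is not the lpSpikeCon methodology; the weight range `[0, w_max]`, the rounding scheme, and all function names are assumptions made for illustration.

```python
def quantize(weights, bits, w_max=1.0):
    """Map weights in [0, w_max] to integer codes in [0, 2**bits - 1].

    Uniform rounding is an assumed scheme, not the paper's method.
    """
    levels = 2 ** bits - 1
    codes = []
    for w in weights:
        clipped = min(max(w, 0.0), w_max)  # keep the weight inside the range
        codes.append(round(clipped / w_max * levels))
    return codes


def dequantize(codes, bits, w_max=1.0):
    """Recover approximate weight values from integer codes."""
    levels = 2 ** bits - 1
    return [code / levels * w_max for code in codes]


def weight_memory_bits(num_weights, bits):
    """Total storage for the weights, ignoring any metadata overhead."""
    return num_weights * bits


if __name__ == "__main__":
    weights = [0.0, 0.13, 0.5, 0.87, 1.0]
    codes = quantize(weights, bits=4)
    print(codes)
    print(dequantize(codes, bits=4))
    print(weight_memory_bits(1000, 32) / weight_memory_bits(1000, 4))
```

With 4 bits there are only 16 representable levels, so each weight can be off by up to half a quantization step. That information loss is the risk to learning quality that the abstract describes.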
Robust Solutions for Multi-Defender Stackelberg Security Games
Mutzari, Dolev, Aumann, Yonatan, Kraus, Sarit
Multi-defender Stackelberg Security Games (MSSG) have recently gained increasing attention in the literature. However, the solutions offered to date are highly sensitive, wherein even small perturbations in the attacker's utility or slight uncertainties thereof can dramatically change the defenders' resulting payoffs and alter the equilibrium. In this paper, we introduce a robust model for MSSGs, which admits solutions that are resistant to small perturbations or uncertainties in the game's parameters. First, we formally define the notion of robustness, as well as the robust MSSG model. Then, for the non-cooperative setting, we prove the existence of a robust approximate equilibrium in any such game, and provide an efficient construction thereof. For the cooperative setting, we show that any such game admits a robust approximate alpha-core, provide an efficient construction thereof, and prove that stronger types of the core may be empty. Interestingly, the robust solutions can substantially increase the defenders' utilities over those of the non-robust ones.