 Agents


Computer Vision and Visual SLAM vs. AI Agents

#artificialintelligence

To see what the end goal of end-to-end deep learning for visual SLAM might look like, consider gradSLAM from Krishna Murthy, a Ph.D. student at MILA, and collaborators at CMU. Their paper offers a new way of thinking about SLAM as a pipeline of differentiable blocks. From the article: "This amalgamation of dense SLAM with computational graphs enables us to backprop from 3D maps to 2D pixels, opening up new possibilities in gradient-based learning for SLAM." We are seeing more and more practical successes of self-supervised learning for multi-view problems, where geometry lets us get away from strong supervision. Even the ConvNet-based point detector SuperPoint [7], which my team and I developed at Magic Leap, uses self-supervision to train more robust interest point detectors.


This robot can conduct research in previously unexplored parts of the sea

#artificialintelligence

At first, PLUMES will choose paths that randomly explore the environment. Each sample, however, provides new information about the targeted values in the surrounding environment -- such as the spots with the highest concentrations of chemicals or the shallowest depths. The Gaussian process model exploits that data to narrow down the possible paths the robot can follow from its given position, so that it samples from locations with even higher value. PLUMES uses a novel objective function -- commonly used in machine learning to maximize a reward -- to decide whether the robot should exploit past knowledge or explore a new area.
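The explore-or-exploit trade-off described above can be illustrated with a small sketch. The excerpt does not specify PLUMES's objective function, so the code below substitutes a generic upper-confidence-bound (UCB) rule on a one-dimensional Gaussian process; the kernel, the function names, and the `beta` parameter are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of 1-D locations."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    # The prior variance is 1 for this kernel; subtract what the data explains.
    explained = np.sum(k_qo * np.linalg.solve(k_oo, k_qo.T).T, axis=1)
    std = np.sqrt(np.clip(1.0 - explained, 0.0, None))
    return mean, std

def next_sample(x_obs, y_obs, candidates, beta=2.0):
    """Pick the candidate with the highest upper confidence bound.

    beta = 0 is pure exploitation (highest predicted value);
    larger beta favors uncertain, unexplored locations.
    """
    mean, std = gp_posterior(x_obs, y_obs, candidates)
    return candidates[np.argmax(mean + beta * std)]

# Hypothetical measurements taken at three positions along a transect.
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.1, 0.8, 0.3])
print(next_sample(x_obs, y_obs, np.linspace(0.0, 5.0, 51)))
```

With `beta` set to zero the rule returns to the best location already seen, while a larger `beta` pushes the sampler toward regions the model knows little about.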


It's Sony AI vs. Facebook, Google

#artificialintelligence

Sony Corp. has launched Sony AI, a new organization to pursue advanced R&D in artificial intelligence. With this move, the Japanese consumer electronics giant intends to go head-to-head with Google and Facebook, competing for AI talent and projects, and targeting a much bigger role in an ever-accelerating global AI race. The new organization will be worldwide from day one, with research sites in Tokyo; Austin, Texas; and an unnamed city in Europe. Sony AI will formally start operation next month. Hiroaki Kitano, president and CEO of Sony Computer Science Laboratories, Inc., will run Sony AI globally.


Predictive properties of forecast combination, ensemble methods, and Bayesian predictive synthesis

arXiv.org Machine Learning

This paper studies the theoretical predictive properties of classes of forecast combination methods. The study is motivated by the recently developed Bayesian framework for synthesizing predictive densities: Bayesian predictive synthesis. A novel strategy based on continuous time stochastic processes is proposed and developed, where the combined predictive error processes are expressed as stochastic differential equations, evaluated using Ito's lemma. We show that a subclass of synthesis functions under Bayesian predictive synthesis, which we categorize as non-linear synthesis, entails an extra term that "corrects" the bias from misspecification and dependence in the predictive error process, effectively improving forecasts. We examine the theoretical properties and show that, under mild conditions, this subclass improves the expected squared forecast error over any and all linear combinations, averages, and ensembles of forecasts. We discuss the conditions under which this subclass outperforms others, and its implications for developing forecast combination methods. A finite sample simulation study is presented to illustrate our results.


Defending with Shared Resources on a Network

arXiv.org Artificial Intelligence

In this paper we consider a defending problem on a network. In the model, the defender holds a total defending resource of R, which can be distributed to the nodes of the network. The defending resource allocated to a node can be shared by its neighbors. A weight associated with every edge represents the efficiency with which defending resources are shared between neighboring nodes. We consider the setting in which each attack can affect not only the target node, but its neighbors as well. Assuming that nodes in the network have different treasures to defend and different defending requirements, the defender aims to allocate the defending resource to the nodes so as to minimize the loss due to attack. We give polynomial time exact algorithms for two important special cases of the network defending problem. For the case when an attack can only affect the target node, we present an LP-based exact algorithm. For the case when defending resources cannot be shared, we present a max-flow-based exact algorithm. We show that the general problem is NP-hard, and we give a 2-approximation algorithm based on LP-rounding. Moreover, by giving a matching lower bound of 2 on the integrality gap of the LP relaxation, we show that our rounding is tight.


How autonomous systems use AI that learns from the world around it

#artificialintelligence

If a mine collapses or an earthquake strands people underground in a subway car, first responders can't rush into that unknown subterranean environment without potentially endangering themselves. A rescue team must ensure an area is structurally sound and air is breathable before pushing forward -- which sometimes means help moves slower than anyone would like. In a competition sponsored by DARPA, teams are designing autonomous robots that can explore and map these potentially dangerous underground landscapes and also identify objects of interest to first responders like survivors, backpacks, cell phones or fire extinguishers. "With a robot, you're able to take much more risk and potentially move much faster in a rescue," said Sebastian Scherer, Carnegie Mellon University associate research professor and co-leader of Team Explorer, which took first place in the initial leg of that Subterranean Challenge using Microsoft's AirSim technology to train its robots to recognize objects in a simulated mine. "It's really difficult to design a system to operate in an environment where you really have no idea what's coming next. It has to be very robust and be able to make decisions on its own to get itself out of trouble," Scherer said.


Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Existing value-factorization-based multi-agent deep reinforcement learning (MARL) approaches perform well in various multi-agent cooperative environments under the centralized training and decentralized execution (CTDE) scheme, where all agents are trained together by the centralized value network and each agent executes its policy independently. However, an issue remains open: in the centralized training process, when the environment for the team is partially observable or non-stationary, i.e., the observation and action information of all the agents cannot represent the global states, existing methods perform poorly and are sample-inefficient. Regret Minimization (RM) can be a promising approach as it performs well in partially observable and fully competitive settings. However, it tends to model others as opponents and thus cannot work well under the CTDE scheme. In this work, we propose a novel team RM based Bayesian MARL with three key contributions: (a) we design a novel RM method to train cooperative agents as a team and obtain a team regret-based policy for that team; (b) we introduce a novel method to decompose the team regret to generate the policy for each agent for decentralized execution; (c) to further improve the performance, we leverage a differential particle filter (a Sequential Monte Carlo method) network to get an accurate estimation of the state for each agent. Experimental results on two-step matrix games (cooperative game) and battle games (large-scale mixed cooperative-competitive games) demonstrate that our algorithm significantly outperforms state-of-the-art methods.
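For readers unfamiliar with regret minimization, the following is a minimal sketch of plain single-agent regret matching, the textbook building block that RM methods start from. It is not the team regret method proposed in the paper, whose details the abstract does not give; the function names and the fixed action values are assumptions chosen for illustration.

```python
import numpy as np

def regret_matching(cumulative_regret):
    """Turn cumulative regrets into a policy: play in proportion to positive regret."""
    positive = np.maximum(np.asarray(cumulative_regret, dtype=float), 0.0)
    total = positive.sum()
    if total <= 0.0:
        # No action has positive regret, so fall back to a uniform policy.
        return np.full(len(positive), 1.0 / len(positive))
    return positive / total

def update_regret(cumulative_regret, action_values, policy):
    """Add each action's instantaneous regret: its value minus the policy's expected value."""
    action_values = np.asarray(action_values, dtype=float)
    expected = float(np.dot(policy, action_values))
    return np.asarray(cumulative_regret, dtype=float) + (action_values - expected)

# Hypothetical, fixed payoffs for three actions; the third is the best.
values = [1.0, 0.0, 2.0]
regret = np.zeros(3)
for _ in range(10):
    policy = regret_matching(regret)
    regret = update_regret(regret, values, policy)
print(regret_matching(regret))
```

Against fixed payoffs the policy concentrates on the best action after a single update; the interesting behavior in games comes from the payoffs changing as the other agents adapt.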


Leveraging Decentralized Artificial Intelligence to Enhance Resilience of Energy Networks

arXiv.org Artificial Intelligence

This paper reintroduces the notion of resilience in the context of recent issues originating from events triggered by climate change, including severe hurricanes and wildfires. A recent example is PG&E's forced power outage to contain wildfire risk, which led to widespread power disruption. This paper focuses on answering two questions: who is responsible for resilience, and how can the monetary value of resilience be quantified? To this end, we first provide preliminary definitions of resilience for power systems. We then investigate the effect of natural hazards, especially wildfire, on power system resilience. Finally, we propose a decentralized strategy for a resilient management system using distributed storage and demand response resources. Our proposed high fidelity model provides utilities, operators, and policymakers with a clearer picture for strategic decision making and preventive decisions.


Hebbian Synaptic Modifications in Spiking Neurons that Learn

arXiv.org Machine Learning

In this paper, we derive a new model of synaptic plasticity, based on recent algorithms for reinforcement learning (in which an agent attempts to learn appropriate actions to maximize its long-term average reward). We show that these direct reinforcement learning algorithms also give locally optimal performance for the problem of reinforcement learning with multiple agents, without any explicit communication between agents. By considering a network of spiking neurons as a collection of agents attempting to maximize the long-term average of a reward signal, we derive a synaptic update rule that is qualitatively similar to Hebb's postulate. This rule requires only simple computations, such as addition and leaky integration, and involves only quantities that are available in the vicinity of the synapse. Furthermore, it leads to synaptic connection strengths that give locally optimal values of the long term average reward. The reinforcement learning paradigm is sufficiently broad to encompass many learning problems that are solved by the brain. We illustrate, with simulations, that the approach is effective for simple pattern classification and motor learning tasks. It is widely accepted that the functions performed by neural circuits are modified by adjustments to the strength of the synaptic connections between neurons. In the 1940s, Donald Hebb speculated that such adjustments are associated with simultaneous (or nearly simultaneous) firing of the presynaptic and postsynaptic neurons [14]: "When an axon of cell A ... persistently takes part in firing [cell B], some growth process or metabolic change takes place [to increase] A's efficacy as one of the cells firing B."
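A rule built from addition and leaky integration, using only quantities local to the synapse, might be sketched as follows. The excerpt does not state the paper's actual update rule, so this is a generic reward-modulated Hebbian form written under stated assumptions: a leaky eligibility trace of pre/post coincidences, scaled by a scalar reward. The names `decay`, `lr`, and both functions are hypothetical.

```python
def leaky_trace(trace, pre, post, decay=0.9):
    """Leaky integration of coincident presynaptic and postsynaptic activity."""
    return decay * trace + pre * post

def hebbian_step(weight, trace, pre, post, reward, lr=0.1, decay=0.9):
    """One local update: refresh the trace, then nudge the weight by reward * trace."""
    trace = leaky_trace(trace, pre, post, decay)
    weight = weight + lr * reward * trace
    return weight, trace

# Both neurons fire together while reward is present, then fall silent.
w, e = 0.0, 0.0
for pre, post, reward in [(1, 1, 1.0), (0, 0, 1.0), (0, 0, 0.0)]:
    w, e = hebbian_step(w, e, pre, post, reward)
print(w, e)
```

The trace lets a reward that arrives shortly after coincident firing still strengthen the synapse, and a zero reward leaves the weight unchanged.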


IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks

arXiv.org Artificial Intelligence

The IKEA Furniture Assembly Environment is one of the first benchmarks for testing and accelerating the automation of complex manipulation tasks. The environment is designed to advance reinforcement learning from simple toy tasks to complex tasks requiring both long-term planning and sophisticated low-level control. Our environment supports over 80 different furniture models, Sawyer and Baxter robot simulation, and domain randomization. The IKEA Furniture Assembly Environment is a testbed for methods aiming to solve complex manipulation tasks. The environment is publicly available at https://clvrai.com/furniture