Learning Latent State Spaces for Planning through Reward Prediction

arXiv.org Artificial Intelligence

Model-based reinforcement learning methods typically learn models for high-dimensional state spaces by aiming to reconstruct and predict the original observations. However, drawing inspiration from model-free reinforcement learning, we propose learning a latent dynamics model directly from rewards. In this work, we introduce a model-based planning framework which learns a latent reward prediction model and then plans in the latent state-space. The latent representation is learned exclusively from multi-step reward prediction, which we show to be the only information necessary for successful planning. With this framework, we are able to benefit from the concise model-free representation, while still enjoying the data-efficiency of model-based algorithms. We demonstrate our framework in multi-pendulum and multi-cheetah environments where several pendulums or cheetahs are shown to the agent but only one of them produces rewards. In these environments, it is important for the agent to construct a concise latent representation to filter out irrelevant observations. We find that our method can successfully learn an accurate latent reward prediction model in the presence of the irrelevant information while existing model-based methods fail. Planning in the learned latent state-space shows strong performance and high sample efficiency compared with model-free and model-based baselines.


Is AI different for SE?

arXiv.org Artificial Intelligence

What AI tools are needed for SE? Ideally, we should have simple rules that peek at data, then say "use this tool" or "use that tool". To find such a rule, we explored 120 different data sets addressing numerous problems, including bad smell detection, predicting GitHub issue close time, bug report analysis, defect prediction and dozens of other non-SE problems. To this data, we apply an SE-based tool that (a) out-performs the state-of-the-art for these SE problems yet (b) fails very badly on standard AI problems. In those results, we can find a simple rule for when to use/avoid the SE-based tool. SE data is often about infrequent issues, like the occasional defect, or the rarely exploited security violation, or the requirement that holds for one special case. But as we show, standard AI tools work best when the target is relatively more frequent. Also, we can exploit these special properties of SE to great effect (to rapidly find better optimizations for SE tasks via a tactic called "dodging", explained in this paper). More generally, this result says we need a new kind of SE research for developing new AI tools that are more suited to SE problems.


ChainerRL: A Deep Reinforcement Learning Library

arXiv.org Artificial Intelligence

In this paper, we introduce ChainerRL, an open-source Deep Reinforcement Learning (DRL) library built using Python and the Chainer deep learning framework. ChainerRL implements a comprehensive set of DRL algorithms and techniques drawn from the state-of-the-art research in the field. To foster reproducible research, and for instructional purposes, ChainerRL provides scripts that closely replicate the original papers' experimental settings and reproduce published benchmark results for several algorithms. Lastly, ChainerRL offers a visualization tool that enables the qualitative inspection of trained agents. The ChainerRL source code can be found on GitHub: https://github.com/chainer/chainerrl .


An Action Language for Multi-Agent Domains: Foundations

arXiv.org Artificial Intelligence

In multi-agent domains (MADs), an agent's action may not just change the world and the agent's knowledge and beliefs about the world, but also may change other agents' knowledge and beliefs about the world and their knowledge and beliefs about other agents' knowledge and beliefs about the world. The goals of an agent in a multi-agent world may involve manipulating the knowledge and beliefs of other agents and, again, not just their knowledge/belief about the world, but also their knowledge about other agents' knowledge about the world. Our goal is to present an action language (mA+) that has the necessary features to address the above aspects in representing and reasoning about actions and change (RAC) in MADs. mA+ allows the representation of and reasoning about different types of actions that an agent can perform in a domain where many other agents might be present -- such as world-altering actions, sensing actions, and announcement/communication actions. It also allows the specification of agents' dynamic awareness of action occurrences, which has future implications on what agents know about the world and other agents' knowledge about the world. mA+ considers three different types of awareness: full awareness, partial awareness, and complete oblivion of an action occurrence and its effects. This keeps the language simple, yet powerful enough to address a large variety of knowledge manipulation scenarios in MADs. The semantics of mA+ relies on the notion of state, which is described by a pointed Kripke model and is used to encode the agents' knowledge and the real state of the world. It is defined by a transition function that maps pairs of actions and states into sets of states. We illustrate properties of the action theories, including properties that guarantee finiteness of the set of initial states and their practical implementability. Finally, we relate mA+ to other formalisms that contribute to RAC in MADs.


A detailed example of data loaders with PyTorch

#artificialintelligence

Have you ever had to load a dataset that was so memory consuming that you wished a magic trick could seamlessly take care of that? Large datasets are increasingly becoming part of our lives, as we are able to harness an ever-growing quantity of data. We have to keep in mind that in some cases, even the most state-of-the-art configuration won't have enough memory space to process the data the way we used to do it. That is the reason why we need to find other ways to do that task efficiently. In this blog post, we are going to show you how to generate your data on multiple cores in real time and feed it right away to your deep learning model.
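The idea of building samples on demand and handing them to the model in batches can be sketched with PyTorch's `Dataset` and `DataLoader` classes. This is a minimal illustration, not the post's own code: the class name `SyntheticDataset`, the helper `make_loader`, the sample shape, and the labels are all hypothetical, and the random tensor stands in for reading one sample from disk.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SyntheticDataset(Dataset):
    """Hypothetical dataset that builds each sample on demand
    instead of holding the whole dataset in memory."""

    def __init__(self, ids, labels):
        self.ids = ids          # list of sample identifiers
        self.labels = labels    # mapping from identifier to label

    def __len__(self):
        # Number of samples the loader can draw from.
        return len(self.ids)

    def __getitem__(self, index):
        sample_id = self.ids[index]
        # Stand-in for loading one sample from disk (assumption):
        # a reproducible random vector seeded by the sample id.
        generator = torch.Generator().manual_seed(sample_id)
        x = torch.randn(8, generator=generator)
        y = self.labels[sample_id]
        return x, y


def make_loader(num_workers=0):
    """Build a DataLoader over 100 hypothetical samples.

    num_workers=0 loads in the main process; a positive value asks
    PyTorch to prepare batches in that many worker processes.
    """
    ids = list(range(100))
    labels = {i: i % 2 for i in ids}
    dataset = SyntheticDataset(ids, labels)
    return DataLoader(dataset, batch_size=16, shuffle=True,
                      num_workers=num_workers)
```

Iterating with `for x, y in make_loader(): ...` yields one batch at a time, so only the current batch needs to be in memory. When `num_workers` is greater than zero, the iteration should sit under an `if __name__ == "__main__":` guard on platforms that start worker processes by spawning.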


Machine Learning Glossary Google Developers

#artificialintelligence

Layers are Python functions that take Tensors and configuration options as input and produce other tensors as output. Once the necessary Tensors have been composed, the user can convert the result into an Estimator via a model function.


Goodbye herbicide, hello weed-zapping farmbot Sifted

#artificialintelligence

Farmers may soon have an alternative to spraying their fields with chemicals, as Small Robot Company and RootWave, two UK-based agritech startups, today announced a partnership to develop a high-precision robot that can kill weeds with a zap of electricity. Small Robot has already developed a series of small agricultural robots, called Tom, Dick and Harry, which can automate some of the routine tasks of farming. Tom, a scouting robot similar to the Mars Rover, for example, uses computer vision to map the weeds in a field, covering about 20 hectares a day. Dick, a weeding robot, can already remove unwanted plants either with a micro-dose of pesticide or by physically crushing them, but the next stage will be to combine this with technology from RootWave, which destroys weeds with an electric current, essentially boiling them from the inside out. "Farmers are really desperate for an alternative to the chemical control of weeds," says Sam Watson Jones, the chief executive of Small Robot Company.


The Age of Quantum Supremacy - IRIS

#artificialintelligence

I am not a genius, have no inside information and don't have influential friends feeding me high-tech solutions to common problems. However, people wonder why I have huge social media followings, know what the next big thing is and have opportunities thrust upon me. The simple answer is I love watching for disruptive technologies and I follow trends--have done so for years. I think you must be on top of what is happening around you and gather intelligence on what technology can change the world and what technology is simply taking up space like marijuana. I've watched the marijuana "technologies" reaping incredible rewards for the few who got in at the early stages.


Financial industry fears AI could decimate high-paying positions

#artificialintelligence

At a two-hour hearing in Washington, D.C. on Friday, lawmakers questioned experts on bias in artificial intelligence, the struggle to attract skilled workers, and how to navigate and regulate an increasingly data-driven financial market, Bloomberg reports. Why it matters, per Bloomberg: "The use of algorithms in electronic markets has automated the jobs of tens of thousands of execution traders worldwide, and it's also displaced people who model prices and risk or build investment portfolios," the former head of machine learning at AQR Capital Management LLC Marcos Lopez de Prado said.


Tesla Reveals Specs of its New AI-Powered Full Self-Driving Computer

#artificialintelligence

In April, at a special event at Tesla's Palo Alto, California headquarters called Tesla Autonomy Investor Day, Tesla CEO Elon Musk announced that Tesla vehicles are using a new custom-designed processor to power its Autopilot full self-driving (FSD) system. At the time Musk said that no chip was available that had the processing power and power constraints that Tesla required, so the automaker built its own from scratch. Now the technical details of the new chip have been revealed for the first time. At the Hot Chips conference in San Francisco on Tuesday, Tesla's VP of hardware engineering Pete Bannon revealed the details of the chipset that will power Tesla's future Autopilot full self-driving (FSD) system. Bannon said that the new AI-powered chip is 21 times faster than the Nvidia chip it's replacing and only 80% of the cost.