AITopics

2003.11102

Country:

South America > Brazil > Pernambuco > Recife (0.07)
North America > Canada > Quebec > Montreal (0.05)
Europe > Portugal > Braga > Braga (0.05)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.55)

#artificialintelligence Mar-23-2020, 19:38:58 GMT

Putting artificial intelligence to work in the lab: Automated scanning probe microscopy controlled by artificial intelligence/machine learning

The new system, dubbed DeepSPM, bridges the gap between nanoscience, automation and artificial intelligence (AI), and firmly establishes the use of machine learning for experimental scientific research. "Optimising SPM data acquisition can be very tedious. This optimisation process is usually performed by the human experimentalist, and is rarely reported," says FLEET Chief Investigator Dr Agustin Schiffrin (Monash University). "Our new AI-driven system can operate and acquire optimal SPM data autonomously, for multiple straight days, and without any human supervision." The advance brings advanced SPM methodologies such as atomically-precise nanofabrication and high-throughput data acquisition closer to a fully automated turnkey application.

artificial intelligence, artificial intelligence machine, probe microscopy, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Deep Reinforcement Learning with Smooth Policy

Shen, Qianli, Li, Yan, Jiang, Haoming, Wang, Zhaoran, Zhao, Tuo

Deep neural networks have been widely adopted in modern reinforcement learning (RL) algorithms with great empirical success in various domains. However, the large search space of training a neural network requires a significant amount of data, which makes current RL algorithms sample inefficient. Motivated by the fact that many environments with continuous state spaces have smooth transitions, we propose to learn a policy that behaves smoothly with respect to states. In contrast to policies parameterized by linear/reproducing kernel functions, where simple regularization techniques suffice to control smoothness, there is no readily available solution for learning a smooth policy with neural-network-based reinforcement learning algorithms. In this paper, we develop a new training framework, Smooth Regularized Reinforcement Learning (SR$^2$L), where the policy is trained with smoothness-inducing regularization. Such regularization effectively constrains the search space of the learning algorithms and enforces smoothness in the learned policy. We apply the proposed framework to both on-policy (TRPO) and off-policy (DDPG) algorithms. Through extensive experiments, we demonstrate that our method achieves improved sample efficiency.
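As a rough illustration of what a smoothness-inducing regularizer can look like, the sketch below penalizes how much a policy's output changes when the state is perturbed inside a small ball. Everything here is an assumption made for illustration: the toy `policy` (tanh of a linear map), the names `smoothness_penalty` and `regularized_loss`, the parameters `epsilon` and `lam`, and in particular the use of random sampling to approximate the worst-case perturbation. It is not the authors' implementation.

```python
import math
import random

def policy(weights, state):
    """Hypothetical deterministic policy: tanh of a linear map of the state."""
    return math.tanh(sum(w * s for w, s in zip(weights, state)))

def smoothness_penalty(weights, states, epsilon=0.1, n_samples=16, rng=None):
    """Average, over states, of the largest squared change in the action
    found among randomly sampled perturbations within an L-infinity ball."""
    rng = rng or random.Random(0)  # fixed seed so repeated calls are comparable
    total = 0.0
    for state in states:
        base = policy(weights, state)
        worst = 0.0
        for _ in range(n_samples):
            perturbed = [s + rng.uniform(-epsilon, epsilon) for s in state]
            worst = max(worst, (policy(weights, perturbed) - base) ** 2)
        total += worst
    return total / len(states)

def regularized_loss(policy_loss, weights, states, lam=1.0, epsilon=0.1):
    """Ordinary policy loss plus the weighted smoothness penalty."""
    return policy_loss + lam * smoothness_penalty(weights, states, epsilon)
```

A policy that ignores the state (all-zero weights) incurs no penalty, while a policy with large weights reacts sharply to small state changes and is penalized more.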

algorithm, cumulative reward, regularizer, (14 more...)

2003.09534

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Sheikh, Hassam Ullah, Bölöni, Ladislau

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group. This is a challenging task for current state-of-the-art multi-agent reinforcement learning algorithms, which are designed to maximize either the global reward of the team or the individual local rewards. The problem is exacerbated when either of the rewards is sparse, leading to unstable learning. To address this problem, we present Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG): a novel cooperative multi-agent reinforcement learning framework that simultaneously learns to maximize the global and local rewards. We evaluate our solution on the challenging defensive escort team problem and show that it achieves significantly better and more stable performance than the direct adaptation of the MADDPG algorithm.

agent, maddpg, maupg, (14 more...)

2003.10598

Country: North America > United States > Florida > Orange County > Orlando (0.04)

Genre: Research Report (0.83)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning

Long, Qian, Zhou, Zihan, Gupta, Abhinav, Fang, Fei, Wu, Yi, Wang, Xiaolong

In multi-agent games, the complexity of the environment can grow exponentially as the number of agents increases, so it is particularly challenging to learn good policies when the agent population is large. In this paper, we introduce Evolutionary Population Curriculum (EPC), a curriculum learning paradigm that scales up Multi-Agent Reinforcement Learning (MARL) by progressively increasing the population of training agents in a stage-wise manner. Furthermore, EPC uses an evolutionary approach to fix an objective misalignment issue throughout the curriculum: agents successfully trained in an early stage with a small population are not necessarily the best candidates for adapting to later stages with scaled populations. Concretely, EPC maintains multiple sets of agents in each stage, performs mix-and-match and fine-tuning over these sets and promotes the sets of agents with the best adaptability to the next stage. We implement EPC on a popular MARL algorithm, MADDPG, and empirically show that our approach consistently outperforms baselines by a large margin as the number of agents grows exponentially. The project page is https://sites.google.com/view/epciclr2020.
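The stage-wise loop described in the abstract (keep several agent sets, mix-and-match them, fine-tune, promote the best) might be sketched as follows. This is a minimal sketch under stated assumptions: `epc_stage`, `fine_tune`, `evaluate`, and `keep` are hypothetical names, agents are treated as opaque objects, and merging two parent sets into one larger set is just one plausible reading of "mix-and-match", not the authors' code.

```python
import itertools

def epc_stage(parent_sets, fine_tune, evaluate, keep=2):
    """One curriculum stage.

    parent_sets: list of agent sets promoted from the previous stage.
    fine_tune:   hypothetical callable that trains a merged set and returns it.
    evaluate:    hypothetical callable that scores a set (higher is better).
    keep:        number of sets promoted to the next stage.
    """
    candidates = []
    # Mix-and-match: pair every parent set with every other (and with itself)
    # to form larger candidate populations.
    for a, b in itertools.combinations_with_replacement(parent_sets, 2):
        merged = list(a) + list(b)
        candidates.append(fine_tune(merged))
    # Promote the sets that adapted best to the larger population.
    candidates.sort(key=evaluate, reverse=True)
    return candidates[:keep]

# Toy usage: an "agent" is just a skill number, fine-tuning is a no-op,
# and a set's score is its mean skill.
parents = [[1.0], [2.0], [3.0]]
promoted = epc_stage(parents, fine_tune=lambda s: s,
                     evaluate=lambda s: sum(s) / len(s))
```

In the toy run each promoted set has twice the population of its parents, which mirrors the idea of growing the population from stage to stage.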

agent, conference paper, reinforcement learning, (14 more...)

2003.10423

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Etemad, Mohammad, Zare, Nader, Sarvmaili, Mahtab, Soares, Amilcar, Machado, Bruno Brandoli, Matwin, Stan

Using Deep Reinforcement Learning Methods for Autonomous Vessels in 2D Environments

Unmanned surface vehicle (USV) technology is an exciting topic that essentially deploys an algorithm to perform a mission safely and efficiently. Although reinforcement learning is a well-known approach to modeling such a task, instability and divergence may occur when combining off-policy learning and function approximation. In this work, we used deep reinforcement learning, combining Q-learning with a neural representation, to avoid instability. Our methodology uses deep Q-learning and combines it with a rolling wave planning approach from agile methodology. Our method contains two critical parts in order to perform missions in an unknown environment. The first is a path planner that is responsible for generating a potentially effective path to a destination without considering the details of the route. The second is a decision-making module that is responsible for short-term decisions on avoiding obstacles during the near-future steps of USV exploitation within the context of the value function. Simulations were performed using two algorithms: a basic vanilla vessel navigator (VVN) as a baseline and an improved one, the vessel navigator with a planner and local view (VNPLV). Experimental results show that the proposed method enhanced the performance of VVN by 55.31 on average for long-distance missions. Our model successfully demonstrated obstacle avoidance by means of deep reinforcement learning using planning adaptive paths in unknown environments.
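For readers unfamiliar with the Q-learning rule that the abstract builds on, here is a minimal tabular sketch. The abstract describes a deep variant with a neural representation; the dictionary-based table, the function name `q_update`, and the default `alpha` and `gamma` values are illustrative assumptions, not the authors' implementation.

```python
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the new value.

    q is a dict mapping (state, action) pairs to estimated returns;
    missing entries are treated as 0.0.
    """
    # Off-policy target: bootstrap from the best action in the next state.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

# Toy usage with hypothetical state and action labels.
q_table = {}
q_update(q_table, "open_water", "forward", 1.0, "near_obstacle",
         actions=["forward", "turn"], alpha=0.5, gamma=0.9)
```

The `max` over next-state actions is what makes the update off-policy, which is the setting the abstract identifies as prone to instability when combined with function approximation.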

agent, destination point, obstacle, (16 more...)

2003.10249

Country:

North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry: Transportation (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

#artificialintelligence Mar-22-2020, 19:14:46 GMT

Google algorithm teaches robot how to walk in mere hours

A new robot has overcome a fundamental challenge of locomotion by teaching itself how to walk. Researchers from Google developed algorithms that helped the four-legged bot to learn how to walk across a range of surfaces within just hours of practice, annihilating the record times set by its human overlords. Their system uses deep reinforcement learning, a form of AI that teaches through trial and error by providing rewards for certain actions. This technique is typically evaluated in virtual environments. However, building simulations that could replicate the robot walking on various surfaces would be highly complex and time-consuming, so the researchers chose to train their system in the real world.

google algorithm teach robot, mere hour, robot, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

#artificialintelligence Mar-21-2020, 10:59:18 GMT

r/deeplearning - The Enhanced POET: Open-Ended Reinforcement Learning

I hope this is useful for someone. I think the algorithms this research group has been developing are the future of AI, rather than trying to see how we can train 3 Trillion parameter transformers lol!

enhanced poet, open-ended reinforcement learning

Industry: Media > News (0.40)

Technology:

Information Technology > Communications > Social Media (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

#artificialintelligence Mar-21-2020, 10:51:11 GMT

Deep Reinforcement Learning Course

This course is a series of articles and videos where you'll master the skills and architectures you need to become a deep reinforcement learning expert. You'll build a strong professional portfolio by implementing awesome agents with TensorFlow that learn to play Space Invaders, Doom, Sonic the Hedgehog and more!

artificial intelligence, deep reinforcement learning course, machine learning

Industry: Education (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine Learning Mar-21-2020

Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics

Mosavi, Amir, Ghamisi, Pedram, Faghan, Yaser, Duan, Puhong

The popularity of deep reinforcement learning (DRL) methods in economics has increased exponentially. By combining a wide range of capabilities from reinforcement learning (RL) and deep learning (DL) for handling sophisticated dynamic business environments, DRL offers vast opportunities. DRL is characterized by scalability, with the potential to be applied to high-dimensional problems in conjunction with noisy and nonlinear patterns of economic data. In this work, we first present a brief review of DL, RL, and deep RL methods in diverse applications in economics, providing in-depth insight into the state of the art. Furthermore, the architecture of DRL applied to economic applications is investigated in order to highlight the complexity, robustness, accuracy, performance, computational tasks, risk constraints, and profitability. The survey results indicate that DRL can provide better performance and higher accuracy than traditional algorithms when facing real economic problems in the presence of risk parameters and ever-increasing uncertainties.

algorithm, learning, reinforcement, (13 more...)


doi: 10.20944/preprints202003.0309.v1

2004.01509

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Germany > Saxony > Dresden (0.04)
(5 more...)

Genre:

Overview (1.00)
Workflow (0.93)
Research Report > New Finding (0.93)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology (1.00)
Banking & Finance > Trading (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)