AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction (Section 1.1). MIT Press, Cambridge, MA, 1998.

KDD 2020 Recognizes Winning Teams of 24th Annual KDD Cup

#artificialintelligence | Sep-30-2020, 20:12:06 GMT

KDD Cup Track 3: Automated Machine Learning Competition – AutoML for Graph Representation Learning; KDD Cup Track 4: Reinforcement Learning …

24th annual kdd cup, data mining, reinforcement learning, (3 more...)

#artificialintelligence

Industry: Media > News (0.72)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.56)

AWAC: Accelerating online reinforcement learning with offline datasets

Robohub | Sep-30-2020, 14:22:00 GMT

Robots trained with reinforcement learning (RL) have the potential to be used across a huge variety of challenging real-world problems. To apply RL to a new problem, you typically set up the environment, define a reward function, and train the robot to solve the task by allowing it to explore the new environment from scratch. While this may eventually work, these "online" RL methods are data-hungry, and repeating this data-inefficient process for every new problem makes it difficult to apply online RL to real-world robotics problems. What if, instead of repeating the data collection and learning process from scratch every time, we were able to reuse data across multiple problems or experiments? By doing so, we could greatly reduce the burden of data collection with every new problem that is encountered.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Robohub

Genre: Instructional Material > Online (0.40)

Industry: Education (0.30)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.74)

Python Data Science with Pandas: Master 12 Advanced Projects

#artificialintelligence | Sep-30-2020, 07:17:35 GMT

Online Courses Udemy - Python Data Science with Pandas: Master 12 Advanced Projects. Work with Pandas, SQL Databases, JSON, Web APIs & more to master your real-world Machine Learning & Finance Projects. Bestseller. Created by Alexander Hagmann. English [Auto].

Students also bought: Machine Learning and AI: Support Vector Machines in Python; Unsupervised Machine Learning Hidden Markov Models in Python; Natural Language Processing with Deep Learning in Python; Advanced AI: Deep Reinforcement Learning in Python; Deep Learning: Advanced Computer Vision (GANs, SSD, More!); Cutting-Edge AI: Deep Reinforcement Learning in Python.

Description: Welcome to the first advanced and project-based Pandas Data Science Course! This Course starts where many other courses end: You can write some Pandas code but you are still struggling with real-world Projects because (1) Real-World Data is typically not provided in a single or a few text/excel files, so more advanced Data Importing Techniques are required; (2) Real-World Data is large, unstructured, nested and unclean, so more advanced Data Manipulation and Data Analysis/Visualization Techniques are required; (3) many easy-to-use Pandas methods work best with relatively small and clean Datasets, whereas real-world Datasets require more General Code (incorporating other Libraries/Modules).

No matter if you need excellent Pandas skills for Data Analysis, Machine Learning or Finance purposes, this is the right Course for you to get your skills to Expert Level! This Course covers the full Data Workflow A-Z: Import (complex and nested) Data from JSON files. Efficiently import and merge Data from many text/CSV files. Clean, handle and flatten nested and stringified Data in DataFrames.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.59)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.64)

AI Is Making Robots More Fun – IAM Network

#artificialintelligence | Sep-30-2020, 01:18:39 GMT

The "Curly" curling robots are capturing hearts around the world. A product of Korea University in Seoul and the Berlin Institute of Technology, the deep-reinforcement-learning-powered bots slide stones along ice in a winter sport that dates to the 16th century. As much as their human-expert-bettering accuracy or technology impresses, a big part of the Curly appeal is how we see the little machines in the physical space: the determined manner in which the thrower advances in the arena, smartly raising its head-like cameras to survey the shiny white curling sheet, gently cradling and rotating a rock to begin delivery, releasing deftly at the hog line as a skip watches from the backline, with our hopes. Artificial intelligence (AI) today delivers everything from soup recipes to stock predictions, but most tech works out of sight. More visible are the physical robots of various shapes, sizes and functions that embody the latest AI technologies. These robots have generally been helpful, and now they are also becoming a more entertaining and enjoyable part of our lives.

iam network, robot

#artificialintelligence

Country: Asia > South Korea > Seoul > Seoul (0.29)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)

Entropy Regularization for Mean Field Games with Learning

Guo, Xin, Xu, Renyuan, Zariphopoulou, Thaleia

arXiv.org Machine Learning | Sep-30-2020

Entropy regularization has been extensively adopted to improve the efficiency, the stability, and the convergence of algorithms in reinforcement learning. This paper analyzes both quantitatively and qualitatively the impact of entropy regularization for Mean Field Games (MFG) with learning in a finite time horizon. Our study provides a theoretical justification that entropy regularization yields time-dependent policies and, furthermore, helps stabilize and accelerate convergence to the game equilibrium. In addition, this study leads to a policy-gradient algorithm for exploration in MFG. Under this algorithm, agents are able to learn the optimal exploration scheduling, with stable and fast convergence to the game equilibrium.
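
To make the idea of entropy regularization concrete, here is a minimal sketch for a single decision with a handful of discrete actions. It is not the mean field game setting analyzed in the paper. It only illustrates the standard fact that maximizing expected value plus a temperature-weighted entropy bonus over action probabilities yields a softmax policy. The function names, the example values, and the temperature are assumptions chosen for illustration.

```python
import math


def softmax_policy(q_values, temperature):
    """Maximizer of sum(p * q) + temperature * entropy(p) over probabilities p."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy in nats, with 0 * log(0) treated as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def regularized_objective(probs, q_values, temperature):
    """Expected value plus the entropy bonus."""
    expected = sum(p * q for p, q in zip(probs, q_values))
    return expected + temperature * entropy(probs)


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]        # hypothetical action values
    greedy = [0.0, 0.0, 1.0]   # deterministic policy, no exploration
    for tau in (0.1, 0.5, 5.0):
        pi = softmax_policy(q, tau)
        print(tau, [round(p, 3) for p in pi],
              round(regularized_objective(pi, q, tau), 3),
              round(regularized_objective(greedy, q, tau), 3))
```

A higher temperature spreads probability over more actions (more exploration), while a temperature near zero approaches the greedy choice.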

artificial intelligence, exploration, upstream oil & gas, (19 more...)

arXiv.org Machine Learning

2010.00145

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Texas (0.14)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces

Santana, Marlesson R. O., Melo, Luckeciano C., Camargo, Fernando H. F., Brandão, Bruno, Soares, Anderson, Oliveira, Renan M., Caetano, Sandor

arXiv.org Machine Learning | Sep-30-2020

Recommender Systems are especially challenging for marketplaces since they must maximize user satisfaction while maintaining the healthiness and fairness of such ecosystems. In this context, we observed a lack of resources to design, train, and evaluate agents that learn by interacting within these environments. To address this, we propose MARS-Gym, an open-source framework to empower researchers and engineers to quickly build and evaluate Reinforcement Learning agents for recommendations in marketplaces. MARS-Gym addresses the whole development pipeline: data processing, model design and optimization, and multi-sided evaluation. We also provide the implementation of a diverse set of baseline agents, with a metrics-driven analysis of them on the Trivago marketplace dataset, to illustrate how to conduct a holistic assessment using the available metrics of recommendation, off-policy estimation, and fairness. With MARS-Gym, we expect to bridge the gap between academic research and production systems, as well as to facilitate the design of new algorithms and applications.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

2010.07035

Country:

North America > United States > New York > New York County > New York City (0.05)
South America > Brazil > Goiás (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (0.93)
Consumer Products & Services (0.93)
Information Technology > Services (0.68)
Media > Music (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
(2 more...)

Value-based Bayesian Meta-reinforcement Learning and Traffic Signal Control

Zou, Yayi, Qin, Zhiwei

arXiv.org Machine Learning | Sep-30-2020

Reinforcement learning methods for traffic signal control have gained increasing interest recently and have achieved better performance than traditional transportation methods. However, reinforcement-learning-based methods usually require large amounts of training data and computational resources, which largely limits their application in real-world traffic signal control. This has drawn attention in traffic signal control to meta-learning, which enables data-efficient and fast-adaptation training by leveraging the knowledge of previous learning experiences. In this paper, we propose a novel value-based Bayesian meta-reinforcement learning framework, BM-DQN, to robustly speed up the learning process in new scenarios by utilizing well-trained prior knowledge learned from existing scenarios. This framework is based on our proposed fast-adaptation variation of Gradient-EM Bayesian Meta-learning and the fast update advantage of DQN, which allows fast adaptation to new scenarios with continual learning ability and robustness to uncertainty. The experiments on 2D navigation and traffic signal control show that our proposed framework adapts more quickly and robustly in new scenarios than previous methods and, specifically, has much better continual learning ability in heterogeneous scenarios.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

2010.00163

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bridging the gap between Markowitz planning and deep reinforcement learning

Benhamou, Eric, Saltiel, David, Ungari, Sandrine, Mukhopadhyay, Abhishek

arXiv.org Artificial Intelligence | Sep-30-2020

While researchers in the asset management industry have mostly focused on financial and risk planning techniques like the Markowitz efficient frontier, minimum variance, maximum diversification or equal risk parity, in parallel, another community in machine learning has started working on reinforcement learning, and more particularly deep reinforcement learning, to solve other decision-making problems for challenging tasks like autonomous driving, robot learning, and, on a more conceptual side, game solving like Go. This paper aims to bridge the gap between these two approaches by showing that Deep Reinforcement Learning (DRL) techniques can shed new light on portfolio allocation thanks to a more general optimization setting that casts portfolio allocation as an optimal control problem that is not just a one-step optimization, but rather a continuous control optimization with a delayed reward. The advantages are numerous: (i) DRL maps market conditions directly to actions by design and hence should adapt to a changing environment, (ii) DRL does not rely on any traditional financial risk assumptions, such as risk being represented by variance, (iii) DRL can incorporate additional data and be a multi-input method, as opposed to more traditional optimization methods. We present some encouraging results from an experiment using convolutional networks.

machine learning, reinforcement, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2010.09108

Country:

Europe > France (0.05)
North America > United States (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning

Mao, Yuning, Qu, Yanru, Xie, Yiqing, Ren, Xiang, Han, Jiawei

arXiv.org Artificial Intelligence | Sep-30-2020

While neural sequence learning methods have made significant progress in single-document summarization (SDS), they produce unsatisfactory results on multi-document summarization (MDS). We observe two major challenges when adapting SDS advances to MDS: (1) MDS involves a larger search space and yet more limited training data, setting obstacles for neural methods to learn adequate representations; (2) MDS needs to resolve higher information redundancy among the source documents, which SDS methods handle less effectively. To close the gap, we present RL-MMR, Maximal Marginal Relevance-guided Reinforcement Learning for MDS, which unifies advanced neural SDS methods and statistical measures used in classical MDS. RL-MMR casts MMR guidance on fewer promising candidates, which restrains the search space and thus leads to better representation learning. Additionally, the explicit redundancy measure in MMR helps the neural representation of the summary to better capture redundancy. Extensive experiments demonstrate that RL-MMR achieves state-of-the-art performance on benchmark MDS datasets. In particular, we show the benefits of incorporating MMR into end-to-end learning when adapting SDS to MDS in terms of both learning effectiveness and efficiency.
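
For readers unfamiliar with maximal marginal relevance (MMR), the classical greedy selection rule can be sketched as follows. This is the textbook MMR criterion only, not the RL-MMR model from the paper: at each step it picks the candidate with the best trade-off between relevance and redundancy with what has already been selected. The function name, the trade-off weight `lam`, and the toy scores are assumptions for illustration.

```python
def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedy classical MMR.

    relevance[i]     : relevance score of candidate sentence i (higher is better)
    similarity[i][j] : similarity between candidates i and j, in [0, 1]
    k                : number of candidates to select
    lam              : weight on relevance versus redundancy (illustrative default)
    """
    selected = []
    candidates = list(range(len(relevance)))

    def score(i):
        # Redundancy is the highest similarity to anything already selected.
        redundancy = max((similarity[i][j] for j in selected), default=0.0)
        return lam * relevance[i] - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Hypothetical scores: sentences 0 and 1 are near-duplicates, 2 is distinct.
    relevance = [0.9, 0.85, 0.3]
    similarity = [
        [1.0, 0.95, 0.1],
        [0.95, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    print(mmr_select(relevance, similarity, k=2, lam=0.5))
    print(mmr_select(relevance, similarity, k=2, lam=1.0))
```

With `lam=1.0` the rule ignores redundancy and ranks purely by relevance. Lower values penalize candidates that repeat already selected content, which is the redundancy problem the abstract highlights for multi-document input.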

machine learning, natural language, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

2010.00117

Country:

Asia > Middle East > Republic of Türkiye (0.29)
Asia > North Korea (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(30 more...)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety (0.93)
Government > Regional Government > North America Government > United States Government (0.68)
Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)

Learning Rewards from Linguistic Feedback

Sumers, Theodore R., Ho, Mark K., Hawkins, Robert D., Narasimhan, Karthik, Griffiths, Thomas L.

arXiv.org Artificial Intelligence | Sep-30-2020

We explore unconstrained natural language feedback as a learning signal for artificial agents. Humans use rich and varied language to teach, yet most prior work on interactive learning from language assumes a particular form of input (e.g. commands). We propose a general framework which does not make this assumption. We decompose linguistic feedback into two components: a grounding to features of a Markov decision process and sentiment about those features. We then perform an analogue of inverse reinforcement learning, regressing the teacher's sentiment on the features to infer their latent reward function. To evaluate our approach, we first collect a corpus of teaching behavior in a cooperative task where both teacher and learner are human. We use our framework to implement two artificial learners: a simple "literal" model and a "pragmatic" model with additional inductive biases. We baseline these with a neural network trained end-to-end to predict latent rewards. We then repeat our initial experiment pairing human teachers with our models. We find our "literal" and "pragmatic" models successfully learn from live human feedback and offer statistically significant performance gains over the end-to-end baseline, with the "pragmatic" model approaching human performance on the task. Inspection reveals the end-to-end network learns representations similar to our models, suggesting they reflect emergent properties of the data. Our work thus provides insight into the information structure of naturalistic linguistic feedback as well as methods to leverage it for reinforcement learning.
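
The step of regressing sentiment on features can be illustrated with a small sketch. The abstract does not specify the regression model, so this assumes the simplest case: each piece of feedback has already been grounded to a numeric feature vector and a scalar sentiment, and the latent reward is linear in the features. Under those assumptions, ordinary least squares recovers the reward weights. All names and data below are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def infer_reward_weights(features, sentiments):
    """Least-squares weights w minimizing sum((f . w - s)^2) via normal equations."""
    d = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(d)] for i in range(d)]
    xty = [sum(f[i] * s for f, s in zip(features, sentiments)) for i in range(d)]
    return solve_linear(xtx, xty)


if __name__ == "__main__":
    # Hypothetical grounded feedback: each row counts two features that an
    # utterance referred to; each sentiment is the teacher's scalar valence.
    features = [[1, 0], [0, 1], [1, 1], [2, 1]]
    sentiments = [1.0, -0.5, 0.5, 1.5]
    print(infer_reward_weights(features, sentiments))
```

The recovered weights act as an estimate of how much the teacher values each feature. The sketch assumes the features are not collinear, otherwise the normal equations have no unique solution.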

learner, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2009.14715

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
