AITopics

arXiv.org Artificial Intelligence, May-20-2020

A reinforcement learning based decision support system in textile manufacturing process

He, Zhenglei, Tran, Kim Phuc, Thomassey, Sébastien, Zeng, Xianyi, Yi, Changhai

This paper introduces a reinforcement learning based decision support system for the textile manufacturing process. A solution optimization problem of color fading ozonation is discussed and set up as a Markov Decision Process (MDP) in terms of the tuple {S, A, P, R}. Q-learning is used to train an agent through interaction with this environment by accumulating the reward R. The application results show that the proposed MDP model expresses the optimization problem discussed in this paper well. Reinforcement learning is therefore shown to be applicable to decision support in this sector, with promising prospects.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2005.09867

Country:

Asia > China > Hubei Province > Wuhan (0.05)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.65)

Industry: Textiles, Apparel & Luxury Goods (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
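
The abstract above frames its problem as an MDP {S, A, P, R} and trains an agent with Q-learning. As a rough illustration of that setup only, here is a minimal tabular Q-learning sketch. The five-state chain environment, the reward of 1 at the last state, and all hyperparameters are assumptions made for this example; they are not the paper's ozonation model or its settings.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain MDP (illustration only).

    S: states 0..n_states-1, with the last state terminal.
    A: action 0 moves left, action 1 moves right.
    P: deterministic transitions.
    R: 1.0 on reaching the last state, 0.0 otherwise.
    """
    rng = random.Random(seed)
    n_actions = 2
    goal = n_states - 1
    q = [[0.0] * n_actions for _ in range(n_states)]

    def step(state, action):
        # Deterministic transition, clipped to the ends of the chain.
        nxt = min(state + 1, goal) if action == 1 else max(state - 1, 0)
        reward = 1.0 if nxt == goal else 0.0
        return nxt, reward, nxt == goal

    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a uniformly random action.
                action = rng.randrange(n_actions)
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best next-state value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


if __name__ == "__main__":
    table = q_learning()
    # Greedy action per non-terminal state (1 means "move right").
    print([max(range(2), key=lambda a: row[a]) for row in table[:-1]])
```

The update bootstraps from the greedy value of the next state regardless of which action the agent actually takes next, which is what makes Q-learning off-policy.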

Etheve, Marc, Alès, Zacharie, Bissuel, Côme, Juan, Olivier, Kedad-Sidhoum, Safia

Reinforcement Learning for Variable Selection in a Branch and Bound Algorithm

arXiv.org Machine Learning, May-20-2020

Mixed integer linear programs are commonly solved by Branch and Bound algorithms. A key factor in the efficiency of the most successful commercial solvers is their fine-tuned heuristics. In this paper, we leverage patterns in real-world instances to learn from scratch a new branching strategy optimised for a given problem and compare it with a commercial solver. We propose FMSTS, a novel Reinforcement Learning approach specifically designed for this task. The strength of our method lies in the consistency between a local value function and a global metric of interest. In addition, we provide insights for adapting known RL techniques to the Branch and Bound setting, and present a new neural network architecture inspired by the literature. To our knowledge, it is the first time Reinforcement Learning has been used to fully optimise the branching strategy. Computational experiments show that our method is appropriate and able to generalise well to new instances.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2005.10026

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.64)

Industry: Energy > Power Industry > Utilities (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine Learning, May-20-2020

Deep Reinforcement Learning for High Level Character Control

Souza, Caio, Velho, Luiz

In this paper, we propose the use of traditional animations, heuristic behavior and reinforcement learning in the creation of intelligent characters for computational media. Traditional animation and heuristics give artistic control over the behavior, while reinforcement learning adds generalization. The use case presented is a dog character with a high-level controller in a 3D environment which is built around the desired behaviors to be learned, such as fetching an item. As the development of the environment is key to learning, further analysis is conducted of how to build those learning environments, the effects of environment and agent modeling choices, training procedures and generalization of the learned behavior. This analysis builds insight into the aforementioned factors and may serve as a guide in the development of environments in general.

action space, agent, information, (17 more...)

doi: 10.1007/978-3-030-80126-7_49

2005.10391

Country: South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)

Genre: Research Report (0.40)

Industry:

Education (0.49)
Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

#artificialintelligence, May-19-2020, 02:26:20 GMT

What You Need to Know About Deep Reinforcement Learning - KDnuggets

It is useful, for the forthcoming discussion, to have a better understanding of some key terms used in RL. Agent: A software/hardware mechanism which takes certain actions depending on its interaction with the surrounding environment; for example, a drone making a delivery, or Super Mario navigating a video game. The algorithm is the agent. Action: An action is one of all the possible moves the agent can make. An action is almost self-explanatory, but it should be noted that agents usually choose from a list of discrete possible actions.

computer game, deep learning, neural network, (21 more...)

Industry:

Energy > Oil & Gas (1.00)
Leisure & Entertainment > Games > Computer Games (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
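
To make the "agent" and "action" terms from the KDnuggets excerpt above concrete, the following is a small sketch of an agent that chooses from a discrete list of actions. The action names and the `RandomAgent` class are hypothetical, invented for this illustration, and the agent simply picks uniformly at random rather than learning anything.

```python
import random
from dataclasses import dataclass, field

# Hypothetical discrete action list, loosely in the spirit of a platform game.
ACTIONS = ["left", "right", "jump"]


@dataclass
class RandomAgent:
    """An agent that ignores its observation and picks a random action."""
    actions: list
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def act(self, observation):
        # A learning agent would use the observation here; this one does not.
        return self.rng.choice(self.actions)


if __name__ == "__main__":
    agent = RandomAgent(ACTIONS)
    # One action per time step, for three steps with a dummy observation.
    print([agent.act(observation=None) for _ in range(3)])
```

A learned policy would replace the body of `act` with a rule that depends on the observation, while the discrete action list stays the same.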

Luo, Hongyin, Li, Shang-Wen, Glass, James

Prototypical Q Networks for Automatic Conversational Diagnosis and Few-Shot New Disease Adaption

Spoken dialog systems have seen applications in many domains, including the medical domain, where they are used for automatic conversational diagnosis. State-of-the-art dialog managers are usually driven by deep reinforcement learning models, such as deep Q networks (DQNs), which learn by interacting with a simulator to explore the entire action space, since real conversations are limited. However, DQN-based automatic diagnosis models do not achieve satisfactory performance when adapted to new, unseen diseases with only a few training samples. In this work, we propose Prototypical Q Networks (ProtoQN) as the dialog manager for automatic diagnosis systems. The model calculates prototype embeddings from real conversations between doctors and patients, learning from them and from simulator-augmented dialogs more efficiently. We create both supervised and few-shot learning tasks with the Muzhi corpus. Experiments show that ProtoQN significantly outperforms the baseline DQN model in both supervised and few-shot learning scenarios, and achieves state-of-the-art few-shot learning performance.

machine learning, natural language, reinforcement learning, (18 more...)

2005.11153

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Kamienny, Pierre-Alexandre, Arulkumaran, Kai, Behbahani, Feryal, Boehmer, Wendelin, Whiteson, Shimon

Privileged Information Dropout in Reinforcement Learning

Using privileged information during training can improve the sample efficiency and performance of machine learning systems. This paradigm has been applied to reinforcement learning (RL), primarily in the form of distillation or auxiliary tasks, and less commonly in the form of augmenting the inputs of agents. In this work, we investigate Privileged Information Dropout (PI-Dropout) for achieving the latter, which can be applied equally to value-based and policy-based RL algorithms. Within a simple partially-observed environment, we demonstrate that PI-Dropout outperforms alternatives for leveraging privileged information, including distillation and auxiliary tasks, and can successfully utilise different types of privileged information. Finally, we analyse its effect on the learned representations.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2005.0922

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)
North America > United States > Virginia (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Geeraerts, Gilles, Guha, Shibashis, Pérez, Guillermo A., Raskin, Jean-François

Safe Learning for Near Optimal Scheduling

In this paper, we investigate the combination of synthesis techniques and learning techniques to obtain safe and near optimal schedulers for a preemptible task scheduling problem. We study both model-based learning techniques with PAC guarantees and model-free learning techniques based on shielded deep Q-learning. The new learning algorithms have been implemented to conduct experimental evaluations.

machine learning, reinforcement learning, task system, (19 more...)

2005.09253

Country: Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Experience Augmentation: Boosting and Accelerating Off-Policy Multi-Agent Reinforcement Learning

Ye, Zhenhui, Chen, Yining, Song, Guanghua, Yang, Bowei, Fan, Shen

Exploration of the high-dimensional state-action space is one of the biggest challenges in Reinforcement Learning (RL), especially in the multi-agent domain. We present a novel technique called Experience Augmentation, which enables time-efficient and boosted learning based on a fast, fair and thorough exploration of the environment. It can be combined with arbitrary off-policy MARL algorithms and is applicable to either homogeneous or heterogeneous environments. We demonstrate our approach by combining it with MADDPG and verifying the performance in two homogeneous environments and one heterogeneous environment. In the best performing scenario, MADDPG with experience augmentation reaches the convergence reward of vanilla MADDPG in 1/4 of the real time, and its converged reward beats the original model by a significant margin. Our ablation studies show that experience augmentation is a crucial ingredient which accelerates the training process and boosts the convergence.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2005.09453

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > Promising Solution (0.54)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Survey of Reinforcement Learning Algorithms for Dynamically Varying Environments

Padakandla, Sindhu

Reinforcement learning (RL) algorithms find applications in inventory control, recommender systems, vehicular traffic management, cloud computing and robotics. The real-world complications of many tasks arising in these domains make them difficult to solve with the basic assumptions underlying classical RL algorithms. RL agents in these applications often need to react and adapt to changing operating conditions. A significant part of research on single-agent RL techniques focuses on developing algorithms when the underlying assumption of a stationary environment model is relaxed. This paper provides a survey of RL methods developed for handling dynamically varying environment models. The goal of methods not limited by the stationarity assumption is to help autonomous agents adapt to varying operating conditions. This is possible either by minimizing the rewards lost during learning by the RL agent or by finding a suitable policy for the RL agent which leads to efficient operation of the underlying system. A representative collection of these algorithms is discussed in detail in this work, along with their categorization and their relative merits and demerits. We also review works which are tailored to application domains. Finally, we discuss future enhancements for this field.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2005.10619

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Overview (1.00)

Industry:

Health & Medicine (0.93)
Transportation > Infrastructure & Services (0.68)
Transportation > Ground > Road (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)