AITopics

1712.00948

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

#artificialintelligence Feb-28-2019, 23:35:59 GMT

New machine learning approach could give a big boost to the efficiency of optical networks

February 25, 2019, Optical Society of America. New work leveraging machine learning could increase the efficiency of optical telecommunications networks. As our world becomes increasingly interconnected, fiber optic cables offer the ability to transmit more data over longer distances compared to traditional copper wires. Optical Transport Networks (OTNs) have emerged as a solution for packaging data in fiber optic cables, and improvements stand to make them more cost-effective. A group of researchers from Universitat Politècnica de Catalunya in Barcelona and the telecom company Huawei have retooled an artificial intelligence technique used for chess and self-driving cars to make OTNs run more efficiently. They will present their research at the upcoming Optical Fiber Conference and Exposition, to be held 3-7 March in San Diego, California, USA.

machine learning, reinforcement, reinforcement learning, (18 more...)

#artificialintelligence

Country:

North America > United States > California > San Diego County > San Diego (0.25)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (0.91)
Telecommunications > Networks (0.61)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

Kolesnikov, Sergey, Hrinchuk, Oleksii

Catalyst.RL: A Distributed Framework for Reproducible RL Research

arXiv.org Machine Learning Feb-28-2019

Despite recent progress in the field of deep reinforcement learning (RL), and arguably because of it, a large body of work remains to be done in reproducing and carefully comparing different RL algorithms. We present catalyst.RL, an open source framework for RL research with a focus on reproducibility and flexibility. The main features of our library include large-scale asynchronous distributed training, easy-to-use configuration files with the complete list of hyperparameters for each experiment, and efficient implementations of various RL algorithms and auxiliary tricks, such as frame stacking, n-step returns, and value distributions. To demonstrate the usefulness of our framework, we evaluate it on a range of continuous control benchmarks, as well as on the task of developing a controller that enables a physiologically-based human model with a prosthetic leg to walk and run. The latter task was introduced at the NeurIPS 2018 AI for Prosthetics Challenge, where our team took 3rd place, capitalizing on the ability of catalyst.RL to train high-quality and sample-efficient RL agents.
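One of the auxiliary tricks named in the abstract, n-step returns, can be illustrated with a minimal sketch. This is not taken from catalyst.RL; the function name `n_step_returns` and its parameters are hypothetical, and the code assumes a single episode stored as plain Python lists.

```python
def n_step_returns(rewards, values, gamma=0.99, n=3):
    """Compute an n-step bootstrapped target for every time step.

    rewards[t] is the reward received after acting at step t.
    values has len(rewards) + 1 entries: values[t] is a value estimate
    for the state at step t, and the last entry is the estimate for the
    state after the final step (use 0.0 if the episode terminated).
    """
    horizon = len(rewards)
    targets = []
    for t in range(horizon):
        # Look ahead at most n steps, truncated at the end of the episode.
        end = min(t + n, horizon)
        g = 0.0
        for k in range(t, end):
            g += (gamma ** (k - t)) * rewards[k]
        # Bootstrap from the value estimate of the state reached at `end`.
        g += (gamma ** (end - t)) * values[end]
        targets.append(g)
    return targets
```

With `n=1` this reduces to the usual one-step temporal-difference target; larger `n` relies more on observed rewards and less on the value estimate.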

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1903.00027

Country: Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games (0.94)
Materials > Chemicals > Specialty Chemicals (0.85)
Health & Medicine (0.75)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Huang, Jiayi, Patwary, Mostofa, Diamos, Gregory

Coloring Big Graphs with AlphaGoZero

arXiv.org Artificial Intelligence Feb-28-2019

We show that recent innovations in deep reinforcement learning can effectively color very large graphs, a well-known NP-hard problem with clear commercial applications. Because the Monte Carlo Tree Search with Upper Confidence Bound algorithm used in AlphaGoZero can improve the performance of a given heuristic, our approach allows deep neural networks trained using high performance computing (HPC) technologies to transform computation into improved heuristics with zero prior knowledge. Key to our approach is the introduction of a novel deep neural network architecture (FastColorNet) that has access to the full graph context and requires $O(V)$ time and space to color a graph with $V$ vertices, which enables scaling to very large graphs that arise in real applications like parallel computing, compilers, numerical solvers, and design automation, among others. As a result, we are able to learn new state-of-the-art heuristics for graph coloring.
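The upper-confidence-bound selection step that the abstract refers to can be sketched as follows. This is an illustrative sketch of the PUCT-style rule commonly described for AlphaGo Zero, not the paper's code; the name `puct_select`, the list-based node layout, and the idea that each child corresponds to a candidate color for the next vertex are assumptions made here.

```python
import math


def puct_select(priors, visit_counts, total_values, c_puct=1.0):
    """Return the index of the child maximizing Q + U.

    priors[i]       -- prior probability for child i (e.g. from a network)
    visit_counts[i] -- number of times child i has been visited
    total_values[i] -- sum of values backed up through child i
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (p, n, w) in enumerate(zip(priors, visit_counts, total_values)):
        # Mean backed-up value; unvisited children default to 0.
        q = w / n if n > 0 else 0.0
        # Exploration bonus: large for high-prior, rarely visited children.
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)
        if q + u > best_score:
            best_index, best_score = i, q + u
    return best_index
```

The bonus term shrinks as a child accumulates visits, so the search gradually shifts from following the prior heuristic to following the values it has measured.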

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1902.10162

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Kiourti, Panagiota, Wardega, Kacper, Jha, Susmit, Li, Wenchao

TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents

arXiv.org Machine Learning Feb-28-2019

Recent work has identified that classification models implemented as neural networks are vulnerable to data-poisoning and Trojan attacks at training time. In this work, we show that these training-time vulnerabilities extend to deep reinforcement learning (DRL) agents and can be exploited by an adversary with access to the training process. In particular, we focus on Trojan attacks that augment the function of reinforcement learning policies with hidden behaviors. We demonstrate that such attacks can be implemented through minuscule data poisoning (as little as 0.025% of the training data) and in-band reward modification that does not affect the reward on normal inputs. The policies learned with our proposed attack approach perform almost indistinguishably from benign policies but deteriorate drastically when the Trojan is triggered, in both targeted and untargeted settings. Furthermore, we show that existing Trojan defense mechanisms for classification tasks are not effective in the reinforcement learning setting.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1903.06638

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Chourasia, Rishav, Singla, Adish

Unifying Ensemble Methods for Q-learning via Social Choice Theory

arXiv.org Artificial Intelligence Feb-27-2019

Ensemble methods have been widely applied in Reinforcement Learning (RL) in order to enhance stability, increase convergence speed, and improve exploration. These methods typically work by employing an aggregation mechanism over actions of different RL algorithms. We show that a variety of these methods can be unified by drawing parallels from committee voting rules in Social Choice Theory. We map the problem of designing an action aggregation mechanism in an ensemble method to a voting problem which, under different voting rules, yields popular ensemble-based RL algorithms like Majority Voting Q-learning or Bootstrapped Q-learning. Our unification framework, in turn, allows us to design new ensemble-RL algorithms with better performance. For instance, we map two diversity-centered committee voting rules, namely Single Non-Transferable Voting Rule and Chamberlin-Courant Rule, into new RL algorithms that demonstrate excellent exploratory behavior in our experiments.
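The action-aggregation idea can be made concrete with a minimal sketch of plurality (majority) voting over an ensemble of Q-functions. This is an illustration only, not the authors' implementation; the function names and the tie-breaking rule (lowest action index wins) are assumptions chosen here.

```python
from collections import Counter


def greedy_action(q_values):
    """Index of the highest Q-value (first index wins on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def majority_vote_action(ensemble_q_values):
    """Aggregate an ensemble by letting each member vote for its greedy action.

    ensemble_q_values is a list with one entry per ensemble member, each
    entry being that member's list of Q-values for the current state.
    """
    votes = Counter(greedy_action(q) for q in ensemble_q_values)
    top = max(votes.values())
    # Assumed tie-break: among the most-voted actions, pick the lowest index.
    return min(action for action, count in votes.items() if count == top)
```

For example, `majority_vote_action([[0.1, 0.9], [0.2, 0.8], [0.7, 0.3]])` selects action 1, since two of the three members prefer it. Swapping the vote-counting rule for a different committee voting rule is the kind of substitution the abstract describes.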

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1902.10646

Country:

North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Saarland > Saarbrücken (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Osogami, Takayuki, Takahashi, Toshihiro

Real-time tree search with pessimistic scenarios

arXiv.org Artificial Intelligence Feb-27-2019

Autonomous agents, such as self-driving cars and drones, need to make decisions in real time, which is particularly important but difficult in critical situations, for example to avoid collisions. Such decisions often need to be made sequentially to achieve the eventual goal (e.g., avoiding collisions and recovering to safe conditions), in a partially observable environment, and by taking into account how other agents behave. Towards the far-reaching goal of realizing such autonomous agents, we propose practical techniques for sequential decision making in real time and demonstrate their effectiveness in Pommerman, a multi-agent environment that was used in one of the competitions held at the Thirty-second Conference on Neural Information Processing Systems (NeurIPS 2018) on Dec. 8, 2018 (Resnick et al. [2018a]). The techniques that we propose in this paper were used in the Pommerman agents (HakozakiJunctions and dypm-final) that won first and third place in the competition. In Pommerman, a team of two agents competes against another team of two agents on a board of 11 × 11 grids (see Figure 1 (a) for an initial configuration of the board). Each agent can observe only a limited area of the board, and the agents cannot communicate with each other. The goal of a team is to knock down all of the opponents. Towards this goal, the agents place bombs to destroy wooden walls and collect power-up items that might appear from those wooden walls, while avoiding flames and attacking opponents. See Figure 1 (b) for an example of the board in the middle of the game.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1902.10870

Country: Asia > Japan (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Serrano, Chris R., Warren, Michael A.

Introspection Learning

arXiv.org Machine Learning Feb-27-2019

Traditional reinforcement learning agents learn from experience, past or present, gained through interaction with their environment. Our approach synthesizes experience, without requiring an agent to interact with its environment, by asking the policy directly "Are there situations X, Y, and Z, such that in these situations you would select actions A, B, and C?" In this paper we present Introspection Learning, an algorithm that allows these types of questions to be asked of neural network policies. Introspection Learning is agnostic to the choice of reinforcement learning algorithm, and the states returned may be used as an indicator of the health of the policy or to shape the policy in a myriad of ways. We demonstrate the usefulness of this algorithm both in the context of speeding up training and improving robustness with respect to safety constraints.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1902.10754

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning Feb-27-2019

Distributed Edge Caching via Reinforcement Learning in Fog Radio Access Networks

Lu, Liuyang, Jiang, Yanxiang, Bennis, Mehdi, Ding, Zhiguo, Zheng, Fu-Chun, You, Xiaohu

In this paper, the distributed edge caching problem in fog radio access networks (F-RANs) is investigated. By considering the unknown spatio-temporal content popularity and user preference, a user request model based on a hidden Markov process is proposed to characterize the fluctuating spatio-temporal traffic demands in F-RANs. Then, a Q-learning method based on the reinforcement learning (RL) framework is put forth to seek the optimal caching policy in a distributed manner, which enables fog access points (F-APs) to learn and track the potential dynamic process without extra communication cost. Furthermore, we propose a more efficient Q-learning method with value function approximation (Q-VFA-learning) to reduce complexity and accelerate convergence. Simulation results show that the performance of our proposed method is superior to that of the traditional methods.
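As a rough illustration of what Q-learning with value function approximation involves, the sketch below performs one update for a linear approximator. It is a generic textbook-style sketch and not the paper's Q-VFA-learning method; the feature vectors, function names, and default step sizes are all hypothetical.

```python
def q_value(weights, features):
    """Linear approximation: Q(s, a) is the dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def q_vfa_update(weights, features, reward, next_features_per_action,
                 alpha=0.1, gamma=0.9):
    """Return updated weights after one Q-learning step.

    features                 -- feature vector for the (state, action) taken
    next_features_per_action -- one feature vector per action available in
                                the next state (empty if terminal)
    """
    # Greedy value of the next state; 0.0 when there is no next action.
    best_next = max(
        (q_value(weights, f) for f in next_features_per_action),
        default=0.0,
    )
    td_error = reward + gamma * best_next - q_value(weights, features)
    # Gradient of a linear Q with respect to the weights is the feature vector.
    return [w + alpha * td_error * x for w, x in zip(weights, features)]
```

Storing one weight per feature rather than one table entry per state-action pair is what reduces memory and can speed up convergence when many states share features.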

artificial intelligence, machine learning, reinforcement learning, (18 more...)

1902.10574

Country: Asia > China > Jiangsu Province (0.14)

Genre: Research Report (0.70)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

arXiv.org Machine Learning Feb-26-2019

Diagnosing Bottlenecks in Deep Q-learning Algorithms

Fu, Justin, Kumar, Aviral, Soh, Matthew, Levine, Sergey

Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL). However, the behavior of Q-learning methods with function approximation is poorly understood, both theoretically and empirically. In this work, we aim to experimentally investigate potential issues in Q-learning, by means of a "unit testing" framework where we can utilize oracles to disentangle sources of error. Specifically, we investigate questions related to function approximation, sampling error, and nonstationarity, and, where available, verify whether trends found in oracle settings hold true with modern deep RL methods. We find that large neural network architectures have many benefits with regard to learning stability, offer several practical compensations for overfitting, and develop a novel sampling method based on explicitly compensating for function approximation error that yields fair improvement on high-dimensional continuous control domains.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

1902.10250

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)