AITopics

2102.07495

Country:

North America > United States > Texas (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Michigan (0.04)

Genre:

Overview (0.46)
Research Report (0.40)

Industry: Leisure & Entertainment > Games > Bridge (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Vlastelica, Marin, Rolínek, Michal, Martius, Georg

Neuro-algorithmic Policies enable Fast Combinatorial Generalization

arXiv.org Artificial IntelligenceFeb-15-2021

Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. Furthermore, we show that for a certain subclass of the MDP framework, this can be alleviated by neuro-algorithmic architectures. Many control problems require long-term planning that is hard to solve generically with neural networks alone. We introduce a neuro-algorithmic policy architecture consisting of a neural network and an embedded time-dependent shortest path solver. These policies can be trained end-to-end by blackbox differentiation. We show that this type of architecture generalizes well to unseen variations in the environment already after seeing a few examples.

algorithm, architecture, international conference, (13 more...)

2102.07456

Country:

Europe > Germany > Baden-Württemberg > Freiburg (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Machine LearningFeb-14-2021

Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

Wu, Yue, Zhou, Dongruo, Gu, Quanquan

We study reinforcement learning in an infinite-horizon average-reward setting with linear function approximation, where the transition probability function of the underlying Markov Decision Process (MDP) admits a linear form over a feature mapping of the current state, action, and next state. We propose a new algorithm UCRL2-VTR, which can be seen as an extension of the UCRL2 algorithm with linear function approximation. We show that UCRL2-VTR with Bernstein-type bonus can achieve a regret of $\tilde{O}(d\sqrt{DT})$, where $d$ is the dimension of the feature mapping, $T$ is the horizon, and $\sqrt{D}$ is the diameter of the MDP. We also prove a matching lower bound $\tilde{\Omega}(d\sqrt{DT})$, which suggests that the proposed UCRL2-VTR is minimax optimal up to logarithmic factors. To the best of our knowledge, our algorithm is the first nearly minimax optimal RL algorithm with function approximation in the infinite-horizon average-reward setting.

algorithm, inequality, mdp, (13 more...)

arXiv.org Machine Learning

2102.07301

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Cohen, Samuel, Kumar, K S Sesh, Deisenroth, Marc Peter

Sliced Multi-Marginal Optimal Transport

arXiv.org Machine LearningFeb-14-2021

We study multi-marginal optimal transport, a generalization of optimal transport that allows us to define discrepancies between multiple measures. It provides a framework to solve multi-task learning problems and to perform barycentric averaging. However, multi-marginal distances between multiple measures are typically challenging to compute because they require estimating a transport plan with $N^P$ variables. In this paper, we address this issue in the following way: 1) we efficiently solve the one-dimensional multi-marginal Monge-Wasserstein problem for a classical cost function in closed form, and 2) we propose a higher-dimensional multi-marginal discrepancy via slicing and study its generalized metric properties. We show that computing the sliced multi-marginal discrepancy is massively scalable for a large number of probability measures with support as large as $10^7$ samples. Our approach can be applied to solving problems such as barycentric averaging, multi-task density estimation and multi-task reinforcement learning.

dimension, multi-marginal distance, smw, (12 more...)

arXiv.org Machine Learning

2102.07115

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.35)

Singh, Jaskirat, Zheng, Liang

Sparse Attention Guided Dynamic Value Estimation for Single-Task Multi-Scene Reinforcement Learning

Training deep reinforcement learning agents on environments with multiple levels / scenes from the same task, has become essential for many applications aiming to achieve generalization and domain transfer from simulation to the real world. While such a strategy is helpful with generalization, the use of multiple scenes significantly increases the variance of samples collected for policy gradient computations. Current methods, effectively continue to view this collection of scenes as a single Markov decision process (MDP), and thus learn a scene-generic value function V(s). However, we argue that the sample variance for a multi-scene environment is best minimized by treating each scene as a distinct MDP, and then learning a joint value function V(s,M) dependent on both state s and MDP M. We further demonstrate that the true joint value function for a multi-scene environment, follows a multi-modal distribution which is not captured by traditional CNN / LSTM based critic networks. To this end, we propose a dynamic value estimation (DVE) technique, which approximates the true joint value function through a sparse attention mechanism over multiple value function hypothesis / modes. The resulting agent not only shows significant improvements in the final reward score across a range of OpenAI ProcGen environments, but also exhibits enhanced navigation efficiency and provides an implicit mechanism for unsupervised state-space skill decomposition.

arxiv preprint arxiv, dynamic value estimation, value function, (8 more...)

2102.07266

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Uprety, Aashma, Rawat, Danda B.

Reinforcement Learning for IoT Security: A Comprehensive Survey

The number of connected smart devices has been increasing exponentially for different Internet-of-Things (IoT) applications. Security has been a long run challenge in the IoT systems which has many attack vectors, security flaws and vulnerabilities. Securing billions of B connected devices in IoT is a must task to realize the full potential of IoT applications. Recently, researchers have proposed many security solutions for IoT. Machine learning has been proposed as one of the emerging solutions for IoT security and Reinforcement learning is gaining more popularity for securing IoT systems. Reinforcement learning, unlike other machine learning techniques, can learn the environment by having minimum information about the parameters to be learned. It solves the optimization problem by interacting with the environment adapting the parameters on the fly. In this paper, we present an comprehensive survey of different types of cyber-attacks against different IoT systems and then we present reinforcement learning and deep reinforcement learning based security solutions to combat those different types of attacks in different IoT systems. Furthermore, we present the Reinforcement learning for securing CPS systems (i.e., IoT with feedback and control) such as smart grid and smart transportation system. The recent important attacks and countermeasures using reinforcement learning B in IoT are also summarized in the form of tables. With this paper, readers can have a more thorough understanding of IoT security attacks and countermeasures using Reinforcement Learning, as well as research trends in this area.

attacker, information, reinforcement, (14 more...)

doi: 10.1109/JIOT.2020.3040957

2102.07247

Country:

North America > United States > District of Columbia > Washington (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Smart Houses & Appliances (1.00)
Information Technology > Security & Privacy (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Olowononi, Felix, Rawat, Danda B., Liu, Chunmei

Resilient Machine Learning for Networked Cyber Physical Systems: A Survey for Machine Learning Security to Securing Machine Learning for CPS

Cyber Physical Systems (CPS) are characterized by their ability to integrate the physical and information or cyber worlds. Their deployment in critical infrastructure have demonstrated a potential to transform the world. However, harnessing this potential is limited by their critical nature and the far reaching effects of cyber attacks on human, infrastructure and the environment. An attraction for cyber concerns in CPS rises from the process of sending information from sensors to actuators over the wireless communication medium, thereby widening the attack surface. Traditionally, CPS security has been investigated from the perspective of preventing intruders from gaining access to the system using cryptography and other access control techniques. Most research work have therefore focused on the detection of attacks in CPS. However, in a world of increasing adversaries, it is becoming more difficult to totally prevent CPS from adversarial attacks, hence the need to focus on making CPS resilient. Resilient CPS are designed to withstand disruptions and remain functional despite the operation of adversaries. One of the dominant methodologies explored for building resilient CPS is dependent on machine learning (ML) algorithms. However, rising from recent research in adversarial ML, we posit that ML algorithms for securing CPS must themselves be resilient. This paper is therefore aimed at comprehensively surveying the interactions between resilient CPS using ML and resilient ML when applied in CPS. The paper concludes with a number of research trends and promising future research directions. Furthermore, with this paper, readers can have a thorough understanding of recent advances on ML-based security and securing ML for CPS and countermeasures, as well as research trends in this active research area.

adversarial attack, adversarial example, algorithm, (15 more...)

doi: 10.1109/COMST.2020.3036778

2102.07244

Country:

Asia > China (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > Arizona (0.04)
(3 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Promising Solution (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.92)
Government > Military > Cyberwarfare (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Li, Bonnie, François-Lavet, Vincent, Doan, Thang, Pineau, Joelle

Domain Adversarial Reinforcement Learning

We consider the problem of generalization in reinforcement learning where visual aspects of the observations might differ, e.g. when there are different backgrounds or change in contrast, brightness, etc. We assume that our agent has access to only a few of the MDPs from the MDP distribution during training. The performance of the agent is then reported on new unknown test domains drawn from the distribution (e.g. unseen backgrounds). For this "zero-shot RL" task, we enforce invariance of the learned representations to visual domains via a domain adversarial optimization process. We empirically show that this approach allows achieving a significant generalization improvement to new unseen domains.

background, generalization, representation, (15 more...)

2102.07097

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

#artificialintelligenceFeb-13-2021, 05:26:13 GMT

Why Machine Learning For Machine Learning's Sake Is A Bad Idea?

Today, businesses are increasingly reliant on artificial intelligence and machine learning to solve critical problems. However, dealing with immense data complexities along with the pressure of having to provide rapid results, could be crippling. Most companies find building an ML-savvy framework quite overwhelming. In an engaging session at MLDS 2021, Sayanti Bhattacharya, Senior Manager, and Ashwin Pai, Manager at Ugam, a Merkle Company, addressed how businesses can apply machine learning to drive results. Machine learning has become such a fashion statement that, more often than not, businesses jump the gun by implementing ML in a hurry, defying logic.

application, bhattacharya, machine learning, (11 more...)

#artificialintelligence

Country: North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

arXiv.org Machine LearningFeb-13-2021

Model-free Representation Learning and Exploration in Low-rank MDPs

Modi, Aditya, Chen, Jinglin, Krishnamurthy, Akshay, Jiang, Nan, Agarwal, Alekh

The low rank MDP has emerged as an important model for studying representation learning and exploration in reinforcement learning. With a known representation, several model-free exploration strategies exist. In contrast, all algorithms for the unknown representation setting are model-based, thereby requiring the ability to model the full dynamics. In this work, we present the first model-free representation learning algorithms for low rank MDPs. The key algorithmic contribution is a new minimax representation learning objective, for which we provide variants with differing tradeoffs in their statistical and computational properties. We interleave this representation learning step with an exploration strategy to cover the state space in a reward-free manner. The resulting algorithms are provably sample efficient and can accommodate general function approximation to scale to complex environments.

log 2, probability, representation, (16 more...)

arXiv.org Machine Learning

2102.07035

Country:

North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (0.50)
Workflow (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)