AITopics | Undirected Networks

Dependability Analysis of Deep Reinforcement Learning based Robotics and Autonomous Systems

arXiv.org Artificial Intelligence, Sep-14-2021

While Deep Reinforcement Learning (DRL) provides transformational capabilities to the control of Robotics and Autonomous Systems (RAS), the black-box nature of DRL and the uncertain deployment environments of RAS pose new challenges to its dependability. Although many existing works impose constraints on the DRL policy to ensure successful completion of the mission, they are far from adequate for assessing a DRL-driven RAS holistically across all dependability properties. In this paper, we formally define a set of dependability properties in temporal logic and construct a Discrete-Time Markov Chain (DTMC) to model the dynamics of risk/failures of a DRL-driven RAS interacting with the stochastic environment. We then perform Probabilistic Model Checking on the designed DTMC to verify those properties. Our experimental results show that the proposed method is effective as a holistic assessment framework, while also uncovering conflicts between the properties that may need trade-offs in training. Moreover, we find that standard DRL training cannot improve dependability properties, thus requiring bespoke optimisation objectives concerning them. Finally, our method offers a novel dependability analysis for the Sim-to-Real challenge of DRL.
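As a rough illustration of the kind of question probabilistic model checking answers on a DTMC, the sketch below computes the probability of eventually reaching a failure state, and of reaching it within a bounded number of steps. The four states, the transition probabilities, and the function names are invented for illustration and are not taken from the paper; a real analysis would normally rely on a dedicated probabilistic model checker rather than hand-written iteration.

```python
# Hypothetical 4-state DTMC (states and probabilities are made up for illustration).
SAFE, RISKY, FAILED, DONE = range(4)
P = [
    [0.80, 0.10, 0.00, 0.10],  # safe
    [0.50, 0.30, 0.20, 0.00],  # risky
    [0.00, 0.00, 1.00, 0.00],  # failed (absorbing)
    [0.00, 0.00, 0.00, 1.00],  # done (absorbing)
]


def _step(P, x, target):
    """One backward step: x'[s] = 1 if s is the target, else sum_t P[s][t] * x[t]."""
    n = len(P)
    return [
        1.0 if s == target else sum(P[s][t] * x[t] for t in range(n))
        for s in range(n)
    ]


def bounded_reachability(P, target, steps):
    """Per start state, probability of reaching `target` within `steps` transitions."""
    x = [1.0 if s == target else 0.0 for s in range(len(P))]
    for _ in range(steps):
        x = _step(P, x, target)
    return x


def reachability(P, target, tol=1e-12, max_iter=100_000):
    """Per start state, probability of eventually reaching `target` (fixed-point iteration)."""
    x = [1.0 if s == target else 0.0 for s in range(len(P))]
    for _ in range(max_iter):
        new_x = _step(P, x, target)
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


if __name__ == "__main__":
    # Roughly the quantity behind a PCTL-style query "P=? [ F failed ]" from the safe state.
    p_fail = reachability(P, FAILED)[SAFE]
    print(f"P(eventually failed | start safe) = {p_fail:.4f}")  # exactly 2/9 for this toy chain
    print("satisfies 'failure probability <= 0.25':", p_fail <= 0.25)
```

A dependability property then becomes a threshold on such a probability, which is what makes conflicts between properties visible: lowering one reachability probability may raise another on the same chain.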

dependability property, dtmc, robot, (13 more...)

arXiv.org Artificial Intelligence

2109.06523

Country:

South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.68)
Transportation (0.66)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Deep hierarchical reinforcement agents for automated penetration testing

Tran, Khuong, Akella, Ashlesha, Standen, Maxwell, Kim, Junae, Bowman, David, Richer, Toby, Lin, Chin-Teng

arXiv.org Artificial Intelligence, Sep-14-2021

Penetration testing, the organised attack of a computer system in order to test existing defences, has been used extensively to evaluate network security. It is a time-consuming process that requires in-depth knowledge to establish a strategy resembling a real cyber-attack. This paper presents a novel deep reinforcement learning architecture with hierarchically structured agents, called HA-DRL, which employs an algebraic action decomposition strategy to address the large discrete action space of an autonomous penetration testing simulator, where the number of actions grows exponentially with the complexity of the designed cybersecurity network. The proposed architecture is shown to find the optimal attacking policy faster and more stably than a conventional deep Q-learning agent, which is commonly used to apply artificial intelligence in automatic penetration testing.

action space, agent, scenario, (17 more...)

arXiv.org Artificial Intelligence

2109.06449

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

On Solving a Stochastic Shortest-Path Markov Decision Process as Probabilistic Inference

Baioumy, Mohamed, Lacerda, Bruno, Duckworth, Paul, Hawes, Nick

arXiv.org Artificial Intelligence, Sep-13-2021

We propose solving the general Stochastic Shortest-Path Markov Decision Process (SSP MDP) as probabilistic inference. Furthermore, we discuss online and offline methods for planning under uncertainty. In an SSP MDP, the horizon is indefinite and unknown a priori. SSP MDPs generalize finite and infinite horizon MDPs and are widely used in the artificial intelligence community. Additionally, we highlight some of the differences between solving an MDP using dynamic programming approaches widely used in the artificial intelligence community and approaches used in the active inference community.
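For contrast with the inference-based view, the sketch below shows the dynamic programming baseline the abstract refers to: value iteration on a tiny SSP MDP with a cost-free absorbing goal. The three-state model, its costs and probabilities, and all function names are hypothetical and only serve the illustration; this is not the paper's method.

```python
# Hypothetical SSP MDP: transitions[state][action] = (cost, [(probability, next_state), ...]).
GOAL = "goal"
transitions = {
    "s0": {
        "slow": (1.0, [(1.0, "s1")]),
        "jump": (2.0, [(0.5, GOAL), (0.5, "s0")]),
    },
    "s1": {
        "go": (1.0, [(0.9, GOAL), (0.1, "s0")]),
    },
}


def q_value(cost, outcomes, V):
    """Expected cost-to-go of an action: immediate cost plus expected value of the successor."""
    return cost + sum(p * V[nxt] for p, nxt in outcomes)


def value_iteration(transitions, goal, tol=1e-10, max_iter=10_000):
    """In-place (Gauss-Seidel) value iteration minimising expected total cost to the goal."""
    V = {s: 0.0 for s in transitions}
    V[goal] = 0.0  # the goal is absorbing and cost-free
    for _ in range(max_iter):
        delta = 0.0
        for s, actions in transitions.items():
            best = min(q_value(c, o, V) for c, o in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: min(actions, key=lambda a: q_value(*actions[a], V))
        for s, actions in transitions.items()
    }
    return V, policy


if __name__ == "__main__":
    V, policy = value_iteration(transitions, GOAL)
    print(V, policy)
```

No horizon is fixed in advance here: the number of steps until the goal is random, which is the "indefinite horizon" property that distinguishes SSP MDPs from finite-horizon problems.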

artificial intelligence, inference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2109.05866

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.54)
Government > Military (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Recommendation Fairness: From Static to Dynamic

Zhang, Dell, Wang, Jun

arXiv.org Artificial Intelligence, Sep-13-2021

Driven by the need to capture users' evolving interests and optimize their long-term experiences, more and more recommender systems have started to model recommendation as a Markov decision process and employ reinforcement learning to address the problem. Shouldn't research on the fairness of recommender systems follow the same trend from static evaluation and one-shot intervention to dynamic monitoring and non-stop control? In this paper, we portray the recent developments in recommender systems first and then discuss how fairness could be baked into the reinforcement learning techniques for recommendation. Moreover, we argue that in order to make further progress in recommendation fairness, we may want to consider multi-agent (game-theoretic) optimization, multi-objective (Pareto) optimization, and simulation-based optimization, in the general framework of stochastic games.

computing machinery, new york, proceedings, (11 more...)

arXiv.org Artificial Intelligence

2109.0315

Country:

North America > United States > New York > New York County > New York City (0.11)
Europe > Netherlands > North Holland > Amsterdam (0.05)
Asia > Myanmar > Tanintharyi Region > Dawei (0.05)
(3 more...)

Genre:

Research Report (0.64)
Instructional Material (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Beginners Guide to Boltzmann Machine

#artificialintelligence, Sep-12-2021, 21:04:29 GMT

Deep learning implements structured machine learning algorithms by making use of artificial neural networks. These algorithms help the machine learn by itself and develop the ability to establish new parameters with which to make and execute decisions. Deep learning is considered a subset of machine learning and uses multi-layered artificial neural networks to carry out its processes, which enables it to deliver high accuracy in tasks such as speech recognition, object detection, language translation and other modern use cases. One of the most intriguing implementations in the domain of artificial intelligence for creating deep learning models has been the Boltzmann Machine. In this article, we will try to understand what exactly a Boltzmann Machine is, how it can be implemented, and what its uses are.

boltzmann machine, neural network, restricted boltzmann machine, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.99)

Concave Utility Reinforcement Learning with Zero-Constraint Violations

Agarwal, Mridul, Bai, Qinbo, Aggarwal, Vaneet

arXiv.org Artificial Intelligence, Sep-12-2021

We consider the problem of tabular infinite horizon concave utility reinforcement learning (CURL) with convex constraints. Various learning applications with constraints, such as robotics, do not allow for policies that can violate constraints. To this end, we propose a model-based learning algorithm that achieves zero constraint violations. To obtain this result, we assume that the concave objective and the convex constraints have a solution interior to the set of feasible occupation measures. We then solve a tighter optimization problem to ensure that the constraints are never violated despite the imprecise model knowledge and model stochasticity. We also propose a novel Bellman error based analysis for tabular infinite-horizon setups, which allows stochastic policies to be analysed. Combining the Bellman error based analysis and the tighter optimization equation, for $T$ interactions with the environment, we obtain a regret guarantee for the objective which grows as $\tilde{O}(1/\sqrt{T})$, excluding other factors.

algorithm, constraint, equation, (14 more...)

arXiv.org Artificial Intelligence

2109.05439

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Improved Algorithms for Misspecified Linear Markov Decision Processes

Vial, Daniel, Parulekar, Advait, Shakkottai, Sanjay, Srikant, R.

arXiv.org Machine Learning, Sep-12-2021

Due to the large (possibly infinite) state spaces of modern reinforcement learning applications, practical algorithms must generalize across states. To understand generalization on a theoretical level, recent work has studied linear Markov decision processes (LMDPs), among other models (see Section 1.2 for related work). The LMDP model assumes the next-state distribution and reward are linear in known d-dimensional features, which enables tractable generalization when d is small. Of course, this linear assumption most likely fails in practice, which motivates the misspecified LMDP (MLMDP) model.

algorithm, algorithm 1, mis, (15 more...)

arXiv.org Machine Learning

2109.05546

Country:

North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)

AdaK-NER: An Adaptive Top-K Approach for Named Entity Recognition with Incomplete Annotations

Ruan, Hongtao, Zheng, Liying, Hu, Peixian, Xu, Liang, Xiao, Jing

arXiv.org Artificial Intelligence, Sep-11-2021

State-of-the-art Named Entity Recognition (NER) models rely heavily on large amounts of fully annotated training data. However, accessible data are often incompletely annotated since the annotators usually lack comprehensive knowledge in the target domain. Normally the unannotated tokens are regarded as non-entities by default, while we underline that these tokens could either be non-entities or part of any entity. Here, we study NER modeling with incompletely annotated data where only a fraction of the named entities are labeled, and the unlabeled tokens are equivalently multi-labeled by every possible label. Taking multi-labeled tokens into account, the numerous possible paths can distract the training model from the gold path (ground truth label sequence), and thus hinder the learning ability. In this paper, we propose AdaK-NER, named the adaptive top-K approach, to help the model focus on a smaller feasible region where the gold path is more likely to be located. We demonstrate the superiority of our approach through extensive experiments on both English and Chinese datasets, improving on average by 2% in F-score on CoNLL-2003 and by over 10% on two Chinese datasets compared with the prior state-of-the-art works.

dataset, possible path, probability, (15 more...)

arXiv.org Artificial Intelligence

2109.05233

Country:

North America > Barbados (0.04)
Asia > Middle East > Bahrain (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Simultaneous Perception-Action Design via Invariant Finite Belief Sets

Hibbard, Michael, Tanaka, Takashi, Topcu, Ufuk

arXiv.org Artificial Intelligence, Sep-10-2021

Although perception is an increasingly dominant portion of the overall computational cost for autonomous systems, only a fraction of the information perceived is likely to be relevant to the current task. To alleviate these perception costs, we develop a novel simultaneous perception-action design framework wherein an agent senses only the task-relevant information. This formulation differs from that of a partially observable Markov decision process, since the agent is free to synthesize not only its policy for action selection but also its belief-dependent observation function. The method enables the agent to balance its perception costs with those incurred by operating in its environment. To obtain a computationally tractable solution, we approximate the value function using a novel method of invariant finite belief sets, wherein the agent acts exclusively on a finite subset of the continuous belief space. We solve the approximate problem through value iteration in which a linear program is solved individually for each belief state in the set, in each iteration. Finally, we prove that the value functions, under an assumption on their structure, converge to their continuous state-space values as the sample density increases.

agent, belief state, information, (16 more...)

arXiv.org Artificial Intelligence

2109.05073

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.70)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Event-Based Communication in Multi-Agent Distributed Q-Learning

Ornia, Daniel Jarne, Mazo, Manuel Jr

arXiv.org Artificial Intelligence, Sep-9-2021

We present in this work an approach, inspired by Event Triggered Control (ETC) techniques, to reduce the communication of information needed in a multi-agent learning system. We consider a baseline scenario of a distributed Q-learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents explore the MDP and communicate experiences only when necessary to a central learner, which performs updates of the actor Q functions. We analyse the convergence guarantees retained with respect to a regular Q-learning algorithm, and present experimental results showing that event-based communication results in a substantial reduction of data transmission rates in such distributed systems. Additionally, we discuss what effects (desired and undesired) these event-based approaches have on the learning processes studied, and how they can be applied to more complex multi-agent learning systems.

agent, mdp, reinforcement learning, (9 more...)

arXiv.org Artificial Intelligence

2109.01417

Country:

Europe > Netherlands > South Holland > Delft (0.05)
North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
