AITopics

There have been many studies on automated test generation for mobile Graphical User Interface (GUI) applications. These studies successfully demonstrate how to detect fatal exceptions and achieve high code and activity coverage with fully automated test generation engines. However, it is unclear how many GUI functions these engines manage to test. Furthermore, these engines implement only implicit test oracles. We propose Fully Automated Reinforcement LEArning-Driven Specification-Based Test Generator for Android (FARLEAD-Android). FARLEAD-Android accepts a GUI-level formal specification as a Linear-time Temporal Logic (LTL) formula. By dynamically executing the Application Under Test (AUT), it learns how to generate a test that satisfies the LTL formula using Reinforcement Learning (RL). The LTL formula does not just guide the test generation but also acts as a specified test oracle, enabling the developer to define automated test oracles for a wide variety of GUI functions by changing the formula. Our evaluation shows that FARLEAD-Android is more effective and achieves higher performance in generating tests for specified GUI functions than three known approaches, Random, Monkey, and QBEa. To the best of our knowledge, FARLEAD-Android is the first fully automated mobile GUI testing engine that uses formal specifications.

application, farlead-android, specification, (13 more...)

1911.05403

Country:

Europe > Austria > Vienna (0.14)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(2 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.48)
Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (0.68)
Transportation (0.46)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Jain, Vishal, Fedus, William, Larochelle, Hugo, Precup, Doina, Bellemare, Marc G.

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction

Text-based games are a natural challenge domain for deep reinforcement learning algorithms. Their state and action spaces are combinatorially large, their reward function is sparse, and they are partially observable: the agent is informed of the consequences of its actions through textual feedback. In this paper we emphasize this latter point and consider the design of a deep reinforcement learning agent that can play from feedback alone. Our design recognizes and takes advantage of the structural characteristics of text-based games. We first propose a contextualisation mechanism, based on accumulated reward, which simplifies the learning problem and mitigates partial observability. We then study different methods that rely on the notion that most actions are ineffectual in any given situation, following Zahavy et al.'s idea of an admissible action. We evaluate these techniques in a series of text-based games of increasing difficulty based on the TextWorld framework, as well as the iconic game Zork. Empirically, we find that these techniques improve the performance of a baseline deep reinforcement learning agent applied to text-based games.

agent, score contextualisation, subtask, (12 more...)

1911.12511

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Games (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Leurent, Edouard, Mercat, Jean

Social Attention for Autonomous Decision-Making in Dense Traffic

arXiv.org Machine LearningNov-27-2019

We study the design of learning architectures for behavioural planning in a dense traffic setting. Such architectures should deal with a varying number of nearby vehicles, be invariant to the ordering chosen to describe them, while staying accurate and compact. We observe that the two most popular representations in the literature do not fit these criteria, and perform badly on an complex negotiation task. We propose an attention-based architecture that satisfies all these properties and explicitly accounts for the existing interactions between the traffic participants. We show that this architecture leads to significant performance gains, and is able to capture interactions patterns that can be visualised and qualitatively interpreted. Videos and code are available at https://eleurent.github.io/social-attention/.

international conference, representation, vehicle, (14 more...)

arXiv.org Machine Learning

1911.1225

Country:

North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
North America > Canada (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.82)

Industry:

Automobiles & Trucks (0.95)
Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)

Atkinson, Craig, McCane, Brendan, Szymanski, Lech, Robins, Anthony

GRIm-RePR: Prioritising Generating Important Features for Pseudo-Rehearsal

arXiv.org Machine LearningNov-27-2019

Pseudo-rehearsal allows neural networks to learn a sequence of tasks without forgetting how to perform in earlier tasks. Preventing forgetting is achieved by introducing a generative network which can produce data from previously seen tasks so that it can be rehearsed along side learning the new task. This has been found to be effective in both supervised and reinforcement learning. Our current work aims to further prevent forgetting by encouraging the generator to accurately generate features important for task retention. More specifically, the generator is improved by introducing a second discriminator into the Generative Adversarial Network which learns to classify between real and fake items from the intermediate activation patterns that they produce when fed through a continual learning agent. Using Atari 2600 games, we experimentally find that improving the generator can considerably reduce catastrophic forgetting compared to the standard pseudo-rehearsal methods used in deep reinforcement learning. Furthermore, we propose normalising the Q-values taught to the long-term system as we observe this substantially reduces catastrophic forgetting by minimising the interference between tasks' reward functions.

generator, information, long-term system, (14 more...)

arXiv.org Machine Learning

1911.11988

Country:

North America > United States (0.14)
Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
Oceania > Australia > Queensland (0.04)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ak, Serkan, Bruggenwirth, Stefan

Avoiding Jammers: A Reinforcement Learning Approach

arXiv.org Machine LearningNov-27-2019

This paper investigates the anti-jamming performance of a cognitive radar under a partially observable Markov decision process (POMDP) model. First, we obtain an explicit expression for uncertainty of jammer dynamics, which paves the way for illuminating the performance metric of probability of being jammed for the radar beyond a conventional signal-to-noise ratio ($\mathsf{SNR}$) based analysis. Considering two frequency hopping strategies developed in the framework of reinforcement learning (RL), this performance metric is analyzed with deep Q-network (DQN) and long short term memory (LSTM) networks under various uncertainty values. Finally, the requirement of the target network in the RL algorithm for both network architectures is replaced with a softmax operator. Simulation results show that this operator improves upon the performance of the traditional target network.

jammer, probability, radar, (17 more...)

arXiv.org Machine Learning

1911.08874

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Monterey County > Pacific Grove (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Telecommunications (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Baranchuk, Dmitry, Babenko, Artem

Towards Similarity Graphs Constructed by Deep Reinforcement Learning

Similarity graphs are an active research direction for the nearest neighbor search (NNS) problem. New algorithms for similarity graph construction are continuously being proposed and analyzed by both theoreticians and practitioners. However, existing construction algorithms are mostly based on heuristics and do not explicitly maximize the target performance measure, i.e., search recall. Therefore, at the moment it is not clear whether the performance of similarity graphs has plateaued or more effective graphs can be constructed with more theoretically grounded methods. In this paper, we introduce a new principled algorithm, based on adjacency matrix optimization, which explicitly maximizes search efficiency. Namely, we propose a probabilistic model of a similarity graph defined in terms of its edge probabilities and show how to learn these probabilities from data as a reinforcement learning task. As confirmed by experiments, the proposed construction method can be used to refine the state-of-the-art similarity graphs, achieving higher recall rates for the same number of distance computations. Furthermore, we analyze the learned graphs and reveal the structural properties that are responsible for more efficient search.

algorithm, graph, vertex, (14 more...)

1911.12122

Country:

Asia > Russia (0.04)
North America > United States (0.04)
North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Eghtesad, Taha, Vorobeychik, Yevgeniy, Laszka, Aron

Deep Reinforcement Learning based Adaptive Moving Target Defense

Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system as well as the adversary's observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose finding optimal MTD strategies using deep reinforcement learning. Based on an established model of adaptive MTD, we formulate finding an MTD strategy as finding a policy for a partially-observable Markov decision process. To significantly improve training performance, we introduce compact memory representations. To demonstrate our approach, we provide thorough numerical results, showing significant improvement over existing strategies.

adversary, defender, server, (12 more...)

1911.11972

Country: North America > United States > Texas > Harris County > Houston (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Context-aware Active Multi-Step Reinforcement Learning

Chen, Gang, Li, Dingcheng, Xu, Ran

Reinforcement learning has attracted great attention recently, especially policy gradient algorithms, which have been demonstrated on challenging decision making and control tasks. In this paper, we propose an active multi-step TD algorithm with adaptive stepsizes to learn actor and critic. Specifically, our model consists of two components: active stepsize learning and adaptive multi-step TD algorithm. Firstly, we divide the time horizon into chunks and actively select state and action inside each chunk. Then given the selected samples, we propose the adaptive multi-step TD, which generalizes TD($\lambda$), but adaptively switch on/off the backups from future returns of different steps. Particularly, the adaptive multi-step TD introduces a context-aware mechanism, here a binary classifier, which decides whether or not to turn on its future backups based on the context changes. Thus, our model is kind of combination of active learning and multi-step TD algorithm, which has the capacity for learning off-policy without the need of importance sampling. We evaluate our approach on both discrete and continuous space tasks in an off-policy setting respectively, and demonstrate competitive results compared to other reinforcement learning baselines.

algorithm, learning, reinforcement, (15 more...)

1911.04107

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

#artificialintelligenceNov-26-2019, 23:11:52 GMT

Tiny alterations in training data can introduce "backdoors" into machine learning models

In TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents, a group of Boston University researchers demonstrate an attack on machine learning systems trained with "reinforcement learning" in which ML systems derive solutions to complex problems by iteratively trying multiple solutions. The attack is related to adversarial examples, a class of attacks that involve probing a machine-learning model to find "blind spots" -- very small changes (usually imperceptible to humans) that cause machine learning classifiers' accuracy to shelve off rapidly (for example, a small change to a model of a gun can make an otherwise reliable classifier think it's looking at a helicopter). It's not clear whether it's possible to create a machine learning model that's immune to adversarial examples (the expert I trust most on this told me off the record that they think it's not), but what the researchers behind Trojdrl propose is a method for deliberately introducing adversarial examples by slipping difficult-to-spot changes into training data, which will produce defects in the eventual model that can serve as a "backdoor" that future adversaries can exploit. Training data sets are often ad-hoc in nature; they're so large that it's hard to create version-by-version snapshots, and they're also so prone to mislabeling that researchers are always making changes to them in order to improve their accuracy. All of this suggests that poisoning training data might be easier than it sounds.

algorithm, backdoor, training data, (9 more...)

#artificialintelligence

Industry: Transportation (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

#artificialintelligenceNov-26-2019, 06:07:21 GMT

What Is Constrained Reinforcement Learning And How Can One Build Systems Around It

One of the most important innovations in the present era for the development of highly-advanced AI systems has been the introduction of Reinforcement Learning (RL). It has the potential to solve complex decision-making problems. It generally follows a "trial and error" method to learn optimal policies of a given problem. It has been used to achieve superhuman performance in competitive strategy games, including Go, Starcraft, Dota, among others. Despite the promise shown by reinforcement algorithms in many decision-making problems, there are few glitches and challenges, which still need to be addressed.

constrained reinforcement learning, reinforcement learning, safety gym, (12 more...)

#artificialintelligence

Country: North America > United States > Arizona > Maricopa County > Tempe (0.06)

Industry: Leisure & Entertainment > Games (0.57)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)