 Markov Models


Optimal Attacks on Reinforcement Learning Policies

arXiv.org Machine Learning

Control policies trained using Deep Reinforcement Learning have recently been shown to be vulnerable to adversarial attacks that introduce even very small perturbations to the policy input. The attacks proposed so far have been designed using heuristics, and build on existing adversarial example crafting techniques used to dupe classifiers in supervised learning. In contrast, this paper investigates the problem of devising optimal attacks, depending on a well-defined attacker's objective, e.g., to minimize the main agent's average reward. When the policy and the system dynamics, as well as rewards, are known to the attacker, a scenario referred to as a white-box attack, designing optimal attacks amounts to solving a Markov Decision Process. For what we call black-box attacks, where neither the policy nor the system is known, optimal attacks can be trained using Reinforcement Learning techniques. Through numerical experiments, we demonstrate the efficiency of our attacks compared to existing attacks (usually based on gradient methods). We further quantify the potential impact of attacks and establish its connection to the smoothness of the policy under attack. Smooth policies are naturally less prone to attacks (this explains why policies that are Lipschitz with respect to the state are more resilient). Finally, we show that from the main agent's perspective, the system uncertainties and the attacker can be modeled as a Partially Observable Markov Decision Process. We demonstrate that using Reinforcement Learning techniques tailored to POMDPs (e.g., using Recurrent Neural Networks) leads to more resilient policies.
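To make the white-box claim concrete, here is a minimal sketch of an attack posed as a Markov Decision Process and solved by value iteration. Everything in it is an illustrative assumption rather than the paper's setup: the five-state line world, the victim policy, the perturbation budget `EPS`, and the discount `GAMMA` are all hypothetical. The attacker's action is the observation it shows to the victim, and its objective is to minimise the victim's discounted return.

```python
# Toy white-box attack: all names and numbers here are illustrative assumptions.
N_STATES, GOAL, GAMMA, EPS = 5, 2, 0.9, 1

def victim_policy(obs):
    """Known, fixed policy under attack: step toward GOAL from the observed state."""
    return (obs < GOAL) - (obs > GOAL)  # +1, 0 or -1

def step(state, action):
    """Known deterministic dynamics; the victim earns 1 for landing on GOAL."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, 1.0 if next_state == GOAL else 0.0

def allowed_observations(state):
    """Attacker actions: observations within EPS of the true state."""
    return [o for o in range(state - EPS, state + EPS + 1) if 0 <= o < N_STATES]

def q_value(state, obs, values):
    """Victim's discounted return if the attacker shows `obs` in `state`."""
    next_state, reward = step(state, victim_policy(obs))
    return reward + GAMMA * values[next_state]

def solve_attack(n_iters=200):
    """Value iteration on the attacker's MDP (minimise the victim's return)."""
    values = [0.0] * N_STATES
    for _ in range(n_iters):
        values = [min(q_value(s, o, values) for o in allowed_observations(s))
                  for s in range(N_STATES)]
    attack = {s: min(allowed_observations(s), key=lambda o: q_value(s, o, values))
              for s in range(N_STATES)}
    return values, attack

def evaluate(attack, n_iters=200):
    """Victim's discounted return under a fixed observation map `attack`."""
    values = [0.0] * N_STATES
    for _ in range(n_iters):
        values = [q_value(s, attack[s], values) for s in range(N_STATES)]
    return values

attacked_values, attack = solve_attack()
clean_values = evaluate({s: s for s in range(N_STATES)})  # no perturbation
print(clean_values[GOAL], attacked_values[GOAL], attack)
```

In this toy world a one-step perturbation of the observation is enough to keep the victim off the goal indefinitely, which illustrates why a policy that changes abruptly between neighbouring states is easy to attack.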


Bridging Commonsense Reasoning and Probabilistic Planning via a Probabilistic Action Language

arXiv.org Artificial Intelligence

To be responsive to dynamically changing real-world environments, an intelligent agent needs to perform complex sequential decision-making tasks that are often guided by commonsense knowledge. Previous work in this line of research led to the framework called "interleaved commonsense reasoning and probabilistic planning" (icorpp), which used P-log for representing commonsense knowledge and Markov Decision Processes (MDPs) or Partially Observable MDPs (POMDPs) for planning under uncertainty. A main limitation of icorpp is that its implementation requires non-trivial engineering efforts to bridge the commonsense reasoning and probabilistic planning formalisms. In this paper, we present a unified framework to integrate icorpp's reasoning and planning components. In particular, we extend the probabilistic action language pBC+ to express utility, belief states, and observation as in POMDP models. Inheriting the advantages of action languages, the new action language provides an elaboration-tolerant representation of POMDPs that reflects commonsense knowledge. The idea led to the design of the system pbcplus2pomdp, which compiles a pBC+ action description into a POMDP model that can be directly processed by off-the-shelf POMDP solvers to compute an optimal policy of the pBC+ action description. Our experiments show that it retains the advantages of icorpp while avoiding the manual efforts in bridging the commonsense reasoner and the probabilistic planner.


Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation

arXiv.org Machine Learning

Research into automated systems for detecting and classifying marine mammals in acoustic recordings is expanding internationally due to the necessity to analyze large collections of data for conservation purposes. In this work, we present a Convolutional Neural Network that is capable of classifying the vocalizations of three species of whales, non-biological sources of noise, and a fifth class pertaining to ambient noise. In this way, the classifier is capable of detecting the presence and absence of whale vocalizations in an acoustic recording. Through transfer learning, we show that the classifier is capable of learning high-level representations and can generalize to additional species. We also propose a novel representation of acoustic signals that builds upon the commonly used spectrogram representation by way of interpolating and stacking multiple spectrograms produced using different Short-time Fourier Transform (STFT) parameters. The proposed representation is particularly effective for the task of marine mammal species classification where the acoustic events we are attempting to classify are sensitive to the parameters of the STFT.
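The stacked representation described above can be sketched roughly as follows. This is only an illustration under assumed settings: the FFT sizes, the hop length of a quarter window, the Hann window, the 64x64 output grid, and the use of separable linear interpolation are all choices made here, not parameters taken from the paper. The helper names are hypothetical.

```python
import numpy as np

def spectrogram(signal, n_fft, hop):
    """Magnitude spectrogram (frequency bins x frames) with a Hann window."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def resize_linear(image, out_shape):
    """Linearly interpolate a 2-D array onto a grid of shape `out_shape`."""
    rows = np.linspace(0, image.shape[0] - 1, out_shape[0])
    cols = np.linspace(0, image.shape[1] - 1, out_shape[1])
    # Interpolate along time for every frequency row, then along frequency.
    along_time = np.stack([np.interp(cols, np.arange(image.shape[1]), row)
                           for row in image])
    return np.stack([np.interp(rows, np.arange(image.shape[0]), col)
                     for col in along_time.T], axis=1)

def stacked_spectrogram(signal, fft_sizes=(64, 128, 256), out_shape=(64, 64)):
    """One channel per STFT setting, all resampled to a common grid."""
    channels = [resize_linear(spectrogram(signal, n, n // 4), out_shape)
                for n in fft_sizes]
    return np.stack(channels)  # shape: (len(fft_sizes), *out_shape)

# Synthetic tone at 0.1 cycles per sample, standing in for a recording.
tone = np.sin(2 * np.pi * 0.1 * np.arange(1024))
features = stacked_spectrogram(tone)
print(features.shape)
```

Short windows give finer time resolution and long windows give finer frequency resolution, so stacking them lets a convolutional network see both views of the same event as separate input channels.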


Deep Learning in Video Multi-Object Tracking: A Survey

arXiv.org Machine Learning

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.


Towards a Theory of Intentions for Human-Robot Collaboration

arXiv.org Artificial Intelligence

The architecture described in this paper encodes a theory of intentions based on the key principles of non-procrastination, persistence, and automatically limiting reasoning to relevant knowledge and observations. The architecture reasons with transition diagrams of any given domain at two different resolutions, with the fine-resolution description defined as a refinement of, and hence tightly coupled to, a coarse-resolution description. Non-monotonic logical reasoning with the coarse-resolution description computes an activity (i.e., plan) comprising abstract actions for any given goal. Each abstract action is implemented as a sequence of concrete actions by automatically zooming to and reasoning with the part of the fine-resolution transition diagram relevant to the current coarse-resolution transition and the goal. Each concrete action in this sequence is executed using probabilistic models of the uncertainty in sensing and actuation, and the corresponding fine-resolution outcomes are used to infer coarse-resolution observations that are added to the coarse-resolution history. The architecture's capabilities are evaluated in the context of a simulated robot assisting humans in an office domain, on a physical robot (Baxter) manipulating tabletop objects, and on a wheeled robot (Turtlebot) moving objects to particular places or people. The experimental results indicate improvements in reliability and computational efficiency compared with an architecture that does not include the theory of intentions, and an architecture that does not include zooming for fine-resolution reasoning.


A Mathematical Model for Linguistic Universals

arXiv.org Artificial Intelligence

We present a Markov model at the discourse level for Steven Pinker's "mentalese", or chains of mental states that transcend the spoken/written forms. Such (potentially) universal temporal structures of textual patterns lead us to a language-independent semantic representation, or a translationally-invariant word embedding, thereby forming the common ground for both comprehensibility within a given language and translatability between different languages. Applying our model to documents of moderate lengths, without relying on external knowledge bases, we reconcile Noam Chomsky's "poverty of stimulus" paradox with statistical learning of natural languages. We human beings distinguish ourselves from other animals (1-3), in that our brain development (4-6) enables us to convey sophisticated ideas and to share individual experiences, via languages (7-9). Texts written in natural languages constitute a major medium that perpetuates our civilizations (10), as a cumulative body of knowledge.


Speech Recognition using Artificial Neural Network (ANN)

#artificialintelligence

Speech is a means of communication between people. Speech recognition is a software technology that converts spoken language into a machine-readable format. Today, speech recognition is useful for interaction between humans and machines or mobile devices, which makes it very important. Speech recognition is mainly divided into two parts.


Context Model for Pedestrian Intention Prediction using Factored Latent-Dynamic Conditional Random Fields

arXiv.org Machine Learning

Smooth handling of pedestrian interactions is a key requirement for Autonomous Vehicles (AV) and Advanced Driver Assistance Systems (ADAS). Such systems call for early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. We stress the necessity of early prediction for smooth operation of such systems. We introduce the influence of vehicle interactions on pedestrian intention for this purpose. In this paper, we show a discernible advance in prediction time aided by the inclusion of such vehicle interaction context. We apply our methods to two different datasets, one collected in-house (the NTU dataset) and one public real-life benchmark (the JAAD dataset). We also propose a generic graphical model, Factored Latent-Dynamic Conditional Random Fields (FLDCRF), for single and multi-label sequence prediction as well as joint interaction modeling tasks. While the existing best system predicts pedestrian stopping behaviour with 70% accuracy 0.38 seconds before the actual events, our system achieves such accuracy at least 0.9 seconds on average before the actual events across datasets.
As we enter the era of autonomous driving with the first ever self-driving taxi launched in December 2018, smooth handling of pedestrian interactions still remains a challenge. The tradeoff is between on-road pedestrian safety and smoothness of the ride. Recent user experiences and available online footage suggest conservative autonomous rides resulting from the emphasis on on-road pedestrian safety. To achieve rapid user adoption, AVs must be able to simulate a smooth human driver-like experience without unnecessary interruptions, in addition to ensuring 100% pedestrian safety. Automated braking systems in an ADAS tackle the emergency pedestrian interactions. These brakes get activated on detecting pedestrians' crossing behaviours within the vehicle safety range. A future ADAS must be able to offer a smoother experience on such interactions. The key to a safe and smooth autonomous pedestrian interaction lies in early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. Accurate and timely prediction of pedestrian behaviour ensures on-road pedestrian safety, while early anticipation of the crossing/not-crossing behaviour offers more path planning time and consequently smoother control over the vehicle dynamics. Recent works on on-road pedestrian behaviour prediction ([1] - [15]) rely on a pedestrian's motion, skeletal pose, his/her location in the scene (on road, at curb etc.) and certain static context variables (e.g., presence of zebra crossings, traffic lights etc.).


Deep Reinforcement Learning for Personalized Search Story Recommendation

arXiv.org Machine Learning

In recent years, search story, a combined display with other organic channels, has become a major source of user traffic on platforms such as e-commerce search platforms, news feed platforms and web and image search platforms. The recommended search story guides a user to identify her own preference and personal intent, which subsequently influences the user's real-time and long-term search behavior. As search stories become increasingly important, in this work, we study the problem of personalized search story recommendation within a search engine, which aims to suggest a search story relevant to both a search keyword and an individual user's interest. To address the challenge of modeling both immediate and future values of recommended search stories (i.e., cross-channel effect), for which the conventional supervised learning framework is not applicable, we resort to a Markov decision process and propose a deep reinforcement learning architecture trained by both imitation learning and reinforcement learning. We empirically demonstrate the effectiveness of our proposed approach through extensive experiments on real-world data sets from JD.com.
1. INTRODUCTION
Imagine that a customer visits a retail shop to purchase a dress which is to her liking. As the customer walks in, a business assistant is present to assist the customer by answering questions on fashion trends or suggesting related dresses. In online e-commerce applications, more business units are adding a component that plays a similar role as the business assistant in a shop. In this paper, we are interested in a particular component, commonly known as search story, that has become popular among e-commerce search engines on many online platforms. For instance, in news feed platforms and web and image search platforms, each search story is a display of recommended high-quality content which is relevant to a user's personal interests. In e-commerce search
Figure 1: An illustrated (not a screenshot) example of search story recommendation. (a) Display search story within organic product item search page. (b) Landing page after clicking search story, which contains both shopping guides and shopping product items.


Training products of expert capsules with mixing by dynamic routing

arXiv.org Machine Learning

This study develops an unsupervised learning algorithm for products of expert capsules with dynamic routing. Analogous to binary-valued neurons in Restricted Boltzmann Machines, the magnitude of a squashed capsule firing takes values between zero and one, representing the probability of the capsule being on. This analogy motivates the design of an energy function for capsule networks. In order to have an efficient sampling procedure where hidden layer nodes are not connected, the energy function is made consistent with dynamic routing in the sense of the probability of a capsule firing, and inference on the capsule network is computed with the dynamic routing between capsules procedure. In order to optimize the log-likelihood of the visible layer capsules, the gradient is found in terms of this energy function. The developed unsupervised learning algorithm is used to train a capsule network on standard vision datasets, and is able to generate realistic looking images from its learned distribution.
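As a small illustration of the "magnitude between zero and one" property, the sketch below uses the squash nonlinearity from the original dynamic routing paper (Sabour et al., 2017), v = (|s|^2 / (1 + |s|^2)) * s / |s|. Whether this study uses exactly that form is an assumption, since the abstract does not say; the function names and the small `eps` guard against division by zero are hypothetical.

```python
import math

def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / (norm + eps)
    return [scale * x for x in s]

def firing_probability(s):
    """Length of the squashed capsule, read as the probability it is 'on'."""
    return math.sqrt(sum(x * x for x in squash(s)))

for pre_activation in ([0.0, 0.0], [0.1, 0.0], [3.0, 4.0], [30.0, 40.0]):
    print(pre_activation, firing_probability(pre_activation))
```

Short pre-activation vectors map to a length near zero and long ones saturate toward one, which is what allows the length to be treated like the on-probability of a binary unit in a Restricted Boltzmann Machine.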