Undirected Networks
Towards a Theory of Intentions for Human-Robot Collaboration
Gomez, Rocio, Sridharan, Mohan, Riley, Heather
The architecture described in this paper encodes a theory of intentions based on the key principles of non-procrastination, persistence, and automatically limiting reasoning to relevant knowledge and observations. The architecture reasons with transition diagrams of any given domain at two different resolutions, with the fine-resolution description defined as a refinement of, and hence tightly coupled to, a coarse-resolution description. Non-monotonic logical reasoning with the coarse-resolution description computes an activity (i.e., plan) comprising abstract actions for any given goal. Each abstract action is implemented as a sequence of concrete actions by automatically zooming to and reasoning with the part of the fine-resolution transition diagram relevant to the current coarse-resolution transition and the goal. Each concrete action in this sequence is executed using probabilistic models of the uncertainty in sensing and actuation, and the corresponding fine-resolution outcomes are used to infer coarse-resolution observations that are added to the coarse-resolution history. The architecture's capabilities are evaluated in the context of a simulated robot assisting humans in an office domain, on a physical robot (Baxter) manipulating tabletop objects, and on a wheeled robot (Turtlebot) moving objects to particular places or people. The experimental results indicate improvements in reliability and computational efficiency compared with an architecture that does not include the theory of intentions, and an architecture that does not include zooming for fine-resolution reasoning.
A Mathematical Model for Linguistic Universals
We present a Markov model at the discourse level for Steven Pinker's "mentalese", or chains of mental states that transcend the spoken/written forms. Such (potentially) universal temporal structures of textual patterns lead us to a language-independent semantic representation, or a translationally-invariant word embedding, thereby forming the common ground for both comprehensibility within a given language and translatability between different languages. Applying our model to documents of moderate lengths, without relying on external knowledge bases, we reconcile Noam Chomsky's "poverty of stimulus" paradox with statistical learning of natural languages. We human beings distinguish ourselves from other animals (1-3), in that our brain development (4-6) enables us to convey sophisticated ideas and to share individual experiences, via languages (7-9). Texts written in natural languages constitute a major medium that perpetuates our civilizations (10), as a cumulative body of knowledge.
Speech Recognition using Artificial Neural Network (ANN)
Speech is the primary means of communication between people. Speech recognition is a software technology that converts spoken language into a machine-readable format. It is now widely used for interaction between humans and machines or mobile devices, which makes it important. Speech recognition is mainly divided into two parts.
Context Model for Pedestrian Intention Prediction using Factored Latent-Dynamic Conditional Random Fields
Neogi, Satyajit, Hoy, Michael, Dang, Kang, Yu, Hang, Dauwels, Justin
Smooth handling of pedestrian interactions is a key requirement for Autonomous Vehicles (AV) and Advanced Driver Assistance Systems (ADAS). Such systems call for early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. We stress the necessity of early prediction for smooth operation of such systems. We introduce the influence of vehicle interactions on pedestrian intention for this purpose. In this paper, we show a discernible advance in prediction time aided by the inclusion of such vehicle interaction context. We apply our methods to two different datasets, one collected in-house (the NTU dataset) and one public real-life benchmark (the JAAD dataset). We also propose a generic graphical model, Factored Latent-Dynamic Conditional Random Fields (FLDCRF), for single and multi-label sequence prediction as well as joint interaction modeling tasks. While the existing best system predicts pedestrian stopping behaviour with 70% accuracy 0.38 seconds before the actual events, our system achieves such accuracy at least 0.9 seconds on average before the actual events across datasets. As we enter the era of autonomous driving with the first ever self-driving taxi launched in December 2018, smooth handling of pedestrian interactions still remains a challenge. The tradeoff is between on-road pedestrian safety and smoothness of the ride. Recent user experiences and available online footage suggest conservative autonomous rides resulting from the emphasis on on-road pedestrian safety. To achieve rapid user adoption, the AVs must be able to simulate a smooth human driver-like experience without unnecessary interruptions, in addition to ensuring 100% pedestrian safety. Automated braking systems in an ADAS tackle the emergency pedestrian interactions. These brakes get activated on detecting pedestrians' crossing behaviours within the vehicle safety range.
A future ADAS must be able to offer a smoother experience on such interactions. The key to a safe and smooth autonomous pedestrian interaction lies in early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. Accurate and timely prediction of pedestrian behaviour ensures on-road pedestrian safety, while early anticipation of the crossing/not-crossing behaviour offers more path planning time and consequently smoother control over the vehicle dynamics. Recent works on on-road pedestrian behaviour prediction ([1] - [15]) rely on a pedestrian's motion, skeletal pose, his/her location in the scene (on road, at curb etc.) and certain static context variables (e.g., presence of zebra crossings, traffic lights etc.).
Deep Reinforcement Learning for Personalized Search Story Recommendation
Zhang, Jason, Yin, Junming, Lee, Dongwon, Zhu, Linhong
ABSTRACT In recent years, search story, a combined display with other organic channels, has become a major source of user traffic on platforms such as e-commerce search platforms, news feed platforms and web and image search platforms. The recommended search story guides a user to identify her own preference and personal intent, which subsequently influences the user's real-time and long-term search behavior. As search stories become increasingly important, in this work, we study the problem of personalized search story recommendation within a search engine, which aims to suggest a search story relevant to both a search keyword and an individual user's interest. To address the challenge of modeling both immediate and future values of recommended search stories (i.e., cross-channel effect), for which the conventional supervised learning framework is not applicable, we resort to a Markov decision process and propose a deep reinforcement learning architecture trained by both imitation learning and reinforcement learning. We empirically demonstrate the effectiveness of our proposed approach through extensive experiments on real-world data sets from JD.com.
1. INTRODUCTION Imagine that a customer visits a retail shop to purchase a dress which is to her liking. As the customer walks in, a business assistant is present to assist the customer by answering questions on fashion trends or suggesting related dresses. In online e-commerce applications, more business units are adding a component that plays a similar role to the business assistant in a shop. In this paper, we are interested in a particular component, commonly known as search story, that has become popular among e-commerce search engines on many online platforms. For instance, in news feed platforms and web and image search platforms, each search story is a display of recommended high-quality content which is relevant to a user's personal interests.
In e-commerce search
Figure 1: An illustrated (not a screenshot) example of search story recommendation. (a) Display of the search story within the organic product item search page. (b) Landing page after clicking the search story, which contains both shopping guides and shopping product items.
Training products of expert capsules with mixing by dynamic routing
This study develops an unsupervised learning algorithm for products of expert capsules with dynamic routing. Analogous to binary-valued neurons in Restricted Boltzmann Machines, the magnitude of a squashed capsule firing takes values between zero and one, representing the probability of the capsule being on. This analogy motivates the design of an energy function for capsule networks. In order to have an efficient sampling procedure where hidden layer nodes are not connected, the energy function is made consistent with dynamic routing in the sense of the probability of a capsule firing, and inference on the capsule network is computed with the dynamic routing between capsules procedure. In order to optimize the log-likelihood of the visible layer capsules, the gradient is found in terms of this energy function. The developed unsupervised learning algorithm is used to train a capsule network on standard vision datasets, and is able to generate realistic looking images from its learned distribution.
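The claim that a squashed capsule's magnitude lies between zero and one can be illustrated with a minimal sketch. It assumes the squashing nonlinearity commonly used with dynamic routing, v = (|s|^2 / (1 + |s|^2)) * s / |s|; the abstract does not state the exact form used in the study, and the function names `squash` and `firing_probability` are hypothetical.

```python
import math


def squash(s, eps=1e-9):
    """Squash a capsule pre-activation vector s.

    Assumed form (standard for dynamic routing, not confirmed by the
    abstract): the direction of s is kept and its length is mapped
    into the interval [0, 1).
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # eps avoids division by zero for an all-zero capsule.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def firing_probability(s):
    """Length of the squashed vector, read as the probability the capsule is on."""
    v = squash(s)
    return math.sqrt(sum(x * x for x in v))


if __name__ == "__main__":
    for s in ([0.0, 0.0], [0.1, 0.2], [3.0, 4.0], [30.0, 40.0]):
        print(s, round(firing_probability(s), 4))
```

Short vectors map to probabilities near zero and long vectors to probabilities near one, which is the property that motivates the analogy with binary-valued units in Restricted Boltzmann Machines.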
Probabilistic Approximate Logic and its Implementation in the Logical Imagination Engine
Stehr, Mark-Oliver, Kim, Minyoung, Talcott, Carolyn L., Knapp, Merrill, Vertes, Akos
In spite of the rapidly increasing number of applications of machine learning in various domains, a principled and systematic approach to the incorporation of domain knowledge in the engineering process is still lacking and ad hoc solutions that are difficult to validate are still the norm in practice, which is of growing concern not only in mission-critical applications. In this note, we introduce Probabilistic Approximate Logic (PALO) as a logic based on the notion of mean approximate probability to overcome conceptual and computational difficulties inherent to strictly probabilistic logics. The logic is approximate in several dimensions. Logical independence assumptions are used to obtain approximate probabilities, but by averaging over many instances of formulas a useful estimate of mean probability with known confidence can usually be obtained. To enable efficient computational inference, the logic has a continuous semantics that reflects only a subset of the structural properties of classical logic, but this imprecision can be partly compensated by richer theories obtained by classical inference or other means. Computational inference, which refers to the construction of models and validation of logical properties, is based on Stochastic Gradient Descent (SGD) and Markov Chain Monte Carlo (MCMC) techniques and hence another dimension where approximations are involved. We also present the Logical Imagination Engine (LIME), a prototypical implementation of PALO based on TensorFlow. Albeit not limited to the biological domain, we illustrate its operation in a quite substantial bioinformatics machine learning application concerned with network synthesis and analysis in a recent DARPA project.
Interactive Lungs Auscultation with Reinforcement Learning Agent
Grzywalski, Tomasz, Belluzzo, Riccardo, Drgas, Szymon, Cwalinska, Agnieszka, Hafke-Dys, Honorata
Lung sound auscultation is the first and most common examination carried out by every general practitioner or family doctor. It is a fast, easy and well-known procedure, popularized by Laënnec (Hyacinthe, 1819), who invented the stethoscope. Nowadays, different variants of this tool can be found on the market, both analog and electronic, but regardless of the type of stethoscope, this process is still highly subjective. Indeed, an auscultation normally involves the usage of a stethoscope by a physician, thus relying on the examiner's own hearing, experience and ability to interpret psychoacoustical features. Another strong limitation of standard auscultation can be found in the stethoscope itself, since its frequency response tends to attenuate frequency components of the lung sound signal above approximately 120 Hz, leaving only the lower frequency bands, to which the human ear is not very sensitive, to be analyzed (Sovijärvi et al., 2000) (Sarkar et al., 2015).
Automatic crack detection and classification by exploiting statistical event descriptors for Deep Learning
Siracusano, Giulio, La Corte, Aurelio, Tomasello, Riccardo, Lamonaca, Francesco, Scuro, Carmelo, Garescì, Francesca, Carpentieri, Mario, Finocchio, Giovanni
In modern building infrastructures, the chance to devise adaptive and unsupervised data-driven health monitoring systems is gaining in popularity due to the large availability of data from low-cost sensors with internetworking capabilities. In particular, deep learning provides the tools for processing and analyzing this unprecedented amount of data efficiently. The main purpose of this paper is to combine the recent advances of Deep Learning (DL) and statistical analysis in structural health monitoring (SHM) to develop an accurate classification tool able to discriminate among different acoustic emission events (cracks) by identifying tensile, shear and mixed modes. The application of DL in SHM systems is described using the concept of Bidirectional Long Short Term Memory. We investigated effective event descriptors to capture the unique characteristics of the different types of modes. Among them, Spectral Kurtosis and Spectral L2/L1 Norm exhibit distinctive behavior and contributed effectively to the learning process. This classification will contribute to unambiguously detecting incipient damage, which is advantageous for realizing predictive maintenance. Tests on experimental results confirm that this method achieves accurate classification (92%) of crack events and can impact the design of future SHM technologies.
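As a rough illustration of one of the descriptors named above, the sketch below computes an L2/L1 norm ratio over the magnitude spectrum of a signal. The abstract does not give the paper's exact definition, so this form (ratio of the Euclidean norm to the sum of magnitudes, a common sparsity measure) is an assumption, as are the function names and the synthetic test signals.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """One-sided DFT magnitude spectrum, computed directly (O(n^2))."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def l2_l1_ratio(values):
    """Ratio of L2 norm to L1 norm; close to 1 when energy sits in a single bin."""
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    return l2 / l1 if l1 > 0 else 0.0


if __name__ == "__main__":
    n = 64
    # Hypothetical signals: a pure tone (narrowband) and an impulse (broadband).
    tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    impulse = [1.0] + [0.0] * (n - 1)
    print("tone   :", l2_l1_ratio(magnitude_spectrum(tone)))
    print("impulse:", l2_l1_ratio(magnitude_spectrum(impulse)))
```

Under this definition a narrowband event scores near 1 and a broadband event scores near 1/sqrt(number of bins), which suggests how such a descriptor could help separate event types with different spectral content.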
On the relationship between variational inference and adaptive importance sampling
Finke, Axel, Thiery, Alexandre H.
The importance weighted autoencoder (IWAE) (Burda et al., 2016) and reweighted wake-sleep (RWS) algorithm (Bornschein and Bengio, 2015) are popular approaches which employ multiple samples to achieve bias reductions compared to standard variational methods. However, their relationship has hitherto been unclear. We introduce a simple, unified framework for multi-sample variational inference termed adaptive importance sampling for learning (AISLE) and show that it admits IWAE and RWS as special cases. Through a principled application of a variance-reduction technique from Tucker et al. (2019), we also show that the sticking-the-landing (STL) gradient from Roeder et al. (2017), which previously lacked theoretical justification, can be recovered as a special case of RWS (and hence of AISLE). In particular, this indicates that the breakdown of RWS -- but not of STL -- observed in Tucker et al. (2019) may not be attributable to the lack of a joint objective for the generative-model and inference-network parameters as previously conjectured. Finally, we argue that our adaptive-importance-sampling interpretation of variational inference leads to more natural and principled extensions to sequential Monte Carlo methods than the IWAE-type multi-sample objective interpretation.
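The multi-sample idea shared by IWAE and RWS can be sketched on a toy model. The example below estimates the importance-weighted bound log((1/K) * sum_k w_k), with w_k = p(x, z_k) / q(z_k | x), for a one-dimensional Gaussian model chosen here purely for illustration (prior z ~ N(0, 1), likelihood x | z ~ N(z, 1)); the model, the proposal parameters and all function names are assumptions and are not taken from the paper.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def iw_bound_estimate(x, q_mean, q_var, num_samples, rng):
    """Single Monte Carlo estimate of the K-sample importance-weighted bound.

    Toy model (assumed): z ~ N(0, 1), x | z ~ N(z, 1), proposal q(z | x) = N(q_mean, q_var).
    """
    log_weights = []
    for _ in range(num_samples):
        z = rng.gauss(q_mean, math.sqrt(q_var))
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        log_weights.append(log_joint - log_normal_pdf(z, q_mean, q_var))
    # Log-mean-exp of the log weights, computed stably.
    top = max(log_weights)
    return top + math.log(sum(math.exp(lw - top) for lw in log_weights) / num_samples)


if __name__ == "__main__":
    rng = random.Random(0)
    x = 1.0
    print("log p(x):", log_normal_pdf(x, 0.0, 2.0))  # exact marginal for this toy model
    for k in (1, 10, 100):
        runs = [iw_bound_estimate(x, 0.0, 1.0, k, rng) for _ in range(2000)]
        print("K =", k, "mean bound:", sum(runs) / len(runs))
```

In this toy model the marginal is N(0, 2) and the exact posterior is N(x/2, 1/2); when the proposal equals the posterior every weight equals p(x), so the estimate is exact for any K. With a mismatched proposal, K = 1 gives the standard variational bound, and larger K tightens it on average.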