Online Machine Learning in Big Data Streams

arXiv.org Machine Learning

The area of online machine learning in big data streams covers algorithms that are (1) distributed and (2) work from data streams with only a limited possibility to store past data. The first requirement mostly concerns software architectures and efficient algorithms. The second also imposes nontrivial theoretical restrictions on the modeling methods: in the data stream model, older data is no longer available to revise earlier suboptimal modeling decisions as fresh data arrives. In this article, we provide an overview of distributed software architectures and libraries as well as machine learning models for online learning. We highlight the most important ideas for classification, regression, recommendation, and unsupervised modeling from streaming data, and we show how they are implemented in various distributed data stream processing systems. This article is reference material, not a survey. We do not attempt to be comprehensive in describing all existing methods and solutions; rather, we give pointers to the most important resources in the field. All the related subfields (online algorithms, online learning, and distributed data processing) are highly active in current research and development, with conceptually new research results and software components emerging at the time of writing. We refer to several surveys, both for distributed data processing and for online machine learning. Unlike past surveys, our article discusses recommender systems in extended detail.
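The data stream constraint described above (each example is seen once and cannot be revisited) can be illustrated with a minimal sketch. It is not taken from the article: the class `OnlineLogisticRegression`, the generator `synthetic_stream`, and all parameter values are hypothetical, and the model is plain logistic regression trained by stochastic gradient descent. Evaluation is prequential (test-then-train): each example is first used for prediction and only then for the update.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineLogisticRegression:
    """Hypothetical single-pass learner: one SGD step per example."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate

    def predict_proba(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, x, y):
        # Gradient of the log loss for a single example; x is not stored.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def synthetic_stream(n, seed=0):
    # Assumed toy data: the label is 1 when the two features sum to > 0.
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else 0)


def prequential_accuracy(model, stream):
    # Test-then-train: predict on each example before learning from it.
    correct = total = 0
    for x, y in stream:
        pred = 1 if model.predict_proba(x) >= 0.5 else 0
        correct += int(pred == y)
        total += 1
        model.update(x, y)
    return correct / total


model = OnlineLogisticRegression(n_features=2)
accuracy = prequential_accuracy(model, synthetic_stream(2000))
```

The model keeps only its weight vector, so memory use does not grow with the length of the stream.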


Masked Conditional Neural Networks for Automatic Sound Events Recognition

arXiv.org Machine Learning

Deep neural network architectures designed for application domains other than sound, especially image recognition, may not optimally harness the time-frequency representation when adapted to the sound recognition problem. In this work, we explore the ConditionaL Neural Network (CLNN) and the Masked ConditionaL Neural Network (MCLNN) for multi-dimensional temporal signal recognition. The CLNN considers the inter-frame relationship, and the MCLNN enforces a systematic sparseness over the network's links to enable learning in frequency bands rather than bins, allowing the network to be frequency-shift invariant, mimicking a filterbank. The mask also allows several combinations of features to be considered concurrently, a step that is usually handcrafted through exhaustive manual search. We applied the MCLNN to the environmental sound recognition problem using the ESC-10 and ESC-50 datasets. MCLNN achieved competitive performance, using 12% of the parameters and without augmentation, compared to state-of-the-art Convolutional Neural Networks.


Efficient Hierarchical Robot Motion Planning Under Uncertainty and Hybrid Dynamics

arXiv.org Artificial Intelligence

Noisy observations coupled with nonlinear dynamics pose one of the biggest challenges in robot motion planning. By decomposing the nonlinear dynamics into a discrete set of local dynamics models, hybrid dynamics provide a natural way to model nonlinear dynamics, especially in systems with sudden "jumps" in the dynamics, due to factors such as contacts. We propose a hierarchical POMDP planner that develops locally optimal motion plans for hybrid dynamics models. The hierarchical planner first develops a high-level motion plan to sequence the local dynamics models to be visited. The high-level plan is then converted into a detailed cost-optimized continuous state plan. This hierarchical planning approach results in a decomposition of the POMDP planning problem into smaller sub-parts that can be solved with significantly lower computational costs. The ability to sequence the visitation of local dynamics models also provides a powerful way to leverage the hybrid dynamics to reduce state uncertainty. We evaluate the proposed planner for two navigation and localization tasks in simulated domains, as well as an assembly task with a real robotic manipulator.


Generative Models for Spear Phishing Posts on Social Media

arXiv.org Machine Learning

Historically, machine learning in computer security has prioritized defense: think intrusion detection systems, malware classification, and botnet traffic identification. Offense can benefit from data just as well. Social networks, with their access to extensive personal data, bot-friendly APIs, colloquial syntax, and prevalence of shortened links, are the perfect venues for spreading machine-generated malicious content. We aim to discover what capabilities an adversary might utilize in such a domain. We present a long short-term memory (LSTM) neural network that learns to socially engineer specific users into clicking on deceptive URLs. The model is trained with word vector representations of social media posts, and in order to make a click-through more likely, it is dynamically seeded with topics extracted from the target's timeline. We augment the model with clustering to triage high-value targets based on their level of social engagement, and measure the success of the LSTM's phishing expedition using click rates of IP-tracked links. We achieve state-of-the-art success rates, tripling those of historic email attack campaigns, and outperform humans manually performing the same task.


Cognitive Science in the era of Artificial Intelligence: A roadmap for reverse-engineering the infant language-learner

arXiv.org Artificial Intelligence

During their first years of life, infants learn the language(s) of their environment at an amazing speed despite large cross-cultural variations in the amount and complexity of the available language input. Understanding this simple fact still escapes current cognitive and linguistic theories. Recently, spectacular progress in the engineering sciences, notably machine learning and wearable technology, offers the promise of revolutionizing the study of cognitive development. Machine learning offers powerful learning algorithms that can achieve human-like performance on many linguistic tasks. Wearable sensors can capture vast amounts of data, which enable the reconstruction of the sensory experience of infants in their natural environment. The project of 'reverse engineering' language development, i.e., of building an effective system that mimics infants' achievements, therefore appears to be within reach. Here, we analyze the conditions under which such a project can contribute to our scientific understanding of early language development. We argue that instead of defining a sub-problem or simplifying the data, computational models should address the full complexity of the learning situation, and take as input the raw sensory signals available to infants. This implies that (1) accessible but privacy-preserving repositories of home data be set up and widely shared, (2) models be evaluated at different linguistic levels through a benchmark of psycholinguistic tests that can be passed by machines and humans alike, and (3) linguistically and psychologically plausible learning architectures be scaled up to real data using probabilistic/optimization principles from machine learning. We discuss the feasibility of this approach and present preliminary results.


Crime incidents embedding using restricted Boltzmann machines

arXiv.org Machine Learning

We present a new approach for detecting related crime series by unsupervised learning of latent feature embeddings from the narratives of crime records via the Gaussian-Bernoulli Restricted Boltzmann Machine (GBRBM). This is a drastically different approach from prior work on crime analysis, which typically considers only time and location and at most category information. After the embedding, related cases are closer to each other in the Euclidean feature space and unrelated cases are far apart, a property that can enable subsequent analysis such as detection and clustering of related cases. Experiments over several series of related crime incidents hand-labeled by the Atlanta Police Department reveal the promise of our embedding methods.

Index Terms: Unsupervised learning, crime data analysis, feature embeddings, neural networks

1. INTRODUCTION

A fundamental and one of the most challenging tasks in crime analysis is to find related crime series [1], which are committed by the same individual or group.


Probabilistic Warnings in National Security Crises: Pearl Harbor Revisited

arXiv.org Artificial Intelligence

Imagine a situation where a group of adversaries is preparing an attack on the United States or U.S. interests. An intelligence analyst has observed some signals, but the situation is rapidly changing. The analyst faces the decision to alert a principal decision maker that an attack is imminent, or to wait until more is known about the situation. This warning decision is based on the analyst's observation and evaluation of signals, independent or correlated, and on her updating of the prior probabilities of possible scenarios and their outcomes. The warning decision also depends on the analyst's assessment of the crisis' dynamics and perception of the preferences of the principal decision maker, as well as the lead time needed for an appropriate response. This article presents a model to support this analyst's dynamic warning decision. As with most problems involving warning, the key is to manage the tradeoffs between false positives and false negatives given the probabilities and the consequences of intelligence failures of both types. The model is illustrated by revisiting the case of the attack on Pearl Harbor in December 1941. It shows that the radio silence of the Japanese fleet carried considerable information (Sir Arthur Conan Doyle's "dog in the night" problem), which was misinterpreted at the time. Even though the probabilities of different attacks were relatively low, their consequences were such that the Bayesian dynamic reasoning described here may have provided valuable information to key decision makers.
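The two ingredients named above, Bayesian updating on observed signals and the tradeoff between false positives and false negatives, can be sketched in a few lines. This is a simplified one-shot illustration, not the article's dynamic model: it assumes conditionally independent signals, and every probability and cost below is a hypothetical number chosen for the example.

```python
def update_probability(prior, p_signal_given_attack, p_signal_given_no_attack):
    # Bayes' rule for a binary hypothesis (attack vs. no attack).
    numerator = prior * p_signal_given_attack
    denominator = numerator + (1.0 - prior) * p_signal_given_no_attack
    return numerator / denominator


def should_warn(p_attack, cost_false_negative, cost_false_positive):
    # Warn when the expected loss of staying silent (a missed attack)
    # exceeds the expected loss of warning (a false alarm).
    return p_attack * cost_false_negative > (1.0 - p_attack) * cost_false_positive


# Hypothetical inputs: a 1% prior and two signals, each given as
# (P(signal | attack), P(signal | no attack)); signals assumed independent.
p_attack = 0.01
for likelihoods in [(0.6, 0.2), (0.7, 0.1)]:
    p_attack = update_probability(p_attack, *likelihoods)

warn_if_miss_is_costly = should_warn(p_attack, 100.0, 1.0)
warn_if_costs_equal = should_warn(p_attack, 1.0, 1.0)
```

With these numbers the posterior probability is 0.175: still low, yet high enough to justify a warning when a missed attack is assumed to be 100 times as costly as a false alarm, and not when the two costs are equal.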


Markov Chain Monte Carlo in Python – Towards Data Science

@machinelearnbot

Over the past few months, I encountered one term again and again in the data science world: Markov Chain Monte Carlo. In my research lab, in podcasts, in articles, every time I heard the phrase I would nod and think "that sounds pretty cool," with only a vague idea of what anyone was talking about. Several times I tried to learn MCMC and Bayesian inference, but every time I started reading the books, I soon gave up. Exasperated, I turned to the best method to learn any new skill: apply it to a problem. Using some of my sleep data I had been meaning to explore and a hands-on application-based book (Bayesian Methods for Hackers, available free online), I finally learned Markov Chain Monte Carlo through a real-world project.
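For readers who have not met the technique, here is a minimal random-walk Metropolis sampler, one of the simplest MCMC algorithms. It is a sketch in plain Python, not the author's code: the data values are made up (they are not the sleep data mentioned above), and the model (a normal likelihood with unit variance and a flat prior on the mean) is an assumption chosen to keep the example short.

```python
import math
import random


def log_posterior(mu, data):
    # Flat prior on mu and a unit-variance normal likelihood,
    # so the log posterior is known up to an additive constant.
    return -0.5 * sum((x - mu) ** 2 for x in data)


def metropolis(data, n_samples=5000, step=0.5, start=0.0, seed=0):
    rng = random.Random(seed)
    mu = start
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current value.
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


# Hypothetical measurements with sample mean 5.05.
data = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0, 4.7, 5.2]
samples = metropolis(data)
kept = samples[1000:]  # discard the first 1000 draws as burn-in
posterior_mean = sum(kept) / len(kept)
```

Under these assumptions the posterior for the mean is centered on the sample mean, so `posterior_mean` should land close to 5.05; the chain starts far away at 0.0 to show that the burn-in period absorbs the poor starting point.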


Why is machine learning in finance so hard?

#artificialintelligence

Financial markets were among the earliest adopters of machine learning (ML). People have been using ML to spot patterns in the markets since the 1980s. Even though ML has had enormous successes in predicting market outcomes in the past, the recent advances in deep learning haven't helped financial market predictions much. While deep learning and other ML techniques have finally made it possible for Alexa, Google Assistant and Google Photos to work, there hasn't been much progress when it comes to stock markets. I am not a researcher.


Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning

arXiv.org Machine Learning

We introduce SCAL, an algorithm designed to perform efficient exploration-exploitation in any unknown weakly-communicating Markov Decision Process (MDP) for which an upper bound $c$ on the span of the optimal bias function is known. For an MDP with $S$ states, $A$ actions and $\Gamma \le S$ possible next states, we prove a regret bound of $O(c\sqrt{\Gamma S A T})$, which significantly improves over existing algorithms (e.g., UCRL and PSRL), whose regret scales linearly with the MDP diameter $D$. In fact, the optimal bias span is finite and often much smaller than $D$ (e.g., $D = \infty$ in non-communicating MDPs). A similar result was originally derived by Bartlett and Tewari (2009) for REGAL.C, for which no tractable algorithm is available. In this paper, we relax the optimization problem at the core of REGAL.C, we carefully analyze its properties, and we provide the first computationally efficient algorithm to solve it. Finally, we report numerical simulations supporting our theoretical findings and showing how SCAL significantly outperforms UCRL in MDPs with large diameter and small span.
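To get a feel for the gap between a span-based and a diameter-based bound, the leading terms can be evaluated numerically. The sketch below ignores constants and logarithmic factors, so it compares orders of magnitude only. The span-based term follows the bound stated above; the diameter-based term uses the form D S sqrt(A T) commonly quoted for UCRL, which is an assumption here and does not come from this abstract. The function names and all parameter values are hypothetical.

```python
import math


def span_regret_term(c, gamma, n_states, n_actions, horizon):
    # Leading term c * sqrt(Gamma * S * A * T) of the span-based bound.
    if gamma > n_states:
        raise ValueError("Gamma cannot exceed the number of states S")
    return c * math.sqrt(gamma * n_states * n_actions * horizon)


def diameter_regret_term(diameter, n_states, n_actions, horizon):
    # Assumed diameter-based form D * S * sqrt(A * T), linear in D.
    return diameter * n_states * math.sqrt(n_actions * horizon)


# Hypothetical MDP with a small bias span and a large diameter.
span_term = span_regret_term(c=2, gamma=4, n_states=25, n_actions=4, horizon=10_000)
diameter_term = diameter_regret_term(diameter=50, n_states=25, n_actions=4, horizon=10_000)
```

For these values the span-based term is 4000 and the diameter-based term is 250000, which illustrates why a bound in terms of c is attractive when the span is much smaller than the diameter.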