 Undirected Networks


Deep Learning For Sequential Data – Part II: Constraints Of Traditional Approaches

#artificialintelligence

In the previous blog post, we discussed the nature of sequential data and why we need a separate, robust modeling technique to analyze it. Traditionally, people have used Hidden Markov Models (HMMs) to analyze sequential data, so we will center the discussion in this blog post around HMMs. HMMs have been applied to many tasks, such as speech recognition, gesture recognition, and part-of-speech tagging. But HMMs place a lot of restrictions on how we can model our data. HMMs are definitely better than classical machine learning techniques for this kind of data, but they don't fully cover the needs of modern data analysis.
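One restriction of a standard HMM is that each hidden state depends only on the state immediately before it. The following minimal sketch of the forward algorithm for a discrete HMM makes that visible: each update of `alpha` uses only the previous step. The two-state parameters are hypothetical values chosen for illustration and do not come from the post.

```python
def forward_likelihood(obs, start, trans, emit):
    """Likelihood of an observation sequence under a discrete HMM.

    obs   -- list of observation indices
    start -- start[s] is the probability of beginning in state s
    trans -- trans[p][s] is the probability of moving from state p to state s
    emit  -- emit[s][o] is the probability of state s emitting observation o
    """
    if not obs:
        raise ValueError("obs must contain at least one observation")
    n = len(start)
    # alpha[s]: probability of the observations so far, ending in state s.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        # First-order Markov assumption: only the previous alpha is needed.
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state, two-symbol model (illustrative numbers only).
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]

likelihood = forward_likelihood([0, 1, 0], start, trans, emit)
```

Because the recursion carries only one step of history, any longer-range dependency has to be encoded in the state space itself, which is one way to read the constraints the post refers to.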


Learning theory estimates with observations from general stationary stochastic processes

arXiv.org Machine Learning

This paper investigates the supervised learning problem with observations drawn from certain general stationary stochastic processes. Here by "general", we mean that many stationary stochastic processes can be included. We show that when the stochastic processes satisfy a generalized Bernstein-type inequality, a unified treatment on analyzing the learning schemes with various mixing processes can be conducted and a sharp oracle inequality for generic regularized empirical risk minimization schemes can be established. The obtained oracle inequality is then applied to derive convergence rates for several learning schemes such as empirical risk minimization (ERM), least squares support vector machines (LS-SVMs) using given generic kernels, and SVMs using Gaussian kernels for both least squares and quantile regression. It turns out that for i.i.d. processes, our learning rates for ERM recover the optimal rates. On the other hand, for non-i.i.d. processes including geometrically $\alpha$-mixing Markov processes, geometrically $\alpha$-mixing processes with restricted decay, $\phi$-mixing processes, and (time-reversed) geometrically $\mathcal{C}$-mixing processes, our learning rates for SVMs with Gaussian kernels match, up to some arbitrarily small extra term in the exponent, the optimal rates. For the remaining cases, our rates are at least close to the optimal rates. As a by-product, the assumed generalized Bernstein-type inequality also provides an interpretation of the so-called "effective number of observations" for various mixing processes.


What is Human-Centred Machine Learning

#artificialintelligence

This Sunday we are running a workshop at ACM CHI 2016 called "Human Centered Machine Learning". I thought I would write an article to explain the general idea (though the workshop itself is a way of better understanding the idea). Statistical Machine Learning is one of the most successful sets of techniques to come out of Computer Science in the last decades, and one that a lot of people are thinking about at the moment. It's often presented as quite an impersonal process: machines that learn for themselves, even AI that risks taking over the world. But, in fact, there is a lot of human work that goes into machine learning and not enough people have been talking about that.


The 7 Best Data Science and Machine Learning Podcasts -- The Startup

#artificialintelligence

Data science and machine learning have long been interests of mine, but now that I'm working on Fuzzy.io I need to keep on top of all the news in both fields. My preferred way to do this is through listening to podcasts. I've listened to a bunch of machine learning and data science podcasts in the last few months, so I thought I'd share my favorites: Every other week, they release a 10–15 minute episode where hosts Kyle and Linda Polich give a short primer on topics like k-means clustering, natural language processing and decision tree learning, often using analogies related to their pet parrot, Yoshi. This is the only place where you'll learn about k-means clustering via placement of parrot droppings.


Learning Continuous State/Action Models for Humanoid Robots

AAAI Conferences

Reinforcement learning (RL) is a popular choice for solving robotic control problems. However, applying RL techniques to controlling humanoid robots with high degrees of freedom remains problematic due to the difficulty of acquiring sufficient training data. The problem is compounded by the fact that most real-world problems involve continuous states and actions. In order for RL to be scalable to these situations it is crucial that the algorithm be sample efficient. Model-based methods tend to be more data efficient than model-free approaches and have the added advantage that a single model can generalize to multiple control problems. This paper proposes a model approximation algorithm for continuous states and actions that integrates case-based reasoning (CBR) and Hidden Markov Models (HMM) to generalize from a small set of state instances. The paper demonstrates that the performance of the learned model is close to that of the system dynamics it approximates, where performance is measured in terms of sampling error.


Propositionalization for Unsupervised Outlier Detection in Multi-Relational Data

AAAI Conferences

We develop a novel propositionalization approach to unsupervised outlier detection for multi-relational data. Propositionalization summarizes the information from multi-relational data, which are typically stored in multiple tables, in a single data table. The columns in the data table represent conjunctive relational features that are learned from the data. An advantage of propositionalization is that it facilitates applying the many previous outlier detection methods that were designed for single-table data. We show that conjunctive features for outlier detection can be learned from data using statistical-relational methods. Specifically, we apply Markov Logic Network structure learning. Compared to baseline propositionalization methods, Markov Logic propositionalization produces the most compact data tables, whose attributes capture the most complex multi-relational correlations. We apply three representative outlier detection methods (LOF, KNN, and OutRank) to the data tables constructed by propositionalization.
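To make the idea of propositionalization concrete, here is a minimal sketch that flattens two related tables into one row per individual. It is not the paper's method: the paper learns its conjunctive features with Markov Logic Network structure learning, whereas the single feature below is written by hand. The schema (`players`, `appearances`) and all values are hypothetical.

```python
# Hypothetical toy schema: one table of players, one table of match appearances.
players = {
    "p1": {"position": "striker"},
    "p2": {"position": "goalie"},
}
appearances = [
    {"player": "p1", "match": "m1", "goals": 2},
    {"player": "p1", "match": "m2", "goals": 0},
    {"player": "p2", "match": "m1", "goals": 0},
]


def propositionalize(players, appearances):
    """Summarize both tables in a single table with one row per player."""
    rows = []
    for pid, attrs in sorted(players.items()):
        own = [a for a in appearances if a["player"] == pid]
        rows.append({
            "player": pid,
            # Hand-written conjunctive feature: the player is a striker AND
            # scored in the match; the column counts the matches where it holds.
            "striker_and_scored": sum(
                1 for a in own
                if attrs["position"] == "striker" and a["goals"] > 0
            ),
            "appearances": len(own),
        })
    return rows


table = propositionalize(players, appearances)
```

Each row of `table` is an ordinary fixed-length feature vector, which is what allows single-table outlier detectors to be applied afterwards.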


Google RankBrain Algorithm in Digital Marketing

#artificialintelligence

This article gives a historical overview of GoogleBrain and analyses the pattern; we then conclude with our findings about the current situation and future changes in search engine algorithms. Back in 2006 there was some interest in implementing artificial intelligence in the Google search engine algorithm. A few years later, in 2014, GoogleBrain was established after the acquisition of DeepMind, a British artificial intelligence company which was founded in 2010. They worked on how to play video games based on machine learning and artificial neural networks (ANNs). The smart artificial intelligence revolution can recognize patterns in digital representations of sounds, images and data.


Energy Disaggregation for Real-Time Building Flexibility Detection

arXiv.org Machine Learning

Energy is a limited resource which has to be managed wisely, taking into account both supply-demand matching and capacity constraints in the distribution grid. One aspect of smart energy management at the building level is the problem of real-time detection of available flexible demand. In this paper we propose the use of energy disaggregation techniques to perform this task. Firstly, we investigate the use of existing classification methods to perform energy disaggregation. A comparison is performed between four classifiers, namely Naive Bayes, k-Nearest Neighbors, Support Vector Machine and AdaBoost. Secondly, we propose the use of a Restricted Boltzmann Machine to automatically perform feature extraction. The extracted features are then used as inputs to the four classifiers and consequently shown to improve their accuracy. The efficiency of our approach is demonstrated on a real database consisting of detailed appliance-level measurements with high temporal resolution, which has been used for energy disaggregation in previous studies, namely the REDD. The results show robustness and good generalization capabilities to newly presented buildings with at least 96% accuracy.


Market forecasting using Hidden Markov Models

arXiv.org Machine Learning

Working on the daily closing prices and log returns, in this paper we deal with the use of Hidden Markov Models (HMMs) to forecast the price of the EUR/USD Futures. The aim of our work is to understand how the HMMs describe different financial time series depending on their structure. Subsequently, we analyse the forecasting methods presented in the previous literature, highlighting their pros and cons.
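As a small illustration of the input series mentioned above, the sketch below computes log returns, r_t = ln(P_t / P_{t-1}), from a list of closing prices. The prices are hypothetical values for illustration, not data from the paper, and the helper name `log_returns` is an assumption.

```python
import math


def log_returns(prices):
    """Return r_t = ln(P_t / P_{t-1}) for consecutive strictly positive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical daily closing prices (illustrative only).
closes = [1.1000, 1.1050, 1.1020, 1.1080]
returns = log_returns(closes)
```

Log returns are additive over time: their sum equals the log of the ratio between the last and the first price, which is one reason they are a common choice for time series models.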


Directional Statistics in Machine Learning: a Brief Review

arXiv.org Machine Learning

The modern data analyst must cope with data encoded in various forms: vectors, matrices, strings, graphs, or more. Consequently, statistical and machine learning models tailored to different data encodings are important. We focus on data encoded as normalized vectors, so that their "direction" is more important than their magnitude. Specifically, we consider high-dimensional vectors that lie either on the surface of the unit hypersphere or on the real projective plane. For such data, we briefly review common mathematical models prevalent in machine learning, while also outlining some technical aspects, software, applications, and open mathematical challenges.
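A minimal sketch of what "direction matters more than magnitude" means in practice: after projecting vectors onto the unit hypersphere, two vectors that differ only in length become identical, and their similarity reduces to a dot product. The helper names and sample vectors are assumptions made for illustration and are not taken from the review.

```python
import math


def normalize(v):
    """Scale v to unit Euclidean length, placing it on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("the zero vector has no direction")
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Dot product of the normalized vectors; depends on direction only."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


a = [3.0, 4.0]
b = [30.0, 40.0]  # same direction as a, ten times the magnitude
c = [-4.0, 3.0]   # orthogonal to a

sim_ab = cosine_similarity(a, b)
sim_ac = cosine_similarity(a, c)
```

Here `sim_ab` is 1 up to floating-point rounding, although `b` is ten times longer than `a`, and `sim_ac` is 0 because the directions are orthogonal.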