If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
We propose DenseHMM - a modification of Hidden Markov Models (HMMs) that allows to learn dense representations of both the hidden states and the observables. Compared to the standard HMM, transition probabilities are not atomic but composed of these representations via kernelization. Our approach enables constraint-free and gradient-based optimization. We propose two optimization schemes that make use of this: a modification of the Baum-Welch algorithm and a direct co-occurrence optimization. The latter one is highly scalable and comes empirically without loss of performance compared to standard HMMs. We show that the non-linearity of the kernelization is crucial for the expressiveness of the representations. The properties of the DenseHMM like learned co-occurrences and log-likelihoods are studied empirically on synthetic and biomedical datasets.
Finite mixture models have been used for unsupervised learning for some time, and their use within the semi-supervised paradigm is becoming more commonplace. Clickstream data is one of the various emerging data types that demands particular attention because there is a notable paucity of statistical learning approaches currently available. A mixture of first-order continuous time Markov models is introduced for unsupervised and semi-supervised learning of clickstream data. This approach assumes continuous time, which distinguishes it from existing mixture model-based approaches; practically, this allows account to be taken of the amount of time each user spends on each webpage. The approach is evaluated, and compared to the discrete time approach, using simulated and real data.
Created by Lazy Programmer Inc. English [Auto-generated], Indonesian [Auto-generated], 5 more Created by Lazy Programmer Inc. Like the course I just released on Hidden Markov Models, Recurrent Neural Networks are all about learning sequences - but whereas Markov Models are limited by the Markov assumption, Recurrent Neural Networks are not - and as a result, they are more expressive, and more powerful than anything we've seen on tasks that we haven't made progress on in decades. So what's going to be in this course and how will it build on the previous neural network courses and Hidden Markov Models? In the first section of the course we are going to add the concept of time to our neural networks. I'll introduce you to the Simple Recurrent Unit, also known as the Elman unit.
In a real life process evolving over time, the relationship between its relevant variables may change. Therefore, it is advantageous to have different inference models for each state of the process. Asymmetric hidden Markov models fulfil this dynamical requirement and provide a framework where the trend of the process can be expressed as a latent variable. In this paper, we modify these recent asymmetric hidden Markov models to have an asymmetric autoregressive component, allowing the model to choose the order of autoregression that maximizes its penalized likelihood for a given training set. Additionally, we show how inference, hidden states decoding and parameter learning must be adapted to fit the proposed model. Finally, we run experiments with synthetic and real data to show the capabilities of this new model.
Tracking a car or a person in a city is crucial for urban safety management. How can we complete the task with minimal number of spatiotemporal searches from massive camera records? This paper proposes a strategy named IHMs (Intermediate Searching at Heuristic Moments): each step we figure out which moment is the best to search according to a heuristic indicator, then at that moment search locations one by one in descending order of predicted appearing probabilities, until a search hits; iterate this step until we get the object's current location. Five searching strategies are compared in experiments, and IHMs is validated to be most efficient, which can save up to 1/3 total costs. This result provides an evidence that "searching at intermediate moments can save cost".
Hidden Markov models (HMMs) are commonly used for sequential data modeling when the true state of the system is not fully known. We formulate a special case of recurrent neural networks (RNNs), which we name hidden Markov recurrent neural networks (HMRNNs), and prove that each HMRNN has the same likelihood function as a corresponding discrete-observation HMM. We experimentally validate this theoretical result on synthetic datasets by showing that parameter estimates from HMRNNs are numerically close to those obtained from HMMs via the Baum-Welch algorithm. We demonstrate our method's utility in a case study on Alzheimer's disease progression, in which we augment HMRNNs with other predictive neural networks. The augmented HMRNN yields parameter estimates that offer a novel clinical interpretation and fit the patient data better than HMM parameter estimates from the Baum-Welch algorithm.
One of the most common complaints I hear from students is: Why do I have to learn all this math? Why isn't there a library to do what I want? Someone recently made this proclamation to me: "You should explain that your courses are for college students, not industry professionals". This made me laugh very hard. In this article, I will refer to students who make such proclamations as "ML wannabes" for lack of a better term, because people who actually do ML generally know better than this.
We study the problem of identifying viewers of arbitrary images based on their eye gaze. Psychological research has derived generative stochastic models of eye movements. In order to exploit this background knowledge within a discriminatively trained classification model, we derive Fisher kernels from different generative models of eye gaze. Experimentally, we find that the performance of the classifier strongly depends on the underlying generative model. Using an SVM with Fisher kernel improves the classification performance over the underlying generative model.
Next location prediction is of great importance for many location-based applications and provides essential intelligence to business and governments. In existing studies, a common approach to next location prediction is to learn the sequential transitions with massive historical trajectories based on conditional probability. Unfortunately, due to the time and space complexity, these methods (e.g., Markov models) only use the just passed locations to predict next locations, without considering all the passed locations in the trajectory. In this paper, we seek to enhance the prediction performance by considering the travel time from all the passed locations in the query trajectory to a candidate next location. In particular, we propose a novel method, called Travel Time Difference Model (TTDM), which exploits the difference between the shortest travel time and the actual travel time to predict next locations. Further, we integrate the TTDM with a Markov model via a linear interpolation to yield a joint model, which computes the probability of reaching each possible next location and returns the top-rankings as results. We have conducted extensive experiments on two real datasets: the vehicle passage record (VPR) data and the taxi trajectory data. The experimental results demonstrate significant improvements in prediction accuracy over existing solutions. For example, compared with the Markov model, the top-1 accuracy improves by 40% on the VPR data and by 15.6% on the Taxi data.
In this paper, we solve the problem of predicting the next locations of the moving objects with a historical dataset of trajectories. We present a Next Location Predictor with Markov Modeling (NLPMM) which has the following advantages: (1) it considers both individual and collective movement patterns in making prediction, (2) it is effective even when the trajectory data is sparse, (3) it considers the time factor and builds models that are suited to different time periods. We have conducted extensive experiments in a real dataset, and the results demonstrate the superiority of NLPMM over existing methods.