 Learning Graphical Models


A Methodology for Customizing Clinical Tests for Esophageal Cancer based on Patient Preferences

arXiv.org Machine Learning

Tests for esophageal cancer can be expensive and uncomfortable, and can have side effects. For many patients, we can predict non-existence of disease with 100% certainty, just using demographics, lifestyle, and medical history information. Our objective is to devise a general methodology for customizing tests using user preferences so that expensive or uncomfortable tests can be avoided. We propose to use classifiers trained from electronic health records (EHR) for selection of tests. The key idea is to design classifiers with a 0% false normal rate (100% sensitivity), possibly at the cost of higher false abnormals. We compare Naive Bayes classification (NB), Random Forests (RF), Support Vector Machines (SVM) and Logistic Regression (LR), and find kernel logistic regression to be most suitable for the task. We propose an algorithm for finding the best probability threshold for kernel LR, based on test set accuracy. Using the proposed algorithm, we describe schemes for selecting tests, which appear as features in the automatic classification algorithm, using the users' preferences on cost and discomfort. We test our methodology with EHRs collected for more than 3000 patients, as part of a project carried out by a reputed hospital in Mumbai, India. Kernel SVM and kernel LR with a polynomial kernel of degree 3 yield an accuracy of 99.8% and a sensitivity of 100% without the MP features, i.e. using only clinical tests. We demonstrate our test selection algorithm using two case studies, one using the cost of clinical tests and the other using "discomfort" values for clinical tests. We compute the test sets corresponding to the lowest false abnormals for each criterion described above, using exhaustive enumeration of 15 clinical tests. The sets turn out to be different, substantiating our claim that one can customize test sets based on user preferences.
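
The thresholding idea in the abstract above (never label a diseased patient as normal, at the price of more false abnormals) can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes a classifier has already produced a disease probability per patient, and the names `pick_threshold` and `confusion` and the toy data are hypothetical.

```python
def pick_threshold(probs, labels):
    """Largest threshold t such that every abnormal case (label 1) has prob >= t.

    Calling "abnormal" whenever prob >= t then yields no false normals,
    and any larger t would miss at least one abnormal case.
    """
    positives = [p for p, y in zip(probs, labels) if y == 1]
    if not positives:
        raise ValueError("need at least one abnormal case")
    return min(positives)


def confusion(probs, labels, threshold):
    """Count (false normals, false abnormals) at the given threshold."""
    false_normals = sum(1 for p, y in zip(probs, labels)
                        if y == 1 and p < threshold)
    false_abnormals = sum(1 for p, y in zip(probs, labels)
                          if y == 0 and p >= threshold)
    return false_normals, false_abnormals


# Toy, made-up scores: 1 = disease present, 0 = disease absent.
probs = [0.05, 0.10, 0.35, 0.40, 0.80, 0.90]
labels = [0, 0, 1, 0, 1, 1]

threshold = pick_threshold(probs, labels)
false_normals, false_abnormals = confusion(probs, labels, threshold)
print(threshold, false_normals, false_abnormals)
```

In practice the threshold would be chosen on held-out data, since a threshold fitted to one sample does not guarantee zero false normals on new patients.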


Learning Protein Dynamics with Metastable Switching Systems

arXiv.org Machine Learning

We introduce a machine learning approach for extracting fine-grained representations of protein evolution from molecular dynamics datasets. Metastable switching linear dynamical systems extend standard switching models with a physically-inspired stability constraint. This constraint enables the learning of nuanced representations of protein dynamics that closely match physical reality. We derive an EM algorithm for learning, where the E-step extends the forward-backward algorithm for HMMs and the M-step requires the solution of large biconvex optimization problems. We construct an approximate semidefinite program solver based on the Frank-Wolfe algorithm and use it to solve the M-step. We apply our EM algorithm to learn accurate dynamics from large simulation datasets for the opioid peptide met-enkephalin and the proto-oncogene Src-kinase. Our learned models demonstrate significant improvements in temporal coherence over HMMs and standard switching models for met-enkephalin, and sample transition paths (possibly useful in rational drug design) for Src-kinase.
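
For reference, the standard HMM forward-backward recursion that the E-step is said to extend can be sketched as follows. This is the textbook algorithm for a discrete-state HMM, not the paper's extended version for switching linear dynamical systems. The function name, argument layout, and toy numbers are assumptions made for illustration.

```python
def forward_backward(init, trans, emis_lik):
    """Posterior state marginals P(z_t = i | x_1..x_T) for an HMM.

    init[i]        : P(z_0 = i)
    trans[i][j]    : P(z_{t+1} = j | z_t = i)
    emis_lik[t][i] : likelihood of observation t under state i
    """
    T, K = len(emis_lik), len(init)

    # Forward pass, normalizing each step to avoid underflow.
    alpha = []
    cur = [init[i] * emis_lik[0][i] for i in range(K)]
    total = sum(cur)
    alpha.append([c / total for c in cur])
    for t in range(1, T):
        cur = [emis_lik[t][j] * sum(alpha[t - 1][i] * trans[i][j]
                                    for i in range(K))
               for j in range(K)]
        total = sum(cur)
        alpha.append([c / total for c in cur])

    # Backward pass, also rescaled at each step.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        cur = [sum(trans[i][j] * emis_lik[t + 1][j] * beta[t + 1][j]
                   for j in range(K))
               for i in range(K)]
        total = sum(cur)
        beta[t] = [c / total for c in cur]

    # Combine and renormalize; the per-step scaling constants cancel here.
    gamma = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(K)]
        total = sum(g)
        gamma.append([x / total for x in g])
    return gamma


# Toy two-state example with "sticky" (metastable-looking) transitions.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emis_lik = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
gamma = forward_backward(init, trans, emis_lik)
```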


A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

arXiv.org Machine Learning

Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers. Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout. This grounding of dropout in approximate Bayesian inference suggests an extension of the theoretical results, offering insights into the use of dropout with RNN models. We apply this new variational inference based dropout technique in LSTM and GRU models, assessing it on language modelling and sentiment analysis tasks. The new approach outperforms existing techniques, and to the best of our knowledge improves on the single model state-of-the-art in language modelling with the Penn Treebank (73.4 test perplexity). This extends our arsenal of variational tools in deep learning.
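
The abstract does not describe the technique in detail. As we understand the variational approach, its distinguishing feature is that one dropout mask is sampled per sequence and reused at every time step, on the recurrent connections as well as the inputs, instead of resampling a mask at each step. The sketch below shows that idea on a plain tanh RNN in pure Python, not on the LSTM and GRU models used in the paper. All names, weights, and inputs are made up for illustration.

```python
import math
import random


def sample_mask(size, drop_prob, rng):
    """Inverted-dropout mask: 0 for dropped units, 1/keep for kept ones."""
    keep = 1.0 - drop_prob
    return [1.0 / keep if rng.random() < keep else 0.0 for _ in range(size)]


def rnn_forward(xs, w_in, w_rec, drop_prob, rng):
    """Run a tanh RNN over xs with masks shared across all time steps."""
    n_in, hidden = len(xs[0]), len(w_rec)
    # Sampled once per sequence, then reused at every step.
    mask_x = sample_mask(n_in, drop_prob, rng)
    mask_h = sample_mask(hidden, drop_prob, rng)
    h = [0.0] * hidden
    states = []
    for x in xs:
        x_d = [xi * m for xi, m in zip(x, mask_x)]
        h_d = [hi * m for hi, m in zip(h, mask_h)]
        h = [math.tanh(sum(w_in[j][i] * x_d[i] for i in range(n_in))
                       + sum(w_rec[j][k] * h_d[k] for k in range(hidden)))
             for j in range(hidden)]
        states.append(h)
    return states, mask_x, mask_h


# Made-up weights: 2 inputs, 3 hidden units, a sequence of 3 steps.
xs = [[1.0, 0.5], [0.2, -0.3], [0.0, 1.0]]
w_in = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
w_rec = [[0.1, 0.0, 0.2], [0.0, 0.1, -0.1], [0.3, -0.2, 0.1]]
states, mask_x, mask_h = rnn_forward(xs, w_in, w_rec, drop_prob=0.25,
                                     rng=random.Random(0))
```

Resampling `mask_x` and `mask_h` inside the loop would give the per-step dropout that the abstract reports as failing on recurrent layers.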


Modeling State-Conditional Observation Distribution using Weighted Stereo Samples for Factorial Speech Processing Models

arXiv.org Artificial Intelligence

This paper investigates the effectiveness of factorial speech processing models in noise-robust automatic speech recognition tasks. For this purpose, the paper proposes an idealistic approach for modeling the state-conditional observation distribution of factorial models based on weighted stereo samples. This approach extends the earlier single-pass retraining method for ideal model compensation to support multiple audio sources. Non-stationary noises can be considered as one of these audio sources with multiple states. Experiments on set A of the Aurora 2 dataset show that recognition performance can be improved by this consideration. The improvement is significant in low signal-to-noise ratio conditions, up to 4% absolute word recognition accuracy. Besides representing the state-conditional observation distribution accurately, the proposed method has an important advantage over previous methods: it allows the feature spaces for source and corrupted features to be selected independently. This opens a new window for seeking better feature spaces appropriate for noisy speech, independent of clean speech features.


Sarcasm Detection with Machine Learning in Spark

#artificialintelligence

This post is inspired by a site I found whilst searching for a way to detect sarcasm within sentences. As humans we sometimes struggle to detect sarcasm, even when we have a lot more contextual information available to us. People are emotive when they speak; they use certain tones, and these traits can help us understand when someone is being sarcastic. However, we don't always catch it! So how the hell could a computer detect this, when all it has is text?


Conditional Random Fields (CRF): Short Survey

@machinelearnbot

Currently, many of us are overwhelmed by the mighty power of Deep Learning. We start to forget about humble graphical models. CRF is not as trendy as LSTM, but it is robust, reliable and worth noting. In this post, you will find a short summary of CRF (aka Conditional Random Fields) – what this thing is, what it is for, and some interesting facts. In practical implementations, the computational time is often larger due to many other operations like numerical scaling, smoothing etc.


MimicA: A Framework for Self-Learning Companion AI Behavior

AAAI Conferences

We explore fully autonomous companion characters within the context of Real Time Strategy (RTS) games. Non-player characters (NPCs) that are controlled by Artificial Intelligence to some degree have been a feature of Role Playing games for decades. But RTS games rarely have a player avatar, and thus no real companions. The universe of RTS games where both an avatar and a companion character exist is small. Most friendly RTS units are semi-autonomous at best, requiring player micromanagement of their behavior. We present MimicA, a real-time framework to govern AI companion behavior by modeling that of the current player. Built for the Unity engine, MimicA is a learn-by-demonstration framework that differs from existing practices in that the behavior is fully autonomous, does not rely on previous modeling exercises and is designed to be generalized and extensible. We analyze and discuss MimicA through a thirty-person user study with our own demonstration game, Lord of Towers. We find that 22 out of 30 participants (73%) indicate they enjoyed the game, and this self-reported enjoyment was on par with “traditional tower defense games”. 63% agree that MimicA-controlled NPCs are doing what the player would do, while 20% disagree. Similarly, 53% realize the NPCs are learning from the player, while 20% do not. We also show that NPCs with underlying Decision Tree and Naive Bayes algorithms are better than those using KNN at making the player realize the learning nature of the NPC.


A Generalized Multidimensional Evaluation Framework for Player Goal Recognition

AAAI Conferences

Recent years have seen a growing interest in player modeling, which supports the creation of player-adaptive digital games. A central problem of player modeling is goal recognition, which aims to recognize players’ intentions from observable gameplay behaviors. Player goal recognition offers the promise of enabling games to dynamically adjust challenge levels, perform procedural content generation, and create believable NPC interactions. A growing body of work is investigating a wide range of machine learning-based goal recognition models. In this paper, we introduce GOALIE, a multidimensional framework for evaluating player goal recognition models. The framework integrates multiple metrics for player goal recognition models, including two novel metrics, n-early convergence rate and standardized convergence point. We demonstrate the application of the GOALIE framework with the evaluation of several player goal recognition models, including Markov logic network-based, deep feedforward neural network-based, and long short-term memory network-based goal recognizers on two different educational games. The results suggest that GOALIE effectively captures goal recognition behaviors that are key to next-generation player modeling.


Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

arXiv.org Machine Learning

Deep learning tools have gained tremendous attention in applied machine learning. However such tools for regression and classification do not capture model uncertainty. In comparison, Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost. In this paper we develop a new theoretical framework casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes. A direct result of this theory gives us tools to model uncertainty with dropout NNs -- extracting information from existing models that has been thrown away so far. This mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy. We perform an extensive study of the properties of dropout's uncertainty. Various network architectures and non-linearities are assessed on tasks of regression and classification, using MNIST as an example. We show a considerable improvement in predictive log-likelihood and RMSE compared to existing state-of-the-art methods, and finish by using dropout's uncertainty in deep reinforcement learning.
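
The practical recipe usually associated with this result is to keep dropout switched on at prediction time, run several stochastic forward passes, and read the mean of the outputs as the prediction and their spread as a measure of model uncertainty. The sketch below shows this on a tiny hand-written network in pure Python. The weights, function names, and number of passes are made up, and the sample variance is used as a simplified uncertainty estimate that omits any observation-noise term.

```python
import random
import statistics


def dropout_forward(x, w1, w2, drop_prob, rng):
    """One stochastic pass: ReLU hidden layer with dropout, linear output."""
    keep = 1.0 - drop_prob
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    # Inverted dropout: drop each unit with probability drop_prob, rescale the rest.
    hidden = [h / keep if rng.random() < keep else 0.0 for h in hidden]
    return sum(w * h for w, h in zip(w2, hidden))


def mc_dropout_predict(x, w1, w2, drop_prob=0.5, passes=200, seed=0):
    """Mean and variance of the output over repeated dropout passes."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, w1, w2, drop_prob, rng)
               for _ in range(passes)]
    return statistics.mean(samples), statistics.pvariance(samples)


# Made-up weights: 2 inputs, 3 hidden units, 1 output.
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w2 = [0.5, -0.5, 1.0]
x = [1.0, 2.0]

mean, variance = mc_dropout_predict(x, w1, w2, drop_prob=0.5)
print(mean, variance)
```

With `drop_prob=0.0` every pass is identical, so the variance collapses to zero and the mean equals the ordinary deterministic output.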


Hidden Markov Models for Regime Detection using R - QuantStart

#artificialintelligence

In the previous article in the series Hidden Markov Models were introduced. They were discussed in the context of the broader class of Markov Models. They were motivated by the need for quantitative traders to have the ability to detect market regimes in order to adjust how their quant strategies are managed. In particular it was mentioned that "various regimes lead to adjustments of asset returns via shifts in their means, variances/volatilities, serial correlation and covariances, which impact the effectiveness of time series methods that rely on stationarity". This has a significant bearing on how trading strategies are modified throughout the strategy lifecycle.