Multivariate Anomaly Detection in Medicare using Model Residuals and Probabilistic Programming

AAAI Conferences

Anomalies in healthcare claims data can indicate fraudulent activity, which contributes a significant portion of overall healthcare costs. Medicare is a large, government-run healthcare program that serves the needs of the elderly in the United States. The growing elderly population and its reliance on Medicare create an environment of rising costs and increased risk of fraud. Detecting these potentially fraudulent activities can recover costs and lessen the overall impact of fraud on the Medicare program. In this paper, we propose a new method to detect fraud by discovering outliers, or anomalies, in payments made to Medicare providers. We employ a multivariate outlier detection method split into two parts. In the first part, we create a multivariate regression model and generate the corresponding residuals. In the second part, these residuals are used as inputs to a generalizable univariate probability model. We build this Bayesian probability model using probabilistic programming. Our results indicate that our model is robust and less dependent on the underlying data distribution than Mahalanobis distance. Moreover, we demonstrate successful anomaly detection within Medicare specialties, providing meaningful results for further investigation.
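A minimal sketch of the two-part pipeline on synthetic data, assuming a linear regression for part one and substituting a simple Gaussian fit to the residuals for the paper's Bayesian probabilistic-programming model; the data, threshold, and variable names are illustrative, not the authors' actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for Medicare payment data: features X, payments y.
X = rng.normal(size=(500, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=1.0, size=500)
y[:5] += 8.0  # inject a few anomalous payments

# Part 1: fit a multivariate regression and compute residuals.
design = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
residuals = y - design @ coef

# Part 2: fit a univariate model to the residuals (here a plain Gaussian,
# as a stand-in for the Bayesian model) and flag low-probability points.
mu, sigma = residuals.mean(), residuals.std()
z = np.abs(residuals - mu) / sigma
anomalies = np.flatnonzero(z > 3.0)  # indices flagged for investigation
```

The injected payments produce residuals far from the bulk of the distribution, so they surface in `anomalies` regardless of the multivariate structure of the original features.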

Bandit Learning Through Biased Maximum Likelihood Estimation

Machine Learning

We propose BMLE, a new family of bandit algorithms formulated in a general way from the biased maximum likelihood estimation method that originally appeared in the adaptive control literature. We design the cost-bias term to handle the exploration-exploitation tradeoff in stochastic bandit problems. We provide an explicit closed-form expression for the index of an arm for Bernoulli bandits, which is trivial to compute. We also provide a general recipe for extending the BMLE algorithm to other families of reward distributions. We prove that for Bernoulli bandits the BMLE algorithm achieves a logarithmic finite-time regret bound and hence attains order optimality. Through extensive simulations, we demonstrate that the proposed algorithms achieve regret performance comparable to the best of several state-of-the-art baselines, while holding a significant computational advantage over the other best-performing methods. The generality of the proposed approach makes it possible to address more complex models, including general adaptive control of Markovian systems.
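To illustrate the idea of an index built from a biased estimate, here is a toy Bernoulli bandit loop where the empirical success rate is inflated by a bias term that grows slowly with time, so under-sampled arms look optimistic. This is only a sketch of the biased-estimation principle; it is not the paper's derived closed-form BMLE index, and the bias schedule is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = np.array([0.3, 0.5, 0.7])  # hypothetical Bernoulli arms
K, T = len(true_means), 5000
pulls = np.zeros(K)
successes = np.zeros(K)

for t in range(1, T + 1):
    # Illustrative biased index: both counts are padded by a slowly growing
    # bias, pulling the estimate toward 1 for lightly sampled arms.
    bias = np.log(t + 1.0)
    index = (successes + bias) / (pulls + bias)
    arm = int(np.argmax(index))
    reward = rng.binomial(1, true_means[arm])
    pulls[arm] += 1
    successes[arm] += reward
```

As the bias grows only logarithmically, suboptimal arms receive on the order of log T pulls while the best arm dominates, mirroring the logarithmic-regret behaviour the paper proves for its index.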

Discovering Emerging Topics in Social Streams via Link Anomaly Detection

Machine Learning

Detection of emerging topics is now receiving renewed interest, motivated by the rapid growth of social networks. Conventional term-frequency-based approaches may not be appropriate in this context, because the information exchanged is not only text but also images, URLs, and videos. We focus on the social aspect of these networks: the links between users that are generated dynamically, intentionally or unintentionally, through replies, mentions, and retweets. We propose a probability model of the mentioning behaviour of a social-network user, and propose to detect the emergence of a new topic from the anomaly measured through the model. We combine the proposed mention anomaly score with a recently proposed change-point detection technique based on Sequentially Discounting Normalized Maximum Likelihood (SDNML), or with Kleinberg's burst model. Aggregating anomaly scores from hundreds of users, we show that we can detect emerging topics based only on the reply/mention relationships in social-network posts. We demonstrate our technique on a number of real data sets gathered from Twitter. The experiments show that the proposed mention-anomaly-based approaches can detect new topics at least as early as the conventional term-frequency-based approach, and sometimes much earlier when the keyword is ill-defined.
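A simplified sketch of a mention anomaly score, assuming the score is the negative log predictive probability of a post's mention count under a smoothed model of the user's past behaviour; the paper's full model (and its SDNML change-point stage) is richer, so treat the function below as a stand-in.

```python
import math
from collections import Counter

def mention_anomaly_score(history, k, alpha=0.5):
    """-log predictive probability of k mentions in a new post, given the
    user's past per-post mention counts (additive smoothing; a simplified
    stand-in for the paper's probability model of mentioning behaviour)."""
    counts = Counter(history)
    support = max(counts) + 2  # counts 0..max seen, plus one unseen bucket
    p = (counts.get(k, 0) + alpha) / (len(history) + alpha * support)
    return -math.log(p)

history = [0, 1, 0, 2, 1, 0, 1, 0, 0, 1]  # typical posts: few mentions
low = mention_anomaly_score(history, 1)   # common behaviour, low score
high = mention_anomaly_score(history, 9)  # sudden burst, high score
```

Aggregating such scores across many users, as the abstract describes, turns individually noisy per-user signals into a stream-level indicator that a new topic is emerging.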

Thompson sampling with the online bootstrap

Machine Learning

Thompson sampling provides a solution to bandit problems in which new observations are allocated to arms according to the posterior probability that an arm is optimal. While sometimes easy to implement and asymptotically optimal, Thompson sampling can be computationally demanding in large-scale bandit problems, and its performance depends on the model fit to the observed data. We introduce bootstrap Thompson sampling (BTS), a heuristic method for solving bandit problems that modifies Thompson sampling by replacing the posterior distribution with a bootstrap distribution. We first explain BTS and show that its performance is competitive with Thompson sampling in the well-studied Bernoulli bandit case. We then detail why BTS using the online bootstrap is more scalable than regular Thompson sampling, and show through simulation that BTS is more robust to a misspecified error distribution. BTS is an appealing modification of Thompson sampling, especially when samples from the posterior are otherwise unavailable or costly to obtain.
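A minimal sketch of BTS for Bernoulli bandits, assuming a double-or-nothing online bootstrap in which each replicate absorbs a new observation with a random Poisson(1) weight; the replicate count, prior pseudo-counts, and arm means are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(2)
true_means = np.array([0.2, 0.5, 0.8])  # hypothetical Bernoulli arms
K, J, T = len(true_means), 50, 3000     # arms, bootstrap replicates, steps

# Per-arm, per-replicate success/trial counts (one pseudo-success and one
# pseudo-failure so every replicate starts at an estimate of 0.5).
succ = np.ones((K, J))
trials = np.full((K, J), 2.0)

for t in range(T):
    # BTS decision: sample one bootstrap replicate per arm, act greedily on
    # its estimate -- the bootstrap distribution stands in for the posterior.
    j = rng.integers(J, size=K)
    est = succ[np.arange(K), j] / trials[np.arange(K), j]
    arm = int(np.argmax(est))
    reward = rng.binomial(1, true_means[arm])
    # Online bootstrap update: each replicate sees the observation with an
    # independent Poisson(1) weight, so replicates stay diverse.
    w = rng.poisson(1.0, size=J)
    succ[arm] += w * reward
    trials[arm] += w
```

Because each step touches only the pulled arm's fixed-size replicate arrays, the per-step cost is constant in the number of past observations, which is the scalability argument the abstract makes for the online bootstrap.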