On the Prior Sensitivity of Thompson Sampling

arXiv.org Machine Learning

The empirically successful Thompson Sampling algorithm for stochastic bandits has drawn much interest in understanding its theoretical properties. One important benefit of the algorithm is that it allows domain knowledge to be conveniently encoded as a prior distribution to balance exploration and exploitation more effectively. While it is generally believed that the algorithm's regret is low (high) when the prior is good (bad), little is known about the exact dependence. In this paper, we fully characterize the algorithm's worst-case dependence of regret on the choice of prior, focusing on a special yet representative case. These results also provide insights into the general sensitivity of the algorithm to the choice of priors. In particular, with $p$ being the prior probability mass of the true reward-generating model, we prove $O(\sqrt{T/p})$ and $O(\sqrt{(1-p)T})$ regret upper bounds for the bad- and good-prior cases, respectively, as well as \emph{matching} lower bounds. Our proofs rely on the discovery of a fundamental property of Thompson Sampling and make heavy use of martingale theory, both of which appear novel in the literature, to the best of our knowledge.

Discovering Emerging Topics in Social Streams via Link Anomaly Detection

arXiv.org Machine Learning

Detection of emerging topics are now receiving renewed interest motivated by the rapid growth of social networks. Conventional term-frequency-based approaches may not be appropriate in this context, because the information exchanged are not only texts but also images, URLs, and videos. We focus on the social aspects of theses networks. That is, the links between users that are generated dynamically intentionally or unintentionally through replies, mentions, and retweets. We propose a probability model of the mentioning behaviour of a social network user, and propose to detect the emergence of a new topic from the anomaly measured through the model. We combine the proposed mention anomaly score with a recently proposed change-point detection technique based on the Sequentially Discounting Normalized Maximum Likelihood (SDNML), or with Kleinberg's burst model. Aggregating anomaly scores from hundreds of users, we show that we can detect emerging topics only based on the reply/mention relationships in social network posts. We demonstrate our technique in a number of real data sets we gathered from Twitter. The experiments show that the proposed mention-anomaly-based approaches can detect new topics at least as early as the conventional term-frequency-based approach, and sometimes much earlier when the keyword is ill-defined.

Intrusion Detection using Continuous Time Bayesian Networks

Journal of Artificial Intelligence Research

Intrusion detection systems (IDSs) fall into two high-level categories: network-based systems (NIDS) that monitor network behaviors, and host-based systems (HIDS) that monitor system calls. In this work, we present a general technique for both systems. We use anomaly detection, which identifies patterns not conforming to a historic norm. In both types of systems, the rates of change vary dramatically over time (due to burstiness) and over components (due to service difference). To efficiently model such systems, we use continuous time Bayesian networks (CTBNs) and avoid specifying a fixed update interval common to discrete-time models.

Multivariate Anomaly Detection in Medicare using Model Residuals and Probabilistic Programming

AAAI Conferences

Anomalies in healthcare claims data can be indicative of possible fraudulent activities, contributing to a significant portion of overall healthcare costs. Medicare is a large government run healthcare program that serves the needs of the elderly in the United States. The increasing elderly population and their reliance on the Medicare program create an environment with rising costs and increased risk of fraud. The detection of these potentially fraudulent activities can recover costs and lessen the overall impact of fraud on the Medicare program. In this paper, we propose a new method to detect fraud by discovering outliers, or anomalies, in payments made to Medicare providers. We employ a multivariate outlier detection method split into two parts. In the first part, we create a multivariate regression model and generate corresponding residuals. In the second part, these residuals are used as inputs into a generalizable univariate probability model. We create this Bayesian probability model using probabilistic programming. Our results indicate our model is robust and less dependent on underlying data distributions, versus Mahalanobis distance. Moreover, we are able to demonstrate successful anomaly detection, within Medicare specialties, providing meaningful results for further investigation.

Bandit Learning Through Biased Maximum Likelihood Estimation

arXiv.org Machine Learning

We propose BMLE, a new family of bandit algorithms, that are formulated in a general way based on the Biased Maximum Likelihood Estimation method originally appearing in the adaptive control literature. We design the cost-bias term to tackle the exploration and exploitation tradeoff for stochastic bandit problems. We provide an explicit closed form expression for the index of an arm for Bernoulli bandits, which is trivial to compute. We also provide a general recipe for extending the BMLE algorithm to other families of reward distributions. We prove that for Bernoulli bandits, the BMLE algorithm achieves a logarithmic finite-time regret bound and hence attains order-optimality. Through extensive simulations, we demonstrate that the proposed algorithms achieve regret performance comparable to the best of several state-of-the-art baseline methods, while having a significant computational advantage in comparison to other best performing methods. The generality of the proposed approach makes it possible to address more complex models, including general adaptive control of Markovian systems.