Tracking Changing Probabilities via Dynamic Learners
Madani, Omid
Consider a predictor, a learner, whose input is a stream of discrete items. The predictor's task, at every time point, is probabilistic multiclass prediction, i.e., to predict which item may occur next by outputting zero or more candidate items, each with a probability, after which the actual item is revealed and the predictor learns from this observation. To output probabilities, the predictor keeps track of the proportions of the items it has seen. The predictor has constant (limited) space, and we seek efficient prediction and update techniques: the stream is unbounded, the set of items is unknown to the predictor, and their number can also grow without bound. Moreover, there is non-stationarity: the underlying frequencies of items may change, substantially, from time to time. For instance, new items may start appearing and a few currently frequent items may cease to occur altogether. The predictor, being space-bounded, need only provide probabilities for those items with (currently) sufficiently high frequency, i.e., the salient items. This problem is motivated in the setting of prediction games, a self-supervised learning regime where concepts serve as both the predictors and the predictands, and the set of concepts grows over time, resulting in non-stationarities as new concepts are generated and used. We develop moving-average techniques designed to respond to such non-stationarities in a timely manner, and explore their properties. One is a simple technique based on queuing of count snapshots, and another combines queuing with an extended version of sparse EMA (exponential moving average). The latter combination supports predictand-specific dynamic learning rates. We find that this flexibility allows for more accurate and timely convergence.
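To make the ingredients concrete, below is a minimal Python sketch of a sparse moving-average predictor, assuming one plausible reading of "sparse EMA with per-item dynamic learning rates" and a queue of recent observations standing in for count snapshots; the class name, rate schedule, and eviction rule are illustrative assumptions, not the paper's exact algorithms.

    from collections import deque, defaultdict

    class SparseEMAPredictor:
        """Sketch of a space-bounded predictor of item proportions.

        Only observed items get entries (sparsity); estimates decay toward
        the newest observations so they can follow changing frequencies.
        The learning-rate schedule and eviction rule are illustrative.
        """

        def __init__(self, base_rate=0.01, max_items=100, window=500):
            self.base_rate = base_rate          # floor on the learning rate
            self.max_items = max_items          # constant-space cap on tracked items
            self.prob = {}                      # item -> estimated probability
            self.seen = defaultdict(int)        # item -> observation count
            self.recent = deque(maxlen=window)  # queue of recent observations

        def predict(self, min_prob=0.01):
            """Return the currently salient items with estimated probabilities."""
            return {x: p for x, p in self.prob.items() if p >= min_prob}

        def update(self, item):
            """Observe the next item and move all estimates toward it."""
            self.recent.append(item)
            self.seen[item] += 1
            # Item-specific dynamic rate: large while an item is new (fast
            # convergence), decaying toward the base rate with more evidence.
            rate = max(self.base_rate, 1.0 / self.seen[item])
            # Decay every tracked item, then boost the observed one (sparse EMA).
            for x in list(self.prob):
                self.prob[x] *= (1.0 - rate)
            self.prob[item] = self.prob.get(item, 0.0) + rate
            # Constant space: evict the lowest-probability item if over capacity.
            if len(self.prob) > self.max_items:
                victim = min(self.prob, key=self.prob.get)
                del self.prob[victim]
                del self.seen[victim]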
Binomial Tails for Community Analysis
Madani, Omid, Ngo, Thanh, Zeng, Weifei, Averine, Sai Ankith, Evuru, Sasidhar, Malhotra, Varun, Gandham, Shashidhar, Yadav, Navindra
Automated discovery of candidate communities in networks finds a variety of applications in the physical and social sciences (biological and biochemical networks, physical and virtual human networks) [1, 2]. Given a graph representing binary relations among nodes, informally and intuitively, a community corresponds to a subgraph, i.e., a subset of nodes, with relatively high edge density among the community members (nodes of the subgraph) and comparatively lower density of edges going outside the community. Defining communities more precisely, characterizing what overall community structure may look like in various domains, and designing efficient, robust algorithms for uncovering such structure in networks have been the subject of much research [1, 3]. In our use case, we are interested in the automated discovery and effective presentation of candidate communities comprised of computers (hosts) in an enterprise network. In particular, this effort is a component of a tool that gives a user, such as a security administrator of an organization, visibility into their complex network and, importantly, helps the user partition the network into groups corresponding to geographic partitions, different departments, and hosts running different applications in the organization. This partitioning and naming of the groups is a necessary step in defining and maintaining network security policies, aka network segmentation: hosts in different groups (segments) can only communicate on a few well-defined and restricted channels. Such policy enforcement severely limits the penetration and spread of malware and hackers. This step of grouping hosts and assigning meaningful names/labels to the groups, with the human in the loop, is also highly useful in generating insights, for example in uncovering broad patterns of communication, with applications not just in security but also in network optimization.
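As a minimal sketch of how a binomial tail could score how surprisingly well a host connects into a candidate community (the null model, score, and parameter names are illustrative assumptions, not the paper's formulation; SciPy is assumed available):

    from scipy.stats import binom

    def binomial_tail_score(k, d, community_size, num_nodes):
        """Upper binomial tail P(X >= k) for k of a node's d edges landing
        inside a candidate community, under an illustrative null model where
        each edge endpoint falls in the community with probability
        proportional to its size. A smaller tail suggests stronger evidence
        of membership."""
        p_null = community_size / float(num_nodes)
        # binom.sf(k - 1, d, p) = P(X >= k) for X ~ Binomial(d, p)
        return binom.sf(k - 1, d, p_null)

    # Example: a host with 20 connections, 15 of them to a 50-host group
    # in a 1000-host network.
    print(binomial_tail_score(k=15, d=20, community_size=50, num_nodes=1000))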
Reports of the 2013 AAAI Spring Symposium Series
Markman, Vita (Disney Interactive Studios) | Stojanov, Georgi (American University of Paris) | Indurkhya, Bipin (International Institute of Information Technology) | Kido, Takashi (Rikengenesis) | Takadama, Keiki (University of Electro-Communications) | Konidaris, George (Massachusetts Institute of Technology) | Eaton, Eric (Bryn Mawr College) | Matsumura, Naohiro (Osaka University) | Fruchter, Renate (Stanford University) | Sofge, Donald (Naval Research Laboratory) | Lawless, William (Paine College) | Madani, Omid (Google) | Sukthankar, Rahul (Google)
The Association for the Advancement of Artificial Intelligence was pleased to present the AAAI 2013 Spring Symposium Series, held Monday through Wednesday, March 25-27, 2013. The titles of the eight symposia were Analyzing Microtext, Creativity and (Early) Cognitive Development, Data Driven Wellness: From Self-Tracking to Behavior Change, Designing Intelligent Robots: Reintegrating AI II, Lifelong Machine Learning, Shikakeology: Designing Triggers for Behavior Change, Trust and Autonomous Systems, and Weakly Supervised Learning from Multimedia. This report contains summaries of the symposia, written, in most cases, by the cochairs of the symposium.
Polynomial Value Iteration Algorithms for Deterministic MDPs
Madani, Omid
Value iteration is a commonly used and empirically competitive method for solving many Markov decision process problems. However, it is known that value iteration has only pseudo-polynomial complexity in general. We establish a somewhat surprising polynomial bound for value iteration on deterministic Markov decision process (DMDP) problems. We show that the basic value iteration procedure converges to the highest average-reward cycle on a DMDP problem in Θ(n^2) iterations, or Θ(mn^2) total time, where n denotes the number of states and m the number of edges. We give two extensions of value iteration that solve the DMDP in Θ(mn) time. We explore the analysis of policy iteration algorithms and report on an empirical study of value iteration showing that its convergence is much faster on random sparse graphs.
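For concreteness, here is a minimal sketch of basic value iteration on a DMDP viewed as a directed graph with a reward on each edge; the fixed iteration count and data layout are illustrative, and this is the basic procedure only, not the faster extensions analyzed in the paper.

    def value_iteration_dmdp(n, edges, num_iters):
        """Basic value iteration on a deterministic MDP (a sketch).

        n         -- number of states, labeled 0..n-1 (each assumed to have
                     at least one outgoing edge)
        edges     -- list of (u, v, reward) deterministic transitions
        num_iters -- number of Bellman updates to run

        V_t[s] is the best total reward over paths of t steps from s, so
        V_t[s]/t approaches the value of the best average-reward cycle
        reachable from s; per the abstract, Θ(n^2) iterations suffice for
        the basic procedure, while here we simply run a fixed number.
        """
        V = [0.0] * n
        for _ in range(num_iters):
            V_next = [float("-inf")] * n
            for u, v, r in edges:
                V_next[u] = max(V_next[u], r + V[v])
            V = V_next
        return V, [v / num_iters for v in V]  # values and per-step gain estimates

    # Toy example: a 3-cycle with mean reward 2 and a self-loop with reward 1.
    edges = [(0, 1, 3.0), (1, 2, 1.0), (2, 0, 2.0), (2, 2, 1.0)]
    V, gain = value_iteration_dmdp(3, edges, num_iters=30)
    print(gain)  # approaches [2.0, 2.0, 2.0], the best reachable cycle mean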
Budgeted Learning of Naive-Bayes Classifiers
Lizotte, Daniel J., Madani, Omid, Greiner, Russell
Frequently, acquiring training data has an associated cost. We consider the situation where the learner may purchase data during training, subject to a budget. In particular, we examine the case where each feature label has an associated cost, and the total cost of all feature labels acquired during training must not exceed the budget. This paper compares methods for choosing which feature label to purchase next, given the budget and the current belief state of naive Bayes model parameters. Whereas active learning has traditionally focused on myopic (greedy) strategies for query selection, this paper presents a tractable method for incorporating knowledge of the budget into the decision-making process, which improves performance.
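To illustrate the flavor of such purchase policies (this is a simple myopic stand-in, not the paper's budget-aware method), a greedy choice of which feature label to buy next under Beta beliefs over naive Bayes parameters might look like this:

    # Illustrative greedy policy for budgeted feature-label purchases.
    # The belief over each naive Bayes parameter P(feature = 1 | class) is a
    # Beta(alpha, beta); we buy the affordable label whose posterior is
    # currently the most uncertain (highest variance).

    def beta_variance(alpha, beta):
        s = alpha + beta
        return alpha * beta / (s * s * (s + 1.0))

    def choose_purchase(beliefs, costs, budget):
        """beliefs: {(feature, cls): (alpha, beta)}, costs: {feature: cost}."""
        affordable = [(f, c) for (f, c) in beliefs if costs[f] <= budget]
        if not affordable:
            return None
        # Pick the affordable parameter we are least certain about.
        return max(affordable, key=lambda fc: beta_variance(*beliefs[fc]))

    beliefs = {("fever", "flu"): (2.0, 1.0), ("cough", "flu"): (1.0, 1.0)}
    costs = {"fever": 3.0, "cough": 1.0}
    print(choose_purchase(beliefs, costs, budget=2.0))  # ('cough', 'flu')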
Active Model Selection
Madani, Omid, Lizotte, Daniel J., Greiner, Russell
Classical learning assumes the learner is given a labeled data sample, from which it learns a model. The field of Active Learning deals with the situation where the learner begins not with a training sample, but instead with resources that it can use to obtain information to help identify the optimal model. To better understand this task, this paper presents and analyzes the simplified "(budgeted) active model selection" version, which captures the pure exploration aspect of many active learning problems in a clean and simple problem formulation. Here the learner can use a fixed budget of "model probes" (where each probe evaluates the specified model on a random indistinguishable instance) to identify which of a given set of possible models has the highest expected accuracy. Our goal is a policy that sequentially determines which model to probe next, based on the information observed so far. We present a formal description of this task and show that it is NP-hard in general. We then investigate a number of algorithms for this task, including several existing ones (e.g., "Round-Robin", "Interval Estimation", "Gittins") as well as some novel ones (e.g., "Biased-Robin"), describing first their approximation properties and then their empirical performance on various problem instances. We observe empirically that the simple Biased-Robin algorithm significantly outperforms the other algorithms in the case of identical costs and priors.
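A minimal sketch of a Biased-Robin-style probing policy, assuming Bernoulli probe outcomes (correct/incorrect) and a uniform probe budget; the advancement and tie-breaking rules are illustrative guesses, not necessarily the paper's exact algorithm.

    import random

    def biased_robin(models, budget, probe):
        """Illustrative Biased-Robin-style selection of the most accurate model.

        models -- list of model identifiers
        budget -- total number of probes allowed
        probe  -- probe(model) -> 1 if the model classified a random
                  instance correctly, else 0

        Keep probing the same model while it keeps succeeding; on a failure,
        advance (round-robin) to the next model. Return the model with the
        highest empirical accuracy when the budget is exhausted.
        """
        wins = {m: 0 for m in models}
        tries = {m: 0 for m in models}
        i = 0
        for _ in range(budget):
            m = models[i % len(models)]
            outcome = probe(m)
            wins[m] += outcome
            tries[m] += 1
            if outcome == 0:          # a miss: move on to the next model
                i += 1
        return max(models, key=lambda m: wins[m] / tries[m] if tries[m] else 0.0)

    # Toy usage: three "models" with hidden accuracies 0.6, 0.8, 0.7.
    acc = {"A": 0.6, "B": 0.8, "C": 0.7}
    pick = biased_robin(list(acc), budget=100,
                        probe=lambda m: int(random.random() < acc[m]))
    print(pick)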
An Empirical Comparison of Algorithms for Aggregating Expert Predictions
Dani, Varsha, Madani, Omid, Pennock, David M, Sanghai, Sumit, Galebach, Brian
Predicting the outcomes of future events is a challenging problem for which a variety of solution methods have been explored. We present an empirical comparison of a variety of online and offline adaptive algorithms for aggregating experts' predictions of the outcomes of five years of US National Football League games (1319 games), using expert probability elicitations obtained from an Internet contest called ProbabilitySports. We find that it is difficult to improve over simple averaging of the predictions in terms of prediction accuracy, but that there is room for improvement in quadratic loss. Somewhat surprisingly, a Bayesian estimation algorithm that estimates the variance of each expert's prediction exhibits the most consistent superior performance over simple averaging among our collection of algorithms.
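One simple way to make aggregation variance-aware, in the spirit of the Bayesian estimator favored in the study (the paper's exact estimator is not reproduced here; the weighting rule below is an illustrative stand-in), is precision-weighted averaging of the experts' probabilities:

    def precision_weighted_average(expert_probs, expert_variances, eps=1e-6):
        """Combine experts' probability forecasts for a binary outcome by
        weighting each expert inversely to its estimated prediction variance.
        Variance estimates are assumed given; eps guards against division
        by zero for an expert with near-zero estimated variance."""
        weights = [1.0 / (v + eps) for v in expert_variances]
        total = sum(weights)
        return sum(w * p for w, p in zip(weights, expert_probs)) / total

    # Three experts forecast the home team winning; the middle expert is noisiest.
    print(precision_weighted_average([0.7, 0.4, 0.65], [0.01, 0.09, 0.02]))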