AITopics

We formulate and study a new variant of the $k$-armed bandit problem, motivated by e-commerce applications. In our model, arms have (stochastic) lifetime after which they expire. In this setting an algorithm needs to continuously explore new arms, in contrast to the standard $k$-armed bandit model in which arms are available indefinitely and exploration is reduced once an optimal arm is identified with near-certainty. The main motivation for our setting is online-advertising, where ads have limited lifetime due to, for example, the nature of their content and their campaign budget. An algorithm needs to choose among a large collection of ads, more than can be fully explored within the ads' lifetime. We present an optimal algorithm for the state-aware (deterministic reward function) case, and build on this technique to obtain an algorithm for the state-oblivious (stochastic reward function) case. Empirical studies on various reward distributions, including one derived from a real-world ad serving application, show that the proposed algorithms significantly outperform the standard multi-armed bandit approaches applied to these settings.

algorithm, artificial intelligence, big data, (21 more...)

Country: North America > United States (0.14)

Industry:

Marketing (0.89)
Information Technology > Services (0.54)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence (1.00)

Factor Modeling for Advertisement Targeting

Chen, Ye, Kapralov, Michael, Canny, John, Pavlov, Dmitry Y.

We adapt a probabilistic latent variable model, namely GaP (Gamma-Poisson), to ad targeting in the contexts of sponsored search (SS) and behaviorally targeted (BT) display advertising. We also approach the important problem of ad positional bias by formulating a one-latent-dimension GaP factorization. Learning from click-through data is intrinsically large scale, even more so for ads. We scale up the algorithm to terabytes of real-world SS and BT data that contains hundreds of millions of users and hundreds of thousands of features, by leveraging the scalability characteristics of the algorithm and the inherent structure of the problem including data sparsity and locality. Specifically, we demonstrate two somewhat orthogonal philosophies of scaling algorithms to large-scale problems, through the SS and BT implementations, respectively. Finally, we report the experimental results using Yahoos vast datasets, and show that our approach substantially outperform the state-of-the-art methods in prediction accuracy. For BT in particular, the ROC area achieved by GaP is exceeding 0.95, while one prior approach using Poisson regression yielded 0.83. For computational performance, we compare a single-node sparse implementation with a parallel implementation using Hadoop MapReduce, the results are counterintuitive yet quite interesting. We therefore provide insights into the underlying principles of large-scale learning.

artificial intelligence, information management, prediction, (21 more...)

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.66)

Industry:

Marketing (1.00)
Information Technology > Services (0.88)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Venkataraman, Shobha, Blum, Avrim, Song, Dawn, Sen, Subhabrata, Spatscheck, Oliver

Tracking Dynamic Sources of Malicious Activity at Internet Scale

We formulate and address the problem of discovering dynamic malicious regions on the Internet. We model this problem as one of adaptively pruning a known decision tree, but with additional challenges: (1) severe space requirements, since the underlying decision tree has over 4 billion leaves, and (2) a changing target function, since malicious activity on the Internet is dynamic. We present a novel algorithm that addresses this problem, by putting together a number of different ``experts algorithms and online paging algorithms. We prove guarantees on our algorithms performance as a function of the best possible pruning of a similar size, and our experiments show that our algorithm achieves high accuracy on large real-world data sets, with significant improvements over existing approaches.

algorithm, artificial intelligence, machine learning, (18 more...)

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Zhang, Liang, Agarwal, Deepak

Fast Computation of Posterior Mode in Multi-Level Hierarchical Models

Multilevel hierarchical models provide an attractive framework for incorporating correlations induced in a response variable that is organized hierarchically. Model fitting is challenging, especially for a hierarchy with a large number of nodes. We provide a novel algorithm based on a multi-scale Kalman filter that is both scalable and easy to implement. For Gaussian response, we show our method provides the maximum a-posteriori (MAP) parameter estimates; for non-Gaussian response, parameter estimation is performed through a Laplace approximation. However, the Laplace approximation provides biased parameter estimates that is corrected through a parametric bootstrap procedure. We illustrate through simulation studies and analyses of real world data sets in health care and online advertising.

artificial intelligence, bayesian inference, hierarchy, (19 more...)

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Marketing (0.67)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Meka, Raghu, Jain, Prateek, Dhillon, Inderjit S.

Matrix Completion from Power-Law Distributed Samples

The low-rank matrix completion problem is a fundamental problem with many important applications. Recently, [4],[13] and [5] obtained the first nontrivial theoretical results for the problem assuming that the observed entries are sampled uniformly at random. Unfortunately, most real-world datasets do not satisfy this assumption, but instead exhibit power-law distributed samples. In this paper, we propose a graph theoretic approach to matrix completion that solves the problem for more realistic sampling models. Our method is simpler to analyze than previous methodswith the analysis reducing to computing the threshold for complete cascades in random graphs, a problem of independent interest. By analyzing the graph theoretic problem, we show that our method achieves exact recovery when the observed entries are sampled from the Chung-Lu-Vu model, which can generate power-lawdistributed graphs. We also hypothesize that our algorithm solves the matrix completion problem from an optimal number of entries for the popular preferentialattachment model and provide strong empirical evidence for the claim. Furthermore, our method is easy to implement and is substantially faster than existing methods. We demonstrate the effectiveness of our method on random instanceswhere the low-rank matrix is sampled according to the prevalent random graph models for complex networks and present promising preliminary results on the Netflix challenge dataset.

artificial intelligence, graph, information technology services, (17 more...)

Country: North America > United States > Texas > Travis County > Austin (0.14)

Industry:

Information Technology > Services (0.36)
Media > Film (0.36)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Chaudhuri, Kamalika, Monteleoni, Claire

Privacy-preserving logistic regression

This paper addresses the important tradeoff between privacy and learnability, when designing algorithms for learning from private databases. First we apply an idea of Dwork et al. to design a specific privacy-preserving machine learning algorithm, logistic regression. This involves bounding the sensitivity of logistic regression, and perturbing the learned classifier with noise proportional to the sensitivity. Noting that the approach of Dwork et al. has limitations when applied to other machine learning algorithms, we then present another privacy-preserving logistic regression algorithm. The algorithm is based on solving a perturbed objective, and does not depend on the sensitivity. We prove that our algorithm preserves privacy in the model due to Dwork et al., and we provide a learning performance guarantee. Our work also reveals an interesting connection between regularization and privacy.

algorithm, artificial intelligence, machine learning, (19 more...)

Country: North America > United States > California (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Goldenberg, Anna, Zheng, Alice X, Fienberg, Stephen E, Airoldi, Edoardo M

A survey of statistical network models

arXiv.org Machine LearningDec-29-2009

Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.

node, télécommunications, us government, (23 more...)

arXiv.org Machine Learning

0912.5410

Country:

Europe (0.92)
North America > United States > Massachusetts > Middlesex County (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry:

Media (1.00)
Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(5 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(5 more...)

AAAI ConferencesDec-9-2009

AutoMed - An Automated Mediator for Multi-Issue Bilateral Negotiations

Chalamish, Michal (Ashkelon Academic College) | Kraus, Sarit (Bar Ilan University)

In this paper, we present AutoMed, an automated mediator for multi-issue bilateral negotiation under time constraints. AutoMed uses a qualitative model to represent the negotiators' preferences. It analyzes the negotiators' preferences, monitors the negotiations and proposes possible solutions for resolving the conflict. We conducted experiments in a simulated environment. The results show that negotiations mediated by AutoMed are concluded significantly faster than non-mediated ones and without any of the negotiators opting out. Furthermore, the subjects in the mediated negotiations are more satisfied from the resolutions than the subjects in the non-mediated negotiations.

agreement, alternative dispute resolution, health & medicine, (17 more...)

AAAI Conferences

Third International Conference on Computational Cultural Dynamics

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Government (0.93)
Information Technology (0.93)
Health & Medicine (0.68)
Law > Alternative Dispute Resolution (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Game Theory (0.70)

Greensmith, Julie, Aickelin, Uwe, Twycross, Jamie

Articulation and Clarification of the Dendritic Cell Algorithm

arXiv.org Artificial IntelligenceOct-26-2009

The Dendritic Cell algorithm (DCA) is inspired by recent work in innate immunity. In this paper a formal description of the DCA is given. The DCA is described in detail, and its use as an anomaly detector is illustrated within the context of computer security. A port scan detection task is performed to substantiate the influence of signal selection on the behaviour of the algorithm. Experimental results provide a comparison of differing input signal mappings.

antigen, health & medicine, immunology, (20 more...)

arXiv.org Artificial Intelligence

0910.4903

Country: Europe > United Kingdom (0.14)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.71)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Software (0.69)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.37)