Data Mining
Polarimetric SAR Image Smoothing with Stochastic Distances
Torres, Leonardo, Medeiros, Antonio C., Frery, Alejandro C.
Polarimetric Synthetic Aperture Radar (PolSAR) images are establishing as an important source of information in remote sensing applications. The most complete format this type of imaging produces consists of complex-valued Hermitian matrices in every image coordinate and, as such, their visualization is challenging. They also suffer from speckle noise which reduces the signal-to-noise ratio. Smoothing techniques have been proposed in the literature aiming at preserving different features and, analogously, projections from the cone of Hermitian positive matrices to different color representation spaces are used for enhancing certain characteristics. In this work we propose the use of stochastic distances between models that describe this type of data in a Nagao-Matsuyama-type of smoothing technique. The resulting images are shown to present good visualization properties (noise reduction with preservation of fine details) in all the considered visualization spaces.
NewsFinder: Automating an AI News Service
Eckroth, Joshua (The Ohio State University) | Dong, Liang (Clemson University) | Smith, Reid G. (Marathon Oil Corporation) | Buchanan, Bruce G. (University of Pittsburgh)
NewsFinder automates the steps involved in finding, selecting, categorizing, and publishing news stories that meet relevance criteria for the Artificial Intelligence community. The software combines a broad search of online news sources with topic-specific trained models and heuristics. Since August 2010, the program has been used to operate the AI in the News service that is part of the AAAI AITopics website.
The Best of AI in Japan โ Prologue
Nishida, Toyoaki (Kyoto University)
This article is the first report in the best of AI in Japan series. This series will focus on the prominent accomplishments made in the AI field, not only the research and development but also the AI-related events in society. As the first in the forthcoming series, this opening article features a historical background and the contemporary AI-research activities in Japan. It then highlights some recent prominent results from the industry. Finally, a future perspective is given.
Nonparametric Link Prediction in Dynamic Networks
Sarkar, Purnamrita, Chakrabarti, Deepayan, Jordan, Michael
We propose a non-parametric link prediction algorithm for a sequence of graph snapshots over time. The model predicts links based on the features of its endpoints, as well as those of the local neighborhood around the endpoints. This allows for different types of neighborhoods in a graph, each with its own dynamics (e.g, growing or shrinking communities). We prove the consistency of our estimator, and give a fast implementation based on locality-sensitive hashing. Experiments with simulated as well as five real-world dynamic graphs show that we outperform the state of the art, especially when sharp fluctuations or non-linearities are present.
Canonical Trends: Detecting Trend Setters in Web Data
Biessmann, Felix, Papaioannou, Jens-Michalis, Braun, Mikio, Harth, Andreas
Much information available on the web is copied, reused or rephrased. The phenomenon that multiple web sources pick up certain information is often called trend. A central problem in the context of web data mining is to detect those web sources that are first to publish information which will give rise to a trend. We present a simple and efficient method for finding trends dominating a pool of web sources and identifying those web sources that publish the information relevant to a trend before others. We validate our approach on real data collected from influential technology news feeds.
Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization
Desautels, Thomas, Krause, Andreas, Burdick, Joel
Can one parallelize complex exploration exploitation tradeoffs? As an example, consider the problem of optimal high-throughput experimental design, where we wish to sequentially design batches of experiments in order to simultaneously learn a surrogate function mapping stimulus to response and identify the maximum of the function. We formalize the task as a multi-armed bandit problem, where the unknown payoff function is sampled from a Gaussian process (GP), and instead of a single arm, in each round we pull a batch of several arms in parallel. We develop GP-BUCB, a principled algorithm for choosing batches, based on the GP-UCB algorithm for sequential GP optimization. We prove a surprising result; as compared to the sequential approach, the cumulative regret of the parallel algorithm only increases by a constant factor independent of the batch size B. Our results provide rigorous theoretical support for exploiting parallelism in Bayesian global optimization. We demonstrate the effectiveness of our approach on two real-world applications.
Hierarchical Exploration for Accelerating Contextual Bandits
Yue, Yisong, Hong, Sue Ann, Guestrin, Carlos
Contextual bandit learning is an increasingly popular approach to optimizing recommender systems via user feedback, but can be slow to converge in practice due to the need for exploring a large feature space. In this paper, we propose a coarse-to-fine hierarchical approach for encoding prior knowledge that drastically reduces the amount of exploration required. Intuitively, user preferences can be reasonably embedded in a coarse low-dimensional feature space that can be explored efficiently, requiring exploration in the high-dimensional space only as necessary. We introduce a bandit algorithm that explores within this coarse-to-fine spectrum, and prove performance guarantees that depend on how well the coarse space captures the user's preferences. We demonstrate substantial improvement over conventional bandit algorithms through extensive simulation as well as a live user study in the setting of personalized news recommendation.
The Big Data Bootstrap
Kleiner, Ariel, Talwalkar, Ameet, Sarkar, Purnamrita, Jordan, Michael
The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving large datasets, the computation of bootstrap-based quantities can be prohibitively demanding. As an alternative, we present the Bag of Little Bootstraps (BLB), a new procedure which incorporates features of both the bootstrap and subsampling to obtain a robust, computationally efficient means of assessing estimator quality. BLB is well suited to modern parallel and distributed computing architectures and retains the generic applicability, statistical efficiency, and favorable theoretical properties of the bootstrap. We provide the results of an extensive empirical and theoretical investigation of BLB's behavior, including a study of its statistical correctness, its large-scale implementation and performance, selection of hyperparameters, and performance on real data.
Online Bandit Learning against an Adaptive Adversary: from Regret to Policy Regret
Arora, Raman, Dekel, Ofer, Tewari, Ambuj
Online learning algorithms are designed to learn even when their input is generated by an adversary. The widely-accepted formal definition of an online algorithm's ability to learn is the game-theoretic notion of regret. We argue that the standard definition of regret becomes inadequate if the adversary is allowed to adapt to the online algorithm's actions. We define the alternative notion of policy regret, which attempts to provide a more meaningful way to measure an online algorithm's performance against adaptive adversaries. Focusing on the online bandit setting, we show that no bandit algorithm can guarantee a sublinear policy regret against an adaptive adversary with unbounded memory. On the other hand, if the adversary's memory is bounded, we present a general technique that converts any bandit algorithm with a sublinear regret bound into an algorithm with a sublinear policy regret bound. We extend this result to other variants of regret, such as switching regret, internal regret, and swap regret.