
Learning Retrospective Knowledge with Reverse Reinforcement Learning

Neural Information Processing Systems

We present a Reverse Reinforcement Learning (Reverse RL) approach for representing retrospective knowledge. General Value Functions (GVFs) have enjoyed great success in representing predictive knowledge, i.e., answering questions about possible future outcomes such as "how much fuel will be consumed in expectation if we drive from A to B?". GVFs, however, cannot answer questions like "how much fuel do we expect a car to have given that it is at B at time t?". To answer this question, we need to know when that car had a full tank and how it came to B. Since such questions emphasize the influence of possible past events on the present, we refer to their answers as retrospective knowledge. In this paper, we show how to represent retrospective knowledge with Reverse GVFs, which are trained via Reverse RL. We demonstrate empirically the utility of Reverse GVFs in both representation learning and anomaly detection.
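
As a toy illustration of the predictive/retrospective distinction, the sketch below contrasts a forward TD update (bootstrapping from the next state, as in standard GVFs) with a reverse TD update that bootstraps from the previous state. All names, state counts, and step sizes are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Minimal tabular sketch, assuming a toy chain of 5 states; the paper's
# Reverse GVFs are more general than this illustration.
n_states, alpha, gamma = 5, 0.1, 0.9
v_fwd = np.zeros(n_states)   # predictive: expected discounted future signal
v_rev = np.zeros(n_states)   # retrospective: expected discounted past signal

def td_updates(s_prev, s, s_next, c):
    # Forward TD bootstraps from the *next* state: how much signal lies ahead?
    v_fwd[s] += alpha * (c + gamma * v_fwd[s_next] - v_fwd[s])
    # Reverse TD bootstraps from the *previous* state: how much signal
    # accumulated on the way here? (e.g., fuel consumed before reaching B)
    v_rev[s] += alpha * (c + gamma * v_rev[s_prev] - v_rev[s])

# Example transition s_prev=0 -> s=1 -> s_next=2 with cumulant c=1.0
td_updates(0, 1, 2, 1.0)
```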


LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference

Neural Information Processing Systems

The growing scale of Graph Convolutional Network (GCN) models has revolutionized numerous applications, surpassing human performance in areas such as personal healthcare and financial systems. Deploying GCNs in the cloud raises privacy concerns due to potential adversarial attacks on client data. To address these concerns, Privacy-Preserving Machine Learning (PPML) using Homomorphic Encryption (HE) secures sensitive client data, but it introduces substantial computational overhead in practical applications. To tackle these challenges, we present LinGCN, a framework designed to reduce multiplication depth and optimize the performance of HE-based GCN inference.
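
As a hedged illustration of why multiplication depth matters under HE, the sketch below replaces ReLU with a low-degree polynomial surrogate, the standard trick for making activations HE-friendly. The coefficients and form are toy assumptions, not LinGCN's learned, structurally linearized parameterization.

```python
import numpy as np

# HE schemes evaluate only additions and multiplications, so non-polynomial
# activations like ReLU must be replaced. A quadratic a*x^2 + b*x costs one
# ciphertext multiplication per activation, keeping multiplication depth low.
def relu(x):
    return np.maximum(x, 0.0)

def poly_act(x, a=0.25, b=0.5):
    # Degree-2 polynomial surrogate; depth 1 in ciphertext multiplications.
    # Coefficients here are toy values, not LinGCN's.
    return a * x**2 + b * x

x = np.linspace(-2, 2, 5)
print(relu(x), poly_act(x))
```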


Recovering Unbalanced Communities in the Stochastic Block Model with Application to Clustering with a Faulty Oracle

Neural Information Processing Systems

The stochastic block model (SBM) is a fundamental model for studying graph clustering and community detection in networks. It has received great attention in the last decade, and the balanced case, i.e., assuming all clusters have large size, is well understood. However, our understanding of the SBM with unbalanced communities (arguably more relevant in practice) is still limited. In this paper, we provide a simple SVD-based algorithm for recovering the communities in the SBM with communities of varying sizes. We improve upon a result of Ailon, Chen and Xu [ICML 2013; JMLR 2015] by removing the assumption that there is a large interval in which no cluster size falls, and by removing the dependency of the size of the recoverable clusters on the number of underlying clusters. We further complement our theoretical improvements with experimental comparisons. Under the planted clique conjecture, the size of the clusters that can be recovered by our algorithm is nearly optimal (up to poly-logarithmic factors) when the probability parameters are constant. As a byproduct, we obtain an efficient clustering algorithm with sublinear query complexity in a faulty oracle model, which is capable of detecting all clusters of size larger than \tilde{\Omega}(\sqrt{n}), even in the presence of \Omega(n) small clusters in the graph.
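
For intuition, here is a minimal sketch of the generic SVD-then-cluster recipe on a synthetic unbalanced SBM. It is not the paper's algorithm (which adds steps specifically to handle small clusters), and all sizes and probabilities are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sample a two-community SBM with unbalanced sizes (toy parameters).
rng = np.random.default_rng(0)
sizes, p, q = [150, 40], 0.6, 0.05          # intra-prob p, inter-prob q
n = sum(sizes)
labels = np.repeat([0, 1], sizes)
P = np.where(labels[:, None] == labels[None, :], p, q)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                  # symmetric, no self-loops

# Rank-k SVD of the adjacency matrix, then cluster the embedded rows.
k = 2
U, S, _ = np.linalg.svd(A)
emb = U[:, :k] * S[:k]
pred = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(emb)
```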


Doubly Robust Thompson Sampling with Linear Payoffs

Neural Information Processing Systems

A challenging aspect of the bandit problem is that a stochastic reward is observed only for the chosen arm and the rewards of the other arms remain missing. The dependence of the arm choice on past context and reward pairs compounds the complexity of regret analysis. We propose a novel multi-armed contextual bandit algorithm called Doubly Robust Thompson Sampling (DRTS) that applies the doubly robust estimator from the missing-data literature to Thompson Sampling with contexts (\texttt{LinTS}). Different from previous works relying on missing-data techniques (Dimakopoulou et al. [2019], Kim and Paik [2019]), the proposed algorithm is designed to allow a novel additive regret decomposition leading to an improved regret bound of order \tilde{O}(\phi^{-2}\sqrt{T}), where \phi^2 is the minimum eigenvalue of the covariance matrix of contexts. This is the first regret bound of \texttt{LinTS} expressed via \phi^2 without d, where d is the dimension of the context. Applying the relationship between \phi^2 and d, the regret bound of the proposed algorithm is \tilde{O}(d\sqrt{T}) in many practical scenarios, improving the bound of \texttt{LinTS} by a factor of \sqrt{d}. A benefit of the proposed method is that it uses all the context data, chosen or not chosen, thus circumventing the technical definition of unsaturated arms used in the theoretical analysis of \texttt{LinTS}. Empirical studies show the advantage of the proposed algorithm over \texttt{LinTS}.
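
To make the missing-data device concrete, below is a minimal sketch of a doubly robust pseudo-reward, assuming a model-based reward estimate and a propensity for each arm. The notation is illustrative and simplifies the paper's estimator.

```python
import numpy as np

# Hedged sketch: for every arm, chosen or not, impute a reward. The estimate
# stays unbiased when the propensities pi are correct, and the model term
# m_hat reduces its variance (names here are illustrative, not the paper's).
def dr_pseudo_rewards(chosen, reward, m_hat, pi):
    """chosen: index of the pulled arm; reward: its observed reward;
    m_hat[i]: model-based reward estimate for arm i;
    pi[i]: probability that the sampler picks arm i (propensity)."""
    r = np.array(m_hat, dtype=float)                     # imputation term
    r[chosen] += (reward - m_hat[chosen]) / pi[chosen]   # IPW correction
    return r

print(dr_pseudo_rewards(chosen=1, reward=0.9,
                        m_hat=[0.4, 0.7, 0.2], pi=[0.2, 0.5, 0.3]))
```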


Online Learning in Contextual Bandits using Gated Linear Networks

Neural Information Processing Systems

We introduce a new and completely online contextual bandit algorithm called Gated Linear Contextual Bandits (GLCB). This algorithm is based on Gated Linear Networks (GLNs), a recently introduced deep learning architecture with properties well-suited to the online setting. Leveraging the data-dependent gating properties of the GLN, we are able to estimate prediction uncertainty with effectively zero algorithmic overhead. We empirically evaluate GLCB against 9 state-of-the-art algorithms that leverage deep neural networks on a standard benchmark suite of discrete and continuous contextual bandit problems. GLCB obtains first place in mean rank despite being the only online method, and we further support these results with a theoretical study of its convergence properties.
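
For intuition about the data-dependent gating the abstract refers to, here is a minimal sketch of a single GLN-style neuron with halfspace gating and geometric mixing. The shapes, initialization, and absence of weight updates are simplifying assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
m, d_side, n_in = 4, 8, 6                       # hyperplanes, side-info dim, inputs
hyperplanes = rng.standard_normal((m, d_side))  # fixed random gating planes
weights = np.full((2**m, n_in), 1.0 / n_in)     # one weight row per gate region

def logit(p):
    return np.log(p / (1 - p))

def gln_neuron(p_in, side_info):
    # The context function maps side info to one of 2^m weight vectors;
    # gating is data-dependent but has no learned gate parameters.
    bits = (hyperplanes @ side_info > 0).astype(int)
    ctx = int("".join(map(str, bits)), 2)
    # Geometric mixing of input probabilities under the selected weights.
    return 1.0 / (1.0 + np.exp(-weights[ctx] @ logit(p_in)))

p_out = gln_neuron(np.full(n_in, 0.6), rng.standard_normal(d_side))
```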


Fighting Bandits with a New Kind of Smoothness

Neural Information Processing Systems

We focus on the adversarial multi-armed bandit problem. The EXP3 algorithm of Auer et al. (2003) was shown to have a regret bound of O(\sqrt{T N \log N}), where T is the time horizon and N is the number of available actions (arms). More recently, Audibert and Bubeck (2009) improved the bound by a logarithmic factor via an entirely different method. In the present work, we develop a new set of analysis tools, based on the notion of convex smoothing, that yields several novel algorithms with optimal guarantees. First, we show that regularization via the Tsallis entropy matches the minimax rate of Audibert and Bubeck (2009) with an even tighter constant; it also fully generalizes EXP3.
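
For reference, one common parameterization of the Tsallis-entropy regularizer and the resulting follow-the-regularized-leader update is sketched below; the exact constants and normalization in the paper may differ.

```latex
% Tsallis entropy of a distribution p over N arms, for \alpha \in (0,1):
S_\alpha(p) = \frac{1}{1-\alpha}\Bigl(\sum_{i=1}^{N} p_i^{\alpha} - 1\Bigr)
% Follow-the-regularized-leader against cumulative loss estimates \hat{L}_{t-1}:
p_t = \operatorname*{arg\,max}_{p \in \Delta_N}
      \Bigl\{ -\langle p, \hat{L}_{t-1} \rangle + \tfrac{1}{\eta}\, S_\alpha(p) \Bigr\}
% As \alpha \to 1, S_\alpha recovers the Shannon entropy (and the update
% recovers EXP3); \alpha = 1/2 attains the minimax O(\sqrt{TN}) rate.
```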


Dimensionality Reduction for Wasserstein Barycenter

Neural Information Processing Systems

The Wasserstein barycenter is a geometric construct which captures the notion of centrality among probability distributions, and which has found many applications in machine learning. However, most algorithms for finding even an approximate barycenter suffer an exponential dependence on the dimension d of the underlying space of the distributions. In order to cope with this "curse of dimensionality," we study dimensionality reduction techniques for the Wasserstein barycenter problem. When the barycenter is restricted to support of size n, we show that randomized dimensionality reduction can be used to map the problem to a space of dimension O(\log n) independent of both d and the number of input distributions k, and that \emph{any} solution found in the reduced dimension will have its cost preserved up to arbitrarily small error in the original space. We provide matching upper and lower bounds on the size of the reduced dimension, showing that our methods are optimal up to constant factors.
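
Below is a minimal sketch of the randomized reduction, assuming a single shared Gaussian (Johnson-Lindenstrauss) projection and omitting the barycenter solver itself; all dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, d_low, n_support = 1000, 20, 50   # assume d_low = O(log n_support)

# Two toy input distributions, each supported on n_support points in R^d.
mu1 = rng.standard_normal((n_support, d))
mu2 = rng.standard_normal((n_support, d)) + 1.0

# One shared JL map for all distributions, so pairwise transport costs are
# (approximately) preserved across them.
G = rng.standard_normal((d, d_low)) / np.sqrt(d_low)
mu1_low, mu2_low = mu1 @ G, mu2 @ G
# Any (approximate) barycenter computed on mu1_low, mu2_low then has
# near-optimal cost with respect to the original high-dimensional problem.
```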


Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL

Neural Information Processing Systems

Despite a series of recent successes in reinforcement learning (RL), many RL algorithms remain sensitive to hyperparameters. As such, there has recently been interest in the field of AutoRL, which seeks to automate design decisions in order to create more general algorithms. Recent work suggests that population-based approaches may be effective AutoRL algorithms, as they learn hyperparameter schedules on the fly. In particular, the PB2 algorithm is able to achieve strong performance in RL tasks by formulating online hyperparameter optimization as a time-varying GP-bandit problem, while also providing theoretical guarantees. However, PB2 is only designed to work for \emph{continuous} hyperparameters, which severely limits its utility in practice.
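
To situate the problem, here is a hedged sketch of a population-based exploit/explore step over mixed continuous and categorical hyperparameters. PB2 would replace the random perturbation below with its time-varying GP-bandit suggestion; all ranges and names are toy assumptions.

```python
import random

# Toy search space: one continuous and one categorical hyperparameter.
CONT_RANGES = {"lr": (1e-5, 1e-2)}
CAT_CHOICES = {"optimizer": ["adam", "sgd", "rmsprop"]}

def exploit_explore(population):
    """population: list of dicts with 'score' and 'hparams' keys."""
    population.sort(key=lambda m: m["score"], reverse=True)
    quarter = max(1, len(population) // 4)
    top, bottom = population[:quarter], population[-quarter:]
    for loser in bottom:
        parent = random.choice(top)
        h = dict(parent["hparams"])            # exploit: copy a winner
        lo, hi = CONT_RANGES["lr"]             # explore: perturb continuous
        h["lr"] = min(hi, max(lo, h["lr"] * random.choice([0.8, 1.2])))
        # Explore categorical dimension by resampling (PB2's extension
        # would model this choice instead of sampling uniformly).
        h["optimizer"] = random.choice(CAT_CHOICES["optimizer"])
        loser["hparams"] = h
    return population
```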


Projection Regret: Reducing Background Bias for Novelty Detection via Diffusion Models

Neural Information Processing Systems

Novelty detection is a fundamental task of machine learning which aims to detect abnormal (i.e., out-of-distribution (OOD)) samples. Since diffusion models have recently emerged as the de facto standard generative framework with surprising generation results, novelty detection via diffusion models has also gained much attention. Recent methods have mainly utilized the reconstruction property of in-distribution samples. However, they often suffer from detecting OOD samples that share similar background information with the in-distribution data. Based on our observation that diffusion models can project any sample to an in-distribution sample with similar background information, we propose Projection Regret (PR), an efficient novelty detection method that mitigates the bias toward non-semantic information. To be specific, PR computes the perceptual distance between the test image and its diffusion-based projection to detect abnormality.
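
A minimal sketch of the scoring rule, assuming hypothetical `project` and `perceptual_distance` callables standing in for the diffusion-based projection (noise the input to step t, then denoise) and an LPIPS-style metric. The paper's full method additionally normalizes against recursive projections, which is omitted here.

```python
# Hedged sketch; `project` and `perceptual_distance` are hypothetical
# stand-ins, not a real library API.
def projection_score(x, project, perceptual_distance, t):
    # The projection maps x onto an in-distribution sample that keeps
    # similar background (non-semantic) information.
    x_proj = project(x, t)
    # A large perceptual distance means the semantics changed under
    # projection, so x is likely OOD.
    return perceptual_distance(x, x_proj)
```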


Live Graph Lab: Towards Open, Dynamic and Real Transaction Graphs with NFT

Neural Information Processing Systems

Numerous studies have been conducted to investigate the properties of large-scale temporal graphs. Despite the ubiquity of these graphs in real-world scenarios, it is usually impractical to obtain whole real-time graphs due to privacy concerns and technical limitations. In this paper, we introduce the concept of {\it Live Graph Lab} for temporal graphs, which provides open, dynamic and real transaction graphs built from blockchains. Among blockchain applications, non-fungible tokens (NFTs) have become one of the most prominent over the past several years. With more than \$40 billion in market capitalization, this decentralized ecosystem produces massive, anonymous and real transaction activities, which naturally form a complicated transaction network. However, there is limited understanding of the characteristics of this emerging NFT ecosystem from a temporal graph analysis perspective.
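
As a hedged sketch of how such transaction records become a temporal graph, the snippet below builds a timestamped multigraph with networkx. The field names and addresses are illustrative, not the benchmark's actual schema.

```python
import networkx as nx

# Toy NFT transfer records (illustrative fields: sender, receiver,
# token id, Unix timestamp).
records = [
    {"from": "0xA", "to": "0xB", "token": 1, "timestamp": 1_650_000_000},
    {"from": "0xB", "to": "0xC", "token": 1, "timestamp": 1_650_100_000},
]

# A MultiDiGraph keeps multiple timestamped edges between the same pair
# of addresses, which transaction data requires.
G = nx.MultiDiGraph()
for r in records:
    G.add_edge(r["from"], r["to"], token=r["token"], t=r["timestamp"])

# Temporal analyses then operate on the edge stream sorted by time.
edges_by_time = sorted(G.edges(data=True), key=lambda e: e[2]["t"])
```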