Goto

Collaborating Authors

 lll



Neural Lattice Reduction: A Self-Supervised Geometric Deep Learning Approach

arXiv.org Artificial Intelligence

Lattice reduction is a combinatorial optimization problem aimed at finding the most orthogonal basis in a given lattice. The Lenstra-Lenstra-Lov\'asz (LLL) algorithm is the best algorithm in the literature for solving this problem. In light of recent research on algorithm discovery, in this work, we would like to answer this question: is it possible to parametrize the algorithm space for lattice reduction problem with neural networks and find an algorithm without supervised data? Our strategy is to use equivariant and invariant parametrizations and train in a self-supervised way. We design a deep neural model outputting factorized unimodular matrices and train it in a self-supervised manner by penalizing non-orthogonal lattice bases. We incorporate the symmetries of lattice reduction into the model by making it invariant to isometries and scaling of the ambient space and equivariant with respect to the hyperocrahedral group permuting and flipping the lattice basis elements. We show that this approach yields an algorithm with comparable complexity and performance to the LLL algorithm on a set of benchmarks. Additionally, motivated by certain applications for wireless communication, we extend our method to a convolutional architecture which performs joint reduction of spatially-correlated lattices arranged in a grid, thereby amortizing its cost over multiple lattices.


Video Denoising in Fluorescence Guided Surgery

arXiv.org Artificial Intelligence

Fluorescence guided surgery (FGS) is a promising surgical technique that gives surgeons a unique view of tissue that is used to guide their practice by delineating tissue types and diseased areas. As new fluorescent contrast agents are developed that have low fluorescent photon yields, it becomes increasingly important to develop computational models to allow FGS systems to maintain good video quality in real time environments. To further complicate this task, FGS has a difficult bias noise term from laser leakage light (LLL) that represents unfiltered excitation light that can be on the order of the fluorescent signal. Most conventional video denoising methods focus on zero mean noise, and non-causal processing, both of which are violated in FGS. Luckily in FGS, often a co-located reference video is also captured which we use to simulate the LLL and assist in the denoising processes. In this work, we propose an accurate noise simulation pipeline that includes LLL and propose three baseline deep learning based algorithms for FGS video denoising.


Reviews: The Sparse Manifold Transform

Neural Information Processing Systems

This paper presents an unsupervised learning method called sparse manifold transform for learning low-dimensional structures from data. The proposed approach is based on combining sparse coding and manifold embedding. The properties of the learned representations are demonstrated on synthetic and real data. Overall, this paper is a long-winded presentation of a new manifold learning approach, which (at least in my view) is explained in heuristic and vague terms and seems very hard to perceive. Conventional manifold learning makes the assumption that the high dimensional data points lie close to a low dimensional manifold, and different manifold learning methods find low-dimensional embeddings that preserve different local/global properties of data.


Crowdsourced Clustering: Querying Edges vs Triangles Babak Hassibi Department of Electrical Engineering Department of Electrical Engineering Caltech, Pasadena

Neural Information Processing Systems

We consider the task of clustering items using answers from non-expert crowd workers. In such cases, the workers are often not able to label the items directly, however, it is reasonable to assume that they can compare items and judge whether they are similar or not. An important question is what queries to make, and we compare two types: random edge queries, where a pair of items is revealed, and random triangles, where a triple is. Since it is far too expensive to query all possible edges and/or triangles, we need to work with partial observations subject to a fixed query budget constraint. When a generative model for the data is available (and we consider a few of these) we determine the cost of a query by its entropy; when such models do not exist we use the average response time per query of the workers as a surrogate for the cost. In addition to theoretical justification, through several simulations and experiments on two real data sets on Amazon Mechanical Turk, we empirically demonstrate that, for a fixed budget, triangle queries uniformly outperform edge queries. Even though, in contrast to edge queries, triangle queries reveal dependent edges, they provide more reliable edges and, for a fixed budget, many more of them. We also provide a sufficient condition on the number of observations, edge densities inside and outside the clusters and the minimum cluster size required for the exact recovery of the true adjacency matrix via triangle queries using a convex optimization-based clustering algorithm.


Lifelong Generative Modelling Using Dynamic Expansion Graph Model

arXiv.org Artificial Intelligence

Variational Autoencoders (VAEs) suffer from degenerated performance, when learning several successive tasks. This is caused by catastrophic forgetting. In order to address the knowledge loss, VAEs are using either Generative Replay (GR) mechanisms or Expanding Network Architectures (ENA). In this paper we study the forgetting behaviour of VAEs using a joint GR and ENA methodology, by deriving an upper bound on the negative marginal log-likelihood. This theoretical analysis provides new insights into how VAEs forget the previously learnt knowledge during lifelong learning. The analysis indicates the best performance achieved when considering model mixtures, under the ENA framework, where there are no restrictions on the number of components. However, an ENA-based approach may require an excessive number of parameters. This motivates us to propose a novel Dynamic Expansion Graph Model (DEGM). DEGM expands its architecture, according to the novelty associated with each new databases, when compared to the information already learnt by the network from previous tasks. DEGM training optimizes knowledge structuring, characterizing the joint probabilistic representations corresponding to the past and more recently learned tasks. We demonstrate that DEGM guarantees optimal performance for each task while also minimizing the required number of parameters. Supplementary materials (SM) and source code are available in https://github.com/dtuzi123/Expansion-Graph-Model.


Rejoinder: New Objectives for Policy Learning

arXiv.org Machine Learning

I would like thank the discussants, Oliver Dukes and Stijn Vansteelandt (DV), Sijia Li, Xiudi Li, Alex Luedtkeand (LLL), and Muxuan Liang and Yingqi Zhao (LZ), for a very thoughtful discussion both of my contribution (Kallus 2020) and of Mo et al. (2020). I similarly thank the editors for putting together this exciting special issue and for curating a timely discussion on new objectives for policy learning. I found the juxtaposition between the two papers particularly apt: while my paper tries to induce an optimal covariate shift based on the premise of invariance, Mo et al. (2020) try to be robust to an undesirable covariate shift for fear of variations. While one optimistically alters the training population, the other pessimistically considers the worst-possible testing population. In the following I review some discussant comments that stood out to me as particularly keenly perceptive and offer some reflections.


Path to Stochastic Stability: Comparative Analysis of Stochastic Learning Dynamics in Games

arXiv.org Machine Learning

Stochastic stability is a popular solution concept for stochastic learning dynamics in games. However, a critical limitation of this solution concept is its inability to distinguish between different learning rules that lead to the same steady-state behavior. We address this limitation for the first time and develop a framework for the comparative analysis of stochastic learning dynamics with different update rules but same steady-state behavior. We present the framework in the context of two learning dynamics: Log-Linear Learning (LLL) and Metropolis Learning (ML). Although both of these dynamics have the same stochastically stable states, LLL and ML correspond to different behavioral models for decision making. Moreover, we demonstrate through an example setup of sensor coverage game that for each of these dynamics, the paths to stochastically stable states exhibit distinctive behaviors. Therefore, we propose multiple criteria to analyze and quantify the differences in the short and medium run behavior of stochastic learning dynamics. We derive and compare upper bounds on the expected hitting time to the set of Nash equilibria for both LLL and ML. For the medium to long-run behavior, we identify a set of tools from the theory of perturbed Markov chains that result in a hierarchical decomposition of the state space into collections of states called cycles. We compare LLL and ML based on the proposed criteria and develop invaluable insights into the comparative behavior of the two dynamics.


Crowdsourced Clustering: Querying Edges vs Triangles

Neural Information Processing Systems

We consider the task of clustering items using answers from non-expert crowd workers. In such cases, the workers are often not able to label the items directly, however, it is reasonable to assume that they can compare items and judge whether they are similar or not. An important question is what queries to make, and we compare two types: random edge queries, where a pair of items is revealed, and random triangles, where a triple is. Since it is far too expensive to query all possible edges and/or triangles, we need to work with partial observations subject to a fixed query budget constraint. When a generative model for the data is available (and we consider a few of these) we determine the cost of a query by its entropy; when such models do not exist we use the average response time per query of the workers as a surrogate for the cost. In addition to theoretical justification, through several simulations and experiments on two real data sets on Amazon Mechanical Turk, we empirically demonstrate that, for a fixed budget, triangle queries uniformly outperform edge queries. Even though, in contrast to edge queries, triangle queries reveal dependent edges, they provide more reliable edges and, for a fixed budget, many more of them. We also provide a sufficient condition on the number of observations, edge densities inside and outside the clusters and the minimum cluster size required for the exact recovery of the true adjacency matrix via triangle queries using a convex optimization-based clustering algorithm.