Hyperparameter Tuning is All You Need for LISTA

Neural Information Processing Systems

Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network. It has had great success on sparse recovery. In this paper, we show that adding momentum to intermediate variables in the LISTA network achieves a better convergence rate and, in particular, the network with instance-optimal parameters is superlinearly convergent. Moreover, our new theoretical results lead to a practical approach of automatically and adaptively calculating the parameters of a LISTA network layer based on its previous layers. Perhaps most surprisingly, such an adaptive-parameter procedure reduces the training of LISTA to tuning only three hyperparameters from data: a new record in the recent line of advances on trimming down LISTA complexity. We call this new ultra-lightweight network HyperLISTA. Compared to state-of-the-art LISTA models, HyperLISTA achieves almost the same performance on seen data distributions and performs better when tested on unseen distributions (specifically, those with different sparsity levels and nonzero magnitudes).
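The momentum idea the abstract describes can be illustrated on plain ISTA. Below is a minimal numpy sketch in which a fixed, hand-chosen momentum weight `beta` stands in for the per-layer parameters a trained LISTA network would learn; it is not the HyperLISTA architecture itself.

```python
import numpy as np

def soft(x, theta):
    """Soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def momentum_ista(A, b, lam, n_iters=300, beta=0.5):
    """ISTA with a heavy-ball-style momentum term on the intermediate
    iterates; `beta` stands in for parameters a LISTA network would learn."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    x_prev = x.copy()
    for _ in range(n_iters):
        v = x + beta * (x - x_prev)         # momentum on the intermediate variable
        x_prev = x
        x = soft(v - (1.0 / L) * A.T @ (A @ v - b), lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[3, 17, 60]] = [1.0, -2.0, 1.5]
b = A @ x_true                               # noiseless sparse-recovery instance
x_hat = momentum_ista(A, b, lam=0.01)
```

Unrolling this loop for a fixed number of iterations and making `beta` and the threshold trainable per layer gives the network family the paper analyzes.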


Review for NeurIPS paper: Learning to solve TV regularised problems with unrolled algorithms

Neural Information Processing Systems

Summary and Contributions: The paper considers the problem of accelerating TV-regularized problems. The paper first shows that, assuming a random Gaussian design, the analysis formulation can converge faster than the synthesis formulation, based on the usual convergence rate for PGD. The paper then proposes two methods to accelerate the analysis formulation using unrolling, as done in LISTA. With the regular lasso, the proximal operator is soft thresholding and backpropagation is easy. With TV, backpropagation is more difficult, and the paper proposes two alternatives.
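The analysis-versus-synthesis point can be made concrete: a 1D TV-regularized problem turns into a plain lasso through the change of variables x = Lu, where L is the cumulative-sum operator, after which standard PGD/ISTA applies. A minimal numpy sketch with a naive step size (consistent with the review's observation, the synthesis route converges slowly because L is ill-conditioned):

```python
import numpy as np

def soft(x, theta):
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

# min_x 0.5*||Ax - b||^2 + lam*||Dx||_1 (analysis/TV form) rewritten in
# synthesis form via x = Lmat @ u, where Lmat is the cumulative-sum matrix,
# so that Dx = u[1:] and the problem becomes a plain lasso in u.
n = 50
Lmat = np.tril(np.ones((n, n)))              # integration (cumulative-sum) operator
rng = np.random.default_rng(1)
A = rng.standard_normal((80, n)) / np.sqrt(80)
x_true = np.concatenate([np.zeros(20), np.ones(15), 2.0 * np.ones(15)])
b = A @ x_true                               # piecewise-constant ground truth

AL = A @ Lmat
step = 1.0 / np.linalg.norm(AL, 2) ** 2      # conservative PGD step size
lam = 0.01
u = np.zeros(n)
for _ in range(500):
    g = u - step * AL.T @ (AL @ u - b)       # gradient step on the data term
    u = np.concatenate([[g[0]], soft(g[1:], lam * step)])  # u[0] is unpenalised
x_hat = Lmat @ u
```

Unrolling the analysis formulation instead requires differentiating through the TV prox, which has no closed form; that is where the paper's two alternatives come in.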


Reviews: An inner-loop free solution to inverse problems using deep neural networks

Neural Information Processing Systems

An interesting paper that solves linear inverse problems using a combination of two networks: one that learns a proximal operator for the signal class of interest, and the other that serves as a proxy for a large-scale matrix inversion. The proximal operator is reusable whenever the signal domain is unchanged. One would need to retrain only the matrix inversion network when the underlying problem is changed. This is a significant advantage towards reusability of the training procedure.

Strengths:
- Novelty
- A very interesting problem

Weaknesses:
- An important reference is missing
- Other less important references are missing
- Bare-bones evaluation

The paper provides an approach to solve linear inverse problems by reducing training requirements.


Robust Stochastically-Descending Unrolled Networks

Hadou, Samar, NaderiAlizadeh, Navid, Ribeiro, Alejandro

arXiv.org Artificial Intelligence

Deep unrolling, or unfolding, is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network. However, the convergence guarantees and generalizability of the unrolled networks are still open theoretical problems. To tackle these problems, we provide deep unrolled architectures with a stochastic descent nature by imposing descending constraints during training. The descending constraints are enforced layer by layer to ensure that each unrolled layer takes, on average, a descent step toward the optimum during training. We theoretically prove that the sequence constructed by the outputs of the unrolled layers is then guaranteed to converge for unseen problems, assuming no distribution shift between training and test problems. We also show that standard unrolling is brittle to perturbations, and that our imposed constraints provide the unrolled networks with robustness to additive noise and perturbations. We numerically assess unrolled architectures trained under the proposed constraints in two different applications, including sparse coding using the learnable iterative shrinkage and thresholding algorithm (LISTA) and image inpainting using proximal generative flow (GLOW-Prox), and demonstrate the performance and robustness benefits of the proposed method.
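The descending constraints can be pictured outside of training: collect the per-layer outputs, evaluate the objective after each layer, and penalize any layer that fails to descend. A simplified, deterministic numpy sketch (the paper enforces the constraints stochastically, i.e. on average over training problems):

```python
import numpy as np

def lasso_obj(x, A, b, lam):
    """Lasso objective used to measure per-layer progress."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def descent_violations(layer_outputs, A, b, lam):
    """Hinge penalties that are positive whenever a layer fails to decrease
    the objective; their mean could be added to a training loss to enforce
    layer-wise descending constraints."""
    objs = [lasso_obj(x, A, b, lam) for x in layer_outputs]
    return [max(0.0, objs[k + 1] - objs[k]) for k in range(len(objs) - 1)]

# Plain ISTA is monotone, so its unrolled "layers" incur zero violation;
# a freely trained unrolled network need not be, which is the paper's point.
rng = np.random.default_rng(2)
A = rng.standard_normal((30, 60)) / np.sqrt(30)
b = A @ np.where(rng.random(60) < 0.1, 1.0, 0.0)
L = np.linalg.norm(A, 2) ** 2
lam = 0.05
x = np.zeros(60)
outs = [x]
for _ in range(10):
    g = x - (1.0 / L) * A.T @ (A @ x - b)
    x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    outs.append(x)
viol = descent_violations(outs, A, b, lam)
```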


Learned ISTA with Error-based Thresholding for Adaptive Sparse Coding

Li, Ziang, Wu, Kailun, Guo, Yiwen, Zhang, Changshui

arXiv.org Artificial Intelligence

Also, it leads to poor generalization to test data with a different distribution (or sparsity) from the training data. To address the above issues, we propose an error-based thresholding (EBT) mechanism for LISTA-based models to improve their adaptivity, which utilizes a function of the layer-wise reconstruction error to suggest a specific threshold for each observation in the shrinkage function of each layer. EBT thus introduces a function of the evolving estimation error to provide each threshold in the shrinkage functions of the model. We show that the proposed EBT mechanism well disentangles the learnable parameters in the shrinkage functions from the reconstruction errors, endowing the obtained models with improved adaptivity to possible data variations. It has no extra learnable parameters compared with original LISTA-based models, yet shows significantly better performance. With rigorous analyses, we further show that the proposed EBT also leads to a faster convergence.
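The mechanism can be sketched on plain ISTA: make each layer's threshold a function of the current reconstruction error instead of a fixed learned constant. The proportional form `alpha * err / L` below is an assumption for illustration only; the paper's exact function may differ.

```python
import numpy as np

def ebt_ista(A, b, alpha=0.1, n_iters=200):
    """ISTA where each layer's threshold is driven by the layer-wise
    reconstruction error (an illustrative stand-in for EBT)."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        err = np.linalg.norm(A @ x - b)      # layer-wise reconstruction error
        theta = alpha * err / L              # error-based threshold (assumed form)
        g = x - (1.0 / L) * A.T @ (A @ x - b)
        x = np.sign(g) * np.maximum(np.abs(g) - theta, 0.0)
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[5, 30, 77]] = [1.5, -1.0, 2.0]
b = A @ x_true
x_hat = ebt_ista(A, b)
```

Because the threshold shrinks with the error, the bias of a fixed-threshold shrinkage fades as the estimate improves, and the same `alpha` can serve inputs of different sparsity levels; this is the adaptivity the abstract refers to.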


Uncertainty quantification for learned ISTA

Hoppe, Frederik, Verdun, Claudio Mayrink, Krahmer, Felix, Laus, Hannah, Rauhut, Holger

arXiv.org Machine Learning

Model-based deep learning solutions to inverse problems have attracted increasing attention in recent years as they bridge state-of-the-art numerical performance with interpretability. In addition, the incorporated prior domain knowledge can make the training more efficient, as the smaller number of parameters allows the training step to be executed with smaller datasets. Algorithm unrolling schemes stand out among these model-based learning techniques. Despite their rapid advancement and their close connection to traditional high-dimensional statistical methods, they lack certainty estimates, and a theory for uncertainty quantification is still elusive. This work provides a step towards closing this gap by proposing a rigorous way to obtain confidence intervals for the LISTA estimator.
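The flavor of such confidence intervals can be conveyed with the classical one-step debiasing construction for i.i.d. Gaussian designs; this is a hedged sketch of the general recipe, not the paper's LISTA-specific theory, and the crude thresholded estimate below merely stands in for a LISTA output.

```python
import numpy as np

def debiased_ci(x_hat, A, b, sigma, z=1.96):
    """Debias a sparse estimate with one gradient step, then form
    per-coordinate Gaussian confidence intervals. Assumes A has roughly
    orthonormal columns (e.g. i.i.d. N(0, 1/m) entries), so the
    correction matrix can be taken as the identity."""
    x_deb = x_hat + A.T @ (b - A @ x_hat)          # debiasing correction
    half = z * sigma * np.linalg.norm(A, axis=0)   # approximate 95% half-width
    return x_deb - half, x_deb + half

rng = np.random.default_rng(4)
m, n, sigma = 60, 120, 0.05
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[2, 50, 90]] = [1.0, -1.0, 2.0]
b = A @ x_true + sigma * rng.standard_normal(m)
x_hat = np.where(np.abs(A.T @ b) > 0.3, A.T @ b, 0.0)   # crude sparse estimate
lower, upper = debiased_ci(x_hat, A, b, sigma)
```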


Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth

Chen, Wei, Zhang, Bowen, Jin, Shi, Ai, Bo, Zhong, Zhangdui

arXiv.org Artificial Intelligence

Sparse signal recovery problems from noisy linear measurements appear in many areas of wireless communications. In recent years, deep learning (DL) based approaches have attracted the interest of researchers, who solve the sparse linear inverse problem by unfolding iterative algorithms as neural networks. Typically, research concerning DL assumes a fixed number of network layers. However, this ignores a key characteristic of traditional iterative algorithms: the number of iterations required for convergence changes with varying sparsity levels. By investigating projected gradient descent, we unveil the drawbacks of the existing DL methods with fixed depth. We then propose an end-to-end trainable DL architecture, which involves an extra halting score at each layer. Therefore, the proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase. We conduct experiments using both synthetic data and applications including random access in massive MTC and massive MIMO channel estimation, and the results demonstrate the improved efficiency of the proposed approach.
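The adaptive-depth idea can be sketched by running ISTA until a halting criterion fires, so the number of "layers" executed varies per problem instance. Here the halting score is simply the change between iterates; the paper learns an end-to-end halting score instead.

```python
import numpy as np

def ista_adaptive_depth(A, b, lam, tol=1e-6, max_layers=1000):
    """Run ISTA until a halting criterion fires, mimicking an unrolled
    network whose depth adapts to the input at inference time."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for depth in range(1, max_layers + 1):
        g = x - (1.0 / L) * A.T @ (A @ x - b)
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
        if np.linalg.norm(x_new - x) < tol:
            return x_new, depth                  # halt: depth adapts to the instance
        x = x_new
    return x, max_layers

rng = np.random.default_rng(5)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[10, 40, 80]] = [1.0, 2.0, -1.5]
b = A @ x_true
x_hat, depth = ista_adaptive_depth(A, b, lam=0.01)
```

A denser signal typically drives `depth` up, which is the varying-iteration behavior a fixed-depth unrolled network cannot reproduce.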


Learned Greedy Method (LGM): A Novel Neural Architecture for Sparse Coding and Beyond

Khatib, Rajaei, Simon, Dror, Elad, Michael

arXiv.org Machine Learning

The fields of signal and image processing have been deeply influenced by the introduction of deep neural networks. These are successfully deployed in a wide range of real-world applications, obtaining state-of-the-art results and surpassing well-known and well-established classical methods. Despite their impressive success, the architectures used in many of these neural networks come with no clear justification. As such, these are usually treated as "black box" machines that lack any kind of interpretability. A constructive remedy to this drawback is a systematic design of such networks by unfolding well-understood iterative algorithms. A popular representative of this approach is the Iterative Shrinkage-Thresholding Algorithm (ISTA) and its learned version -- LISTA, aiming for the sparse representations of the processed signals. In this paper we revisit this sparse coding task and propose an unfolded version of a greedy pursuit algorithm for the same goal. More specifically, we concentrate on the well-known Orthogonal-Matching-Pursuit (OMP) algorithm, and introduce its unfolded and learned version. Key features of our Learned Greedy Method (LGM) are the ability to accommodate a dynamic number of unfolded layers, and a stopping mechanism based on representation error, both adapted to the input. We develop several variants of the proposed LGM architecture and test some of them in various experiments, demonstrating their flexibility and efficiency.
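For contrast with unrolled ISTA, the classical OMP that LGM unfolds looks like this: a textbook numpy version exhibiting the two features the abstract highlights, a dynamic number of steps and an error-based stopping rule (not the learned LGM architecture itself).

```python
import numpy as np

def omp(A, b, err_tol=1e-6, max_atoms=None):
    """Orthogonal Matching Pursuit: greedily pick the atom most correlated
    with the residual, re-fit by least squares, and stop once the
    representation error drops below `err_tol`."""
    m, n = A.shape
    max_atoms = max_atoms or m
    r, support, x = b.copy(), [], np.zeros(n)
    while np.linalg.norm(r) > err_tol and len(support) < max_atoms:
        j = int(np.argmax(np.abs(A.T @ r)))       # most correlated atom
        if j in support:                          # numerical safeguard
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        x = np.zeros(n)
        x[support] = coef
        r = b - A @ x                             # update residual
    return x, support

rng = np.random.default_rng(6)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[7, 33, 64]] = [2.0, -1.0, 1.0]
b = A @ x_true
x_hat, support = omp(A, b)
```

LGM replaces the hand-set correlation and least-squares steps with learned counterparts while keeping the greedy, input-adapted structure.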


Learning step sizes for unfolded sparse coding

Ablin, Pierre, Moreau, Thomas, Massias, Mathurin, Gramfort, Alexandre

arXiv.org Machine Learning

Sparse coding is typically solved by iterative optimization techniques, such as the Iterative Shrinkage-Thresholding Algorithm (ISTA). Unfolding and learning weights of ISTA using neural networks is a practical way to accelerate estimation. In this paper, we study the selection of adapted step sizes for ISTA. We show that a simple step size strategy can improve the convergence rate of ISTA by leveraging the sparsity of the iterates. However, it is impractical in most large-scale applications. Therefore, we propose a network architecture where only the step sizes of ISTA are learned. We demonstrate that for a large class of unfolded algorithms, if the algorithm converges to the solution of the Lasso, its last layers correspond to ISTA with learned step sizes. Experiments show that our method is competitive with state-of-the-art networks when the solutions are sparse enough.