bcirc
Quantile-Based Randomized Kaczmarz for Corrupted Tensor Linear Systems
Castillo, Alejandra, Haddock, Jamie, Hartsock, Iryna, Hoyos, Paulina, Kassab, Lara, Kryshchenko, Alona, Larripa, Kamila, Needell, Deanna, Suryanarayanan, Shambhavi, Djima, Karamatou Yacoubou
The reconstruction of tensor-valued signals from corrupted measurements, known as tensor regression, has become essential in many multi-modal applications such as hyperspectral image reconstruction and medical imaging. In this work, we address the tensor linear system problem $\mathcal{A} \mathcal{X}=\mathcal{B}$, where $\mathcal{A}$ is a measurement operator, $\mathcal{X}$ is the unknown tensor-valued signal, and $\mathcal{B}$ contains the measurements, possibly corrupted by arbitrary errors. Such corruption is common in large-scale tensor data, where transmission, sensory, or storage errors are rare per instance but likely over the entire dataset and may be arbitrarily large in magnitude. We extend the Kaczmarz method, a popular iterative algorithm for solving large linear systems, to develop a Quantile Tensor Randomized Kaczmarz (QTRK) method robust to large, sparse corruptions in the observations $\mathcal{B}$. This approach combines the tensor Kaczmarz framework with quantile-based statistics, allowing it to mitigate adversarial corruptions and improve convergence reliability. We also propose and discuss the Masked Quantile Randomized Kaczmarz (mQTRK) variant, which selectively applies partial updates to handle corruptions further. We present convergence guarantees, discuss the advantages and disadvantages of our approaches, and demonstrate the effectiveness of our methods through experiments, including an application for video deblurring.
Non-Convex Tensor Recovery from Local Measurements
Wu, Tongle, Sun, Ying, Fan, Jicong
Motivated by the settings where sensing the entire tensor is infeasible, this paper proposes a novel tensor compressed sensing model, where measurements are only obtained from sensing each lateral slice via mutually independent matrices. Leveraging the low tubal rank structure, we reparameterize the unknown tensor ${\boldsymbol {\mathcal X}}^\star$ using two compact tensor factors and formulate the recovery problem as a nonconvex minimization problem. To solve the problem, we first propose an alternating minimization algorithm, termed \textsf{Alt-PGD-Min}, that iteratively optimizes the two factors using a projected gradient descent and an exact minimization step, respectively. Despite nonconvexity, we prove that \textsf{Alt-PGD-Min} achieves $\epsilon$-accuracy recovery with $\mathcal O\left( \kappa^2 \log \frac{1}{\epsilon}\right)$ iteration complexity and $\mathcal O\left( \kappa^6rn_3\log n_3 \left( \kappa^2r\left(n_1 + n_2 \right) + n_1 \log \frac{1}{\epsilon}\right) \right)$ sample complexity, where $\kappa$ denotes tensor condition number of $\boldsymbol{\mathcal X}^\star$. To further accelerate the convergence, especially when the tensor is ill-conditioned with large $\kappa$, we prove \textsf{Alt-ScalePGD-Min} that preconditions the gradient update using an approximate Hessian that can be computed efficiently. We show that \textsf{Alt-ScalePGD-Min} achieves $\kappa$ independent iteration complexity $\mathcal O(\log \frac{1}{\epsilon})$ and improves the sample complexity to $\mathcal O\left( \kappa^4 rn_3 \log n_3 \left( \kappa^4r(n_1+n_2) + n_1 \log \frac{1}{\epsilon}\right) \right)$. Experiments validate the effectiveness of the proposed methods.
Coseparable Nonnegative Tensor Factorization With T-CUR Decomposition
Chen, Juefei, Huang, Longxiu, Wei, Yimin
Nonnegative Matrix Factorization (NMF) is an important unsupervised learning method to extract meaningful features from data. To address the NMF problem within a polynomial time framework, researchers have introduced a separability assumption, which has recently evolved into the concept of coseparability. This advancement offers a more efficient core representation for the original data. However, in the real world, the data is more natural to be represented as a multi-dimensional array, such as images or videos. The NMF's application to high-dimensional data involves vectorization, which risks losing essential multi-dimensional correlations. To retain these inherent correlations in the data, we turn to tensors (multidimensional arrays) and leverage the tensor t-product. This approach extends the coseparable NMF to the tensor setting, creating what we term coseparable Nonnegative Tensor Factorization (NTF). In this work, we provide an alternating index selection method to select the coseparable core. Furthermore, we validate the t-CUR sampling theory and integrate it with the tensor Discrete Empirical Interpolation Method (t-DEIM) to introduce an alternative, randomized index selection process. These methods have been tested on both synthetic and facial analysis datasets. The results demonstrate the efficiency of coseparable NTF when compared to coseparable NMF.
Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the Tensor Nuclear Norm Under Strict Complementarity
We consider convex relaxations for recovering low-rank tensors based on constrained minimization over a ball induced by the tensor nuclear norm, recently introduced in \cite{tensor_tSVD}. We build on a recent line of results that considered convex relaxations for the recovery of low-rank matrices and established that under a strict complementarity condition (SC), both the convergence rate and per-iteration runtime of standard gradient methods may improve dramatically. We develop the appropriate strict complementarity condition for the tensor nuclear norm ball and obtain the following main results under this condition: 1. When the objective to minimize is of the form $f(\mX)=g(\mA\mX)+\langle{\mC,\mX}\rangle$ , where $g$ is strongly convex and $\mA$ is a linear map (e.g., least squares), a quadratic growth bound holds, which implies linear convergence rates for standard projected gradient methods, despite the fact that $f$ need not be strongly convex. 2. For a smooth objective function, when initialized in certain proximity of an optimal solution which satisfies SC, standard projected gradient methods only require SVD computations (for projecting onto the tensor nuclear norm ball) of rank that matches the tubal rank of the optimal solution. In particular, when the tubal rank is constant, this implies nearly linear (in the size of the tensor) runtime per iteration, as opposed to super linear without further assumptions. 3. For a nonsmooth objective function which admits a popular smooth saddle-point formulation, we derive similar results to the latter for the well known extragradient method. An additional contribution which may be of independent interest, is the rigorous extension of many basic results regarding tensors of arbitrary order, which were previously obtained only for third-order tensors.
Effective Streaming Low-tubal-rank Tensor Approximation via Frequent Directions
Yi, Qianxin, Wang, Chenhao, Wang, Kaidong, Wang, Yao
Low-tubal-rank tensor approximation has been proposed to analyze large-scale and multi-dimensional data. However, finding such an accurate approximation is challenging in the streaming setting, due to the limited computational resources. To alleviate this issue, this paper extends a popular matrix sketching technique, namely Frequent Directions, for constructing an efficient and accurate low-tubal-rank tensor approximation from streaming data based on the tensor Singular Value Decomposition (t-SVD). Specifically, the new algorithm allows the tensor data to be observed slice by slice, but only needs to maintain and incrementally update a much smaller sketch which could capture the principal information of the original tensor. The rigorous theoretical analysis shows that the approximation error of the new algorithm can be arbitrarily small when the sketch size grows linearly. Extensive experimental results on both synthetic and real multi-dimensional data further reveal the superiority of the proposed algorithm compared with other sketching algorithms for getting low-tubal-rank approximation, in terms of both efficiency and accuracy.
Convolutional Proximal Neural Networks and Plug-and-Play Algorithms
Hertrich, Johannes, Neumayer, Sebastian, Steidl, Gabriele
In this paper, we introduce convolutional proximal neural networks (cPNNs), which are by construction averaged operators. For filters of full length, we propose a stochastic gradient descent algorithm on a submanifold of the Stiefel manifold to train cPNNs. In case of filters with limited length, we design algorithms for minimizing functionals that approximate the orthogonality constraints imposed on the operators by penalizing the least squares distance to the identity operator. Then, we investigate how scaled cPNNs with a prescribed Lipschitz constant can be used for denoising signals and images, where the achieved quality depends on the Lipschitz constant. Finally, we apply cPNN based denoisers within a Plug-and-Play (PnP) framework and provide convergence results for the corresponding PnP forward-backward splitting algorithm based on an oracle construction.
Tensor-Tensor Product Toolbox
Tensors are higher-order extensions of matrices. In recent work [2], the authors introduced the notion of the t-product, a generalization of matrix multiplication for tensors of order three. The multiplication is based on a convolution-like operation, which can be implemented efficiently using the Fast Fourier Transform (FFT). Based on t-product, there has a similar linear algebraic structure of tensors to matrices. For example, there has the tensor SVD (t-SVD) which is computable. By using some properties of FFT, we have a more efficient way for computing t-product and t-SVD in [4]. We develop a Matlab toolbox to implement several basic operations on tensors based on t-product.
Tensor Robust Principal Component Analysis with A New Tensor Nuclear Norm
Lu, Canyi, Feng, Jiashi, Chen, Yudong, Liu, Wei, Lin, Zhouchen, Yan, Shuicheng
In this paper, we consider the Tensor Robust Principal Component Analysis (TRPCA) problem, which aims to exactly recover the low-rank and sparse components from their sum. Our model is based on the recently proposed tensor-tensor product (or t-product) [13]. Induced by the t-product, we first rigorously deduce the tensor spectral norm, tensor nuclear norm, and tensor average rank, and show that the tensor nuclear norm is the convex envelope of the tensor average rank within the unit ball of the tensor spectral norm. These definitions, their relationships and properties are consistent with matrix cases. Equipped with the new tensor nuclear norm, we then solve the TRPCA problem by solving a convex program and provide the theoretical guarantee for the exact recovery. Our TRPCA model and recovery guarantee include matrix RPCA as a special case. Numerical experiments verify our results, and the applications to image recovery and background modeling problems demonstrate the effectiveness of our method.
Clustering multi-way data: a novel algebraic approach
Kernfeld, Eric, Aeron, Shuchin, Kilmer, Misha
In this paper, we develop a method for unsupervised clustering of two-way (matrix) data by combining two recent innovations from different fields: the Sparse Subspace Clustering (SSC) algorithm [10], which groups points coming from a union of subspaces into their respective subspaces, and the t-product [18], which was introduced to provide a matrix-like multiplication for third order tensors. Our algorithm is analogous to SSC in that an "affinity" between different data points is built using a sparse self-representation of the data. Unlike SSC, we employ the t-product in the self-representation. This allows us more flexibility in modeling; infact, SSC is a special case of our method. When using the t-product, three-way arrays are treated as matrices whose elements (scalars) are n-tuples or tubes. Convolutions take the place of scalar multiplication. This framework allows us to embed the 2-D data into a vector-space-like structure called a free module over a commutative ring. These free modules retain many properties of complex inner-product spaces, and we leverage that to provide theoretical guarantees on our algorithm. We show that compared to vector-space counterparts, SSmC achieves higher accuracy and better able to cluster data with less preprocessing in some image clustering problems. In particular we show the performance of the proposed method on Weizmann face database, the Extended Yale B Face database and the MNIST handwritten digits database.