Randomized SVD
Reviews: Scalable Adaptive Stochastic Optimization Using Random Projections
Despite its usefulness, the proposed method is somewhat trivial: low-rank approximation, and the use of randomized SVD to compute it, are well-known techniques. Besides, I have two minor comments: 1. Why does the randomized algorithm improve performance (lower objective value and higher recognition rate)? See Figure 1(a)-(c).
Efficient error and variance estimation for randomized matrix computations
Epperly, Ethan N., Tropp, Joel A.
Randomized matrix algorithms have become workhorse tools in scientific computing and machine learning. To use these algorithms safely in applications, they should be coupled with posterior error estimates to assess the quality of the output. To meet this need, this paper proposes two diagnostics: a leave-one-out error estimator for randomized low-rank approximations and a jackknife resampling method to estimate the variance of the output of a randomized matrix computation. Both of these diagnostics are rapid to compute for randomized low-rank approximation algorithms such as the randomized SVD and Nyström, and they provide useful information that can be used to assess the quality of the computed output and guide algorithmic parameter choices.
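The paper's leave-one-out and jackknife estimators are specific to its setting, but the general idea of a cheap posterior error estimate can be illustrated with the classical Gaussian probe estimate of the residual norm of a randomized low-rank approximation. A minimal sketch (function names and parameter defaults are illustrative, not from the paper):

```python
import numpy as np

def randomized_range_finder(A, k, oversample=10, rng=None):
    """Orthonormal basis Q with k + oversample columns approximating range(A)."""
    rng = np.random.default_rng(rng)
    G = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ G)
    return Q

def randomized_svd(A, k, Q):
    """Rank-k SVD of A restricted to the subspace spanned by Q."""
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]

def posterior_error_estimate(A, Q, num_probes=10, rng=None):
    """Unbiased probe estimate of ||A - Q Q^T A||_F: for a standard Gaussian
    vector w, E||(A - Q Q^T A) w||^2 equals the squared Frobenius norm."""
    rng = np.random.default_rng(rng)
    Omega = rng.standard_normal((A.shape[1], num_probes))
    R = A @ Omega - Q @ (Q.T @ (A @ Omega))
    return np.sqrt((R ** 2).sum() / num_probes)
```

The estimate costs only a few extra matrix-vector products, which is what makes diagnostics of this kind cheap enough to run alongside the approximation itself.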
Data-driven discovery of Green's functions
Discovering hidden partial differential equations (PDEs) and operators from data is an important topic at the frontier between machine learning and numerical analysis. This doctoral thesis introduces theoretical results and deep learning algorithms to learn Green's functions associated with linear partial differential equations and rigorously justify PDE learning techniques. A theoretically rigorous algorithm is derived to obtain a learning rate, which characterizes the amount of training data needed to approximately learn Green's functions associated with elliptic PDEs. The construction connects the fields of PDE learning and numerical linear algebra by extending the randomized singular value decomposition to non-standard Gaussian vectors and Hilbert--Schmidt operators, and exploiting the low-rank hierarchical structure of Green's functions using hierarchical matrices. Rational neural networks (NNs) are introduced and consist of neural networks with trainable rational activation functions. The highly compositional structure of these networks, combined with rational approximation theory, implies that rational functions have higher approximation power than standard activation functions. In addition, rational NNs may have poles and take arbitrarily large values, which is ideal for approximating functions with singularities such as Green's functions. Finally, theoretical results on Green's functions and rational NNs are combined to design a human-understandable deep learning method for discovering Green's functions from data. This approach complements state-of-the-art PDE learning techniques, as a wide range of physics can be captured from the learned Green's functions such as dominant modes, symmetries, and singularity locations.
Learning Green's functions associated with time-dependent partial differential equations
Boullé, Nicolas, Kim, Seick, Shi, Tianyi, Townsend, Alex
Neural operators are a popular technique in scientific machine learning to learn a mathematical model of the behavior of unknown physical systems from data. Neural operators are especially useful to learn solution operators associated with partial differential equations (PDEs) from pairs of forcing functions and solutions when numerical solvers are not available or the underlying physics is poorly understood. In this work, we attempt to provide theoretical foundations to understand the amount of training data needed to learn time-dependent PDEs. Given input-output pairs from a parabolic PDE in any spatial dimension $n\geq 1$, we derive the first theoretically rigorous scheme for learning the associated solution operator, which takes the form of a convolution with a Green's function $G$. Until now, rigorously learning Green's functions associated with time-dependent PDEs has been a major challenge in the field of scientific machine learning because $G$ may not be square-integrable when $n>1$, and time-dependent PDEs have transient dynamics. By combining the hierarchical low-rank structure of $G$ together with randomized numerical linear algebra, we construct an approximant to $G$ that achieves a relative error of $\smash{\mathcal{O}(\Gamma_\epsilon^{-1/2}\epsilon)}$ in the $L^1$-norm with high probability by using at most $\smash{\mathcal{O}(\epsilon^{-\frac{n+2}{2}}\log(1/\epsilon))}$ input-output training pairs, where $\Gamma_\epsilon$ is a measure of the quality of the training dataset for learning $G$, and $\epsilon>0$ is sufficiently small.
Perturbation Analysis of Randomized SVD and its Applications to High-dimensional Statistics
Randomized singular value decomposition (RSVD) is a class of computationally efficient algorithms for computing the truncated SVD of large data matrices. Given an $n \times n$ symmetric matrix $\mathbf{M}$, the prototypical RSVD algorithm outputs an approximation of the $k$ leading singular vectors of $\mathbf{M}$ by computing the SVD of $\mathbf{M}^{g} \mathbf{G}$; here $g \geq 1$ is an integer and $\mathbf{G} \in \mathbb{R}^{n \times k}$ is a random Gaussian sketching matrix. In this paper we study the statistical properties of RSVD under a general "signal-plus-noise" framework, i.e., the observed matrix $\hat{\mathbf{M}}$ is assumed to be an additive perturbation of some true but unknown signal matrix $\mathbf{M}$. We first derive upper bounds for the $\ell_2$ (spectral norm) and $\ell_{2\to\infty}$ (maximum row-wise $\ell_2$ norm) distances between the approximate singular vectors of $\hat{\mathbf{M}}$ and the true singular vectors of the signal matrix $\mathbf{M}$. These upper bounds depend on the signal-to-noise ratio (SNR) and the number of power iterations $g$. A phase transition phenomenon is observed in which a smaller SNR requires larger values of $g$ to guarantee convergence of the $\ell_2$ and $\ell_{2\to\infty}$ distances. We also show that the thresholds for $g$ where these phase transitions occur are sharp whenever the noise matrices satisfy a certain trace growth condition. Finally, we derive normal approximations for the row-wise fluctuations of the approximate singular vectors and the entrywise fluctuations of the approximate matrix. We illustrate our theoretical results by deriving nearly-optimal performance guarantees for RSVD when applied to three statistical inference problems, namely, community detection, matrix completion, and principal component analysis with missing data.
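The prototypical RSVD with $g$ power iterations described in the abstract can be sketched as follows. The QR factorization between passes is a standard numerical stabilization of the explicit $\mathbf{M}^g \mathbf{G}$ product, and the final Rayleigh-Ritz rotation is one common way to extract singular vectors from the sketched subspace; neither detail is taken from the paper:

```python
import numpy as np

def rsvd_power(M, k, g=2, rng=None):
    """Prototypical RSVD for symmetric M: approximate the k leading
    singular vectors from the range of M^g G, where G is an n-by-k
    Gaussian sketch and g is the number of power iterations."""
    rng = np.random.default_rng(rng)
    Y = rng.standard_normal((M.shape[0], k))
    for _ in range(g):
        Y, _ = np.linalg.qr(M @ Y)  # orthonormalize each pass for stability
    # Rayleigh-Ritz: rotate the basis onto approximate singular vectors
    w, W = np.linalg.eigh(Y.T @ M @ Y)
    order = np.argsort(np.abs(w))[::-1]
    return Y @ W[:, order], np.abs(w[order])
```

The phase transition studied in the paper corresponds to how large $g$ must be, at a given SNR, before the subspace returned by such a routine aligns with the true leading singular subspace.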
Light on Math ML: Intuitive guide to matrix factorization
That one would make more sense. In a video feed, a static background contains most of the data (data, not information) in terms of volume. The remaining sparse information belongs to the foreground, because the foreground typically occupies a small part of the frame. Therefore, if I try to force M to become a sparse matrix S, I'd probably capture the foreground movements in S. Another way to think about it: S captures the outliers in your data! Adding up L and S should give us the original video, which is exactly what the robust PCA equation says. Now, easier said than done! How do we actually ensure these properties for these matrices?
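One standard answer is principal component pursuit: alternate between singular value thresholding, which pushes L toward low rank, and entrywise soft-thresholding, which pushes S toward sparsity, with a dual variable enforcing M = L + S. A minimal sketch with the common textbook parameter defaults (not tuned values, and not taken from the post):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm.
    Shrinking singular values toward zero pushes X toward low rank (the L part)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def shrink(X, tau):
    """Entrywise soft-thresholding: proximal operator of the l1 norm.
    Zeroing small entries pushes X toward sparsity (the S part)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(M, lam=None, mu=None, iters=500):
    """Principal component pursuit via an inexact augmented Lagrangian:
    split M into a low-rank L and a sparse S."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else m * n / (4.0 * np.abs(M).sum())
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # dual variable enforcing M = L + S
    for _ in range(iters):
        L = svt(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        Y = Y + mu * (M - L - S)
    return L, S
```

In the video example, the columns of L reconstruct the static background and the nonzero entries of S pick out the moving foreground pixels.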
A generalization of the randomized singular value decomposition
Boullé, Nicolas, Townsend, Alex
The randomized singular value decomposition (SVD) is a popular and effective algorithm for computing a near-best rank $k$ approximation of a matrix $A$ using matrix-vector products with standard Gaussian vectors. Here, we generalize the theory of randomized SVD to multivariable Gaussian vectors, allowing one to incorporate prior knowledge of $A$ into the algorithm. This enables us to explore the continuous analogue of the randomized SVD for Hilbert--Schmidt (HS) operators using operator-function products with functions drawn from a Gaussian process (GP). We then construct a new covariance kernel for GPs, based on weighted Jacobi polynomials, which allows us to rapidly sample the GP and control the smoothness of the randomly generated functions. Numerical examples on matrices and HS operators demonstrate the applicability of the algorithm.
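The core idea — replacing the standard Gaussian sketch with draws from $N(0, K)$ — can be sketched as below. Here K is a generic squared-exponential covariance used purely for illustration; the paper constructs a specific covariance kernel from weighted Jacobi polynomials, which is not reproduced here, and the jitter term is only a numerical safeguard for the Cholesky factorization:

```python
import numpy as np

def correlated_gaussian_sketch(K, k, rng=None):
    """Sketch whose columns are drawn from N(0, K) rather than N(0, I);
    a smooth covariance K encodes prior knowledge about the row space of A."""
    rng = np.random.default_rng(rng)
    Lc = np.linalg.cholesky(K + 1e-6 * np.eye(K.shape[0]))  # jitter for stability
    return Lc @ rng.standard_normal((K.shape[0], k))

def generalized_rsvd(A, k, K, oversample=5, rng=None):
    """Randomized SVD driven by a correlated Gaussian sketch."""
    Omega = correlated_gaussian_sketch(K, k + oversample, rng)
    Q, _ = np.linalg.qr(A @ Omega)
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U)[:, :k], s[:k], Vt[:k]
```

When the right singular vectors of A are smooth, a smooth prior covariance concentrates the sketch on the informative directions, which is the mechanism the paper exploits for Hilbert--Schmidt operators.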
Block Randomized Optimization for Adaptive Hypergraph Learning
Karantaidis, Georgios, Sarridis, Ioannis, Kotropoulos, Constantine
The high-order relations between content in social media sharing platforms are frequently modeled by a hypergraph. Both the hypergraph Laplacian matrix and the adjacency matrix are large, so randomized algorithms are used to compute low-rank factorizations that approximately decompose, and eventually invert, such big matrices quickly. Here, as a first approach, block randomized Singular Value Decomposition (SVD) via subspace iteration is integrated within adaptive hypergraph weight estimation for image tagging. Specifically, creating low-rank submatrices along the main diagonal by tessellation permits fast matrix inversion via randomized SVD. Moreover, a second approach is proposed for solving the linear system in the optimization problem of hypergraph learning by employing the conjugate gradient method. Both proposed approaches achieve high accuracy in image tagging, measured by F1 score, and succeed in reducing the computational requirements of adaptive hypergraph weight estimation.
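The second approach above solves the optimization's linear system with the conjugate gradient method rather than forming an inverse. A textbook CG sketch for a symmetric positive-definite system (a generic routine, not the authors' implementation):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A by conjugate
    gradients, avoiding any explicit inverse or factorization of A."""
    max_iter = max_iter if max_iter is not None else 10 * b.shape[0]
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)      # exact line search along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p  # A-conjugate update of the direction
        rs = rs_new
    return x
```

Each iteration needs only one matrix-vector product with A, which is why CG pairs well with the large, sparse Laplacian matrices that arise in hypergraph learning.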
Faster Matrix Completion Using Randomized SVD
Feng, Xu, Yu, Wenjian, Li, Yaohang
Matrix completion is a widely used technique in image inpainting, personalized recommender systems, and related applications. In this work, we focus on accelerating matrix completion using faster randomized singular value decomposition (rSVD). First, two fast randomized algorithms (rSVD-PI and rSVD-BKI) are proposed for handling sparse matrices. They make use of an eigSVD procedure and several acceleration techniques. Then, with the rSVD-BKI algorithm and a new subspace recycling technique, we accelerate the singular value thresholding (SVT) method in [1] to realize faster matrix completion. Experiments show that the proposed rSVD algorithms can be 6X faster than the basic rSVD algorithm [2] while maintaining the same accuracy. The problem of matrix completion, or estimating missing values in a matrix, occurs in many areas of engineering and applied science such as computer vision, pattern recognition, and machine learning [1], [3], [4]. For example, in computer vision and image processing, recovering missing or corrupted data can be regarded as matrix completion. A recommender system provides recommendations based on the user's preferences, which are often inferred from ratings submitted by users; this is another scenario where matrix completion can be applied. The matrix we wish to complete often has low rank or approximately low rank.
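The eigSVD procedure mentioned above is commonly formulated as an economy SVD obtained from the eigendecomposition of $A^T A$; a sketch of that standard formulation follows (the paper's exact variant may differ):

```python
import numpy as np

def eig_svd(A):
    """Economy SVD of a tall matrix A from the eigendecomposition of A^T A.
    Cheaper than a direct SVD when A has far more rows than columns, at the
    cost of accuracy for singular values near sqrt(machine epsilon)."""
    w, V = np.linalg.eigh(A.T @ A)      # eigenvalues in ascending order
    w, V = w[::-1], V[:, ::-1]          # reorder to descending
    s = np.sqrt(np.maximum(w, 0.0))
    U = (A @ V) / np.where(s > 0, s, 1.0)  # left vectors; guard zero division
    return U, s, V.T
```

In a randomized algorithm the factor being decomposed is a tall sketch like $A\Omega$ with only $k$ columns, so the $k \times k$ eigendecomposition is far cheaper than a dense SVD of the original matrix.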