eigenproblem
Entrywise convergence of iterative methods for eigenproblems
Several problems in machine learning, statistics, and other fields rely on computing eigenvectors. For large-scale problems, the computation of these eigenvectors is typically performed via iterative schemes such as subspace iteration or Krylov methods. While there is classical and comprehensive analysis for subspace convergence guarantees with respect to the spectral norm, in many modern applications other notions of subspace distance are more appropriate. Recent theoretical work has focused on perturbations of subspaces measured in the ℓ2→∞ norm, but does not consider the actual computation of eigenvectors. Here we address the convergence of subspace iteration when distances are measured in the ℓ2→∞ norm and provide deterministic bounds. We complement our analysis with a practical stopping criterion and demonstrate its applicability via numerical experiments. Our results show that one can obtain comparable performance on downstream tasks while requiring fewer iterations, thereby saving substantial computational time.
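A minimal NumPy sketch of plain subspace iteration that monitors progress row-wise (a 2→∞-style gap between successive iterates) rather than in the spectral norm. The test matrix, tolerance, and the exact form of the stopping rule are illustrative assumptions, not the criterion proposed in the paper.

```python
# Sketch: subspace iteration with an entrywise (2->inf style) stopping check.
# Test matrix, tolerance, and stopping rule are illustrative assumptions.
import numpy as np

def subspace_iteration(A, k, tol=1e-8, max_iter=500, rng=None):
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))  # random orthonormal start
    for it in range(max_iter):
        W, _ = np.linalg.qr(A @ V)                    # power step + re-orthonormalization
        # Gap between successive iterates measured row-wise:
        # max over rows of the l2 norm of (I - W W^T) V.
        R = V - W @ (W.T @ V)
        gap = np.max(np.linalg.norm(R, axis=1))
        V = W
        if gap < tol:
            break
    return V, it + 1

# Toy usage on a symmetric matrix with a clear gap after the 5th eigenvalue.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((200, 200)))
evals = np.r_[np.linspace(10.0, 6.0, 5), np.linspace(1.0, 0.1, 195)]
A = Q @ np.diag(evals) @ Q.T
V, iters = subspace_iteration(A, k=5)
print(iters, V.shape)
```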
Deep Learning for Subspace Regression
Fanaskov, Vladimir, Trifonov, Vladislav, Rudikov, Alexander, Muravleva, Ekaterina, Oseledets, Ivan
It is often possible to perform reduced-order modelling by specifying a linear subspace which accurately captures the dynamics of the system. This approach becomes especially appealing when the linear subspace explicitly depends on the parameters of the problem. A practical way to apply such a scheme is to compute subspaces for a selected set of parameters in the computationally demanding offline stage and, in the online stage, approximate the subspace for unknown parameters by interpolation. For realistic problems the parameter space is high-dimensional, which renders classical interpolation strategies infeasible or unreliable. We propose to relax the interpolation problem to regression, introduce several loss functions suitable for subspace data, and use a neural network as an approximation to the high-dimensional target function. To further simplify the learning problem we introduce redundancy: in place of predicting a subspace of a given dimension we predict a larger subspace. We show theoretically that this strategy decreases the complexity of the mapping for elliptic eigenproblems with constant coefficients and makes the mapping smoother for general smooth functions on the Grassmann manifold. Empirical results also show that accuracy improves significantly when larger-than-needed subspaces are predicted. With a set of numerical illustrations we demonstrate that subspace regression can be useful for a range of tasks including parametric eigenproblems, deflation techniques, relaxation methods, optimal control, and the solution of parametric partial differential equations.
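A small PyTorch sketch of the subspace-regression idea: a network maps problem parameters to an orthonormal basis, and a basis-invariant projection-distance loss compares it with the target subspace. The architecture and the specific loss are assumptions for illustration; the paper studies several subspace losses and the larger-than-needed-subspace trick.

```python
# Sketch: regression onto the Grassmannian with a basis-invariant loss.
# Network size, the projection-distance loss, and dimensions are assumptions.
import torch

def projection_loss(U_pred, U_true):
    # Frobenius distance between orthogonal projectors: invariant to the basis
    # chosen for each subspace, so it is a valid Grassmann discrepancy.
    P = U_pred @ U_pred.transpose(-1, -2)
    Q = U_true @ U_true.transpose(-1, -2)
    return ((P - Q) ** 2).sum(dim=(-2, -1)).mean()

class SubspaceNet(torch.nn.Module):
    def __init__(self, p_dim, n, k):
        super().__init__()
        self.n, self.k = n, k
        self.net = torch.nn.Sequential(
            torch.nn.Linear(p_dim, 256), torch.nn.Tanh(),
            torch.nn.Linear(256, n * k),
        )

    def forward(self, params):
        raw = self.net(params).reshape(-1, self.n, self.k)
        # Orthonormalize the raw output so predictions lie on the Stiefel manifold.
        Q, _ = torch.linalg.qr(raw)
        return Q
```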
4e8eaf897c638d519710b1691121f8cb-Supplemental.pdf
Supplementary material for 'Locality defeats the curse of dimensionality in convolutional teacher-student scenarios'. The appendix provides additional details on the derivation of Eq. (8), including the case where all parameters are initialised independently from a standard Normal distribution (Eq. (13)). It then proves the eigendecompositions introduced in Lemma 3.3 and Lemma 3.4, in each case first establishing the orthonormality of the eigenfunctions and then showing that the eigenfunctions and eigenvalues defined in Eq. (17) and Eq. (19) satisfy the kernel eigenvalue equation, and finally extends these results to kernels with overlapping patches.
On Quantum Channel Learning
Belov, Mikhail Gennadievich, Dubov, Victor Victorovich, Filimonov, Alexey Vladimirovich, Malyshkin, Vladislav Gennadievich
The problem of an optimal mapping between Hilbert spaces $IN$ and $OUT$, based on a series of density matrix mapping measurements $\rho^{(l)} \to \varrho^{(l)}$, $l=1\dots M$, is formulated as an optimization problem maximizing the total fidelity $\mathcal{F}=\sum_{l=1}^{M} \omega^{(l)} F\left(\varrho^{(l)},\sum_s B_s \rho^{(l)} B^{\dagger}_s\right)$ subject to probability-preservation constraints on the Kraus operators $B_s$. For $F(\varrho,\sigma)$ of a form such that the total fidelity can be represented as a quadratic form with a superoperator, $\mathcal{F}=\sum_s\left\langle B_s\middle|S\middle| B_s \right\rangle$ (either exactly or as an approximation), an iterative algorithm is developed to find the global maximum. The result consists of $N_s$ operators $B_s$ that collectively form an $IN$ to $OUT$ quantum channel $A^{OUT}=\sum_s B_s A^{IN} B_s^{\dagger}$. The work introduces two important generalizations of unitary learning: 1. $IN$/$OUT$ states are represented as density matrices. 2. The mapping itself is formulated as a general quantum channel. This marks a crucial advancement from the commonly studied unitary mapping of pure states $\phi_l=\mathcal{U} \psi_l$ to a general quantum channel, which allows us to distinguish a probabilistic mixture of states from their superposition. An application of the approach is demonstrated on unitary learning of the density matrix mapping $\varrho^{(l)}=\mathcal{U} \rho^{(l)} \mathcal{U}^{\dagger}$, in which case a fidelity quadratic in $\mathcal{U}$ can be constructed by considering the mapping $\sqrt{\rho^{(l)}} \to \sqrt{\varrho^{(l)}}$, and on a general quantum channel of Kraus rank $N_s$, where the fidelity quadratic in $B_s$ is an approximation -- the quantum channel is then built as a hierarchy of unitary mappings. The approach can be applied to study decoherence effects, spontaneous coherence, synchronization, etc.
- Asia > Russia (0.14)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
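For concreteness, a NumPy/SciPy sketch of the objective described above: apply a Kraus-operator channel, evaluate a weighted total fidelity over measured pairs ρ^(l) → ϱ^(l), and check the probability-preservation constraint Σ_s B_s† B_s = I. Uhlmann fidelity is used here as an assumed choice of F; the paper's iterative algorithm for the quadratic-form superoperator is not reproduced.

```python
# Sketch: evaluate the total-fidelity objective over a set of Kraus operators
# and check probability preservation. Uhlmann fidelity is an assumed choice of F.
import numpy as np
from scipy.linalg import sqrtm

def apply_channel(rho, kraus):
    # A^OUT = sum_s B_s A^IN B_s^dagger
    return sum(B @ rho @ B.conj().T for B in kraus)

def uhlmann_fidelity(rho, sigma):
    s = sqrtm(rho)
    return np.real(np.trace(sqrtm(s @ sigma @ s))) ** 2

def total_fidelity(rho_in, rho_out, weights, kraus):
    return sum(w * uhlmann_fidelity(r_out, apply_channel(r_in, kraus))
               for w, r_in, r_out in zip(weights, rho_in, rho_out))

def trace_preserving_defect(kraus, dim_in):
    # Deviation from the constraint sum_s B_s^dagger B_s = I on the IN space.
    S = sum(B.conj().T @ B for B in kraus)
    return np.linalg.norm(S - np.eye(dim_in))
```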
On Partially Unitary Learning
Belov, Mikhail Gennadievich, Malyshkin, Vladislav Gennadievich
The problem of an optimal mapping between Hilbert spaces $IN$ of $\left|\psi\right\rangle$ and $OUT$ of $\left|\phi\right\rangle$, based on a set of wavefunction measurements (within a phase) $\psi_l \to \phi_l$, $l=1\dots M$, is formulated as an optimization problem maximizing the total fidelity $\sum_{l=1}^{M} \omega^{(l)} \left|\langle\phi_l|\mathcal{U}|\psi_l\rangle\right|^2$ subject to probability-preservation constraints on $\mathcal{U}$ (partial unitarity). The constructed operator $\mathcal{U}$ can be considered an $IN$ to $OUT$ quantum channel; it is a partially unitary rectangular matrix of dimension $\dim(OUT) \times \dim(IN)$ transforming operators as $A^{OUT}=\mathcal{U} A^{IN} \mathcal{U}^{\dagger}$. An iterative algorithm finding the global maximum of this optimization problem is developed and its application to a number of problems is demonstrated. A software product implementing the algorithm is available from the authors.
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Asia > Russia (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
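A short sketch, under the same caveats as above, of the pure-state objective Σ_l ω^(l) |⟨φ_l|U|ψ_l⟩|² and one plausible reading of the partial-unitarity constraint (U U† = I on the OUT space); the authors' iterative algorithm itself is not reproduced.

```python
# Sketch: the partially-unitary-learning objective for pure states.
# The constraint check below is our reading of partial unitarity, an assumption.
import numpy as np

def total_pure_state_fidelity(U, psi, phi, weights):
    # psi: IN-space vectors, phi: OUT-space vectors, U: dim(OUT) x dim(IN).
    return sum(w * abs(np.vdot(p, U @ s)) ** 2
               for w, s, p in zip(weights, psi, phi))

def partial_unitarity_defect(U):
    d_out = U.shape[0]
    return np.linalg.norm(U @ U.conj().T - np.eye(d_out))
```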
Machine Learning Discovery of Optimal Quadrature Rules for Isogeometric Analysis
Teijeiro, Tomas, Taylor, Jamie M., Hashemian, Ali, Pardo, David
We propose the use of machine learning techniques to find optimal quadrature rules for the construction of stiffness and mass matrices in isogeometric analysis (IGA). We initially consider 1D spline spaces of arbitrary degree spanned over uniform and non-uniform knot sequences, and the generated optimal rules are then used for integration over higher-dimensional spaces via tensor products. The quadrature-rule search is posed as an optimization problem and solved by a machine learning strategy based on gradient descent. However, since the optimization space is highly non-convex, the success of the search strongly depends on the number of quadrature points and the parameter initialization. Thus, we use a dynamic programming strategy that initializes the parameters from the optimal solution over the spline space with a lower number of knots. With this method, we found optimal quadrature rules for spline spaces when using IGA discretizations with up to 50 uniform elements and polynomial degrees up to 8, showing the generality of the approach in this scenario. For non-uniform partitions, the method also finds an optimal rule in a reasonable number of test cases. We also assess the generated optimal rules in two practical case studies, namely, the eigenvalue problem of the Laplace operator and the eigenfrequency analysis of freeform curved beams, where the latter problem shows the applicability of the method to curved geometries. In particular, the proposed method results in savings with respect to traditional Gaussian integration of up to 44% in 1D, 68% in 2D, and 82% in 3D spaces.
- Europe > Latvia > Riga Municipality > Riga (0.05)
- Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- (3 more...)
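As a toy analogue of the quadrature-rule search, the sketch below fits nodes and weights by gradient descent so that monomials up to a given degree are integrated exactly on [0, 1]. Replacing B-spline products with monomials, and the paper's dynamic-programming initialization with Adam from a random start, are simplifying assumptions.

```python
# Toy analogue of the quadrature-rule search: fit nodes and weights by gradient
# descent so monomials up to `degree` are integrated exactly on [0, 1].
# Monomials and Adam from a random start are simplifying assumptions.
import torch

def find_rule(n_points, degree, steps=5000, lr=1e-2, seed=0):
    torch.manual_seed(seed)
    x = torch.rand(n_points, requires_grad=True)                  # initial nodes in (0, 1)
    w = torch.full((n_points,), 1.0 / n_points, requires_grad=True)
    exact = torch.tensor([1.0 / (k + 1) for k in range(degree + 1)])
    opt = torch.optim.Adam([x, w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        approx = torch.stack([(w * x ** k).sum() for k in range(degree + 1)])
        loss = ((approx - exact) ** 2).sum()                      # residual of exactness conditions
        loss.backward()
        opt.step()
    return x.detach(), w.detach(), loss.item()

# A 3-point Gauss rule integrates polynomials of degree <= 5 exactly,
# so a small residual indicates the search recovered an optimal rule.
nodes, weights, residual = find_rule(n_points=3, degree=5)
print(nodes, weights, residual)
```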
Let's consider more general nonlinear approaches to study teleconnections of climate variables
Bueso, D., Piles, M., Camps-Valls, G.
The recent work by (Rieger et al., 2021) is concerned with the problem of extracting features from spatio-temporal geophysical signals. The authors introduce the complex rotated MCA (xMCA) to deal with lagged effects and non-orthogonality of the feature representation. This method essentially (1) transforms the signals to the complex plane with the Hilbert transform; (2) applies an oblique (Varimax and Promax) rotation to remove the orthogonality constraint; and (3) performs the eigendecomposition in this complex space (Horel et al., 1984). We argue that this method is essentially a particular case of the method called rotated complex kernel principal component analysis (ROCK-PCA) introduced in (Bueso et al., 2019, 2020), where we proposed the same approach: first transform the data to the complex plane with the Hilbert transform and then apply the Varimax rotation, with the only difference that the eigendecomposition is performed in the dual (kernel) Hilbert space. The latter allows us to generalize the xMCA solution by extracting nonlinear (curvilinear) features when nonlinear kernel functions are used. Hence, the solution of xMCA boils down to ROCK-PCA when the inner product is computed in the input data space instead of in the high-dimensional (possibly infinite) kernel Hilbert space to which the data have been mapped. In this short correspondence we give a theoretical proof that xMCA is a special case of ROCK-PCA and provide quantitative evidence that more expressive and informative features can be extracted when working with kernels; results of the decomposition of global sea surface temperature (SST) fields are shown to illustrate the capabilities of ROCK-PCA to cope with nonlinear processes, unlike xMCA.
- North America > United States > New Jersey (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- Asia > Middle East > Jordan (0.04)
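A brief sketch of the first steps shared by xMCA and ROCK-PCA as described above: form the analytic signal with the Hilbert transform, then eigendecompose in the dual (sample) space. The linear kernel and the omission of the Varimax/Promax rotation are simplifications for illustration.

```python
# Sketch of the shared xMCA / ROCK-PCA pipeline: analytic signal via the
# Hilbert transform, then eigendecomposition in the dual (sample) space.
# Linear kernel only; the Varimax/Promax rotation step is omitted.
import numpy as np
from scipy.signal import hilbert

def complex_dual_pca(X, n_components=3):
    # X: (time, space) real field, e.g. gridded SST anomalies.
    Z = hilbert(X, axis=0)                      # analytic (complex) signal in time
    Z = Z - Z.mean(axis=0, keepdims=True)
    K = Z @ Z.conj().T                          # dual (kernel) matrix, Hermitian (time x time)
    vals, vecs = np.linalg.eigh(K)
    order = np.argsort(vals)[::-1][:n_components]
    temporal = vecs[:, order]                   # complex principal components in time
    spatial = Z.conj().T @ temporal             # corresponding complex spatial patterns
    return vals[order], temporal, spatial
```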