Goto

Collaborating Authors

 diagonalization


A posteriori error bounds for joint matrix decomposition problems

Neural Information Processing Systems

Joint matrix triangularization is often used for estimating the joint eigenstructure of a set M of matrices, with applications in signal processing and machine learning. We consider the problem of approximate joint matrix triangularization when the matrices in M are jointly diagonalizable and real, but we only observe a set M' of noise perturbed versions of the matrices in M. Our main result is a first-order upper bound on the distance between any approximate joint triangularizer of the matrices in M' and any exact joint triangularizer of the matrices in M. The bound depends only on the observable matrices in M' and the noise level. In particular, it does not depend on optimization specific properties of the triangularizer, such as its proximity to critical points, that are typical of existing bounds in the literature. To our knowledge, this is the first a posteriori bound for joint matrix decomposition. We demonstrate the bound on synthetic data for which the ground truth is known.


Shallow-circuit Supervised Learning on a Quantum Processor

arXiv.org Machine Learning

Quantum computing has long promised transformative advances in data analysis, yet practical quantum machine learning has remained elusive due to fundamental obstacles such as a steep quantum cost for the loading of classical data and poor trainability of many quantum machine learning algorithms designed for near-term quantum hardware. In this work, we show that one can overcome these obstacles by using a linear Hamiltonian-based machine learning method which provides a compact quantum representation of classical data via ground state problems for k-local Hamiltonians. We use the recent sample-based Krylov quantum diagonalization method to compute low-energy states of the data Hamiltonians, whose parameters are trained to express classical datasets through local gradients. We demonstrate the efficacy and scalability of the methods by performing experiments on benchmark datasets using up to 50 qubits of an IBM Heron quantum processor.


Approximating the Total Variation Distance between Gaussians

arXiv.org Artificial Intelligence

The total variation distance is a metric of central importance in statistics and probability theory. However, somewhat surprisingly, questions about computing it algorithmically appear not to have been systematically studied until very recently. In this paper, we contribute to this line of work by studying this question in the important special case of multivariate Gaussians. More formally, we consider the problem of approximating the total variation distance between two multivariate Gaussians to within an $\epsilon$-relative error. Previous works achieved a fixed constant relative error approximation via closed-form formulas. In this work, we give algorithms that given any two $n$-dimensional Gaussians $D_1,D_2$, and any error bound $\epsilon > 0$, approximate the total variation distance $D := d_{TV}(D_1,D_2)$ to $\epsilon$-relative accuracy in $\text{poly}(n,\frac{1}{\epsilon},\log \frac{1}{D})$ operations. The main technical tool in our work is a reduction that helps us extend the recent progress on computing the TV-distance between discrete random variables to our continuous setting.


Identification and estimation for matrix time series CP-factor models

arXiv.org Artificial Intelligence

The demand of modeling and forecasting high-dimensional matrix time series brings the opportunities with challenges. Let Y t " p y i,j,tq be a p ˆ q matrix recorded at time t, where y i,j,trepresents the value of, for example, the j -th variable on the i -th individual at time t . A popular approach to model Y t in the existing literature is via the so-called Tucker decomposition, namely the matrix Tucker-factor model. See, for example, Wang et al. (2019), Chen et al. (2020) and Chen and Chen (2022). It represents a high-dimensional matrix time series as a linear combination of a lower-dimensional matrix process. The Tucker decomposition can be viewed as a natural extension of the factor model for vector time series considered in Lam and Yao (2012) and Chang et al. (2015). Similarly we can only identify the factor loading spaces (the linear spaces spanned by the columns of the factor loading matrices) in the matrix Tucker-factor model while the factor loading matrices themselves are not uniquely defined. Parallel to the Tucker decomposition for matrix time series Y t, Chang et al. (2023) and Han et al. (2024) consider to model Y t via the so-called canonical polyadic (CP) decomposition, namely the matrix CP-factor model. It provides a more comprehensive dimensionality reduction as the dynamic structure of a matrix time series is driven by a vector process rather than a matrix process. Furthermore the factor loading matrices in the matrix CP-factor model can be identified uniquely upto the column reflection and permutation indeterminacy under some regularity conditions. The CP-factor model for matrix time series specified in Chang et al. (2023) admits the form Y t " AX tB J ` ε t, t ě 1, (1) where X t " diag p x tq with x t " p x t, 1, .. .


Towards Faster Matrix Diagonalization with Graph Isomorphism Networks and the AlphaZero Framework

arXiv.org Artificial Intelligence

In this paper, we introduce innovative approaches for accelerating the Jacobi method for matrix diagonalization, specifically through the formulation of large matrix diagonalization as a Semi-Markov Decision Process and small matrix diagonalization as a Markov Decision Process. Furthermore, we examine the potential of utilizing scalable architecture between different-sized matrices. During a short training period, our method discovered a significant reduction in the number of steps required for diagonalization and exhibited efficient inference capabilities. Importantly, this approach demonstrated possible scalability to large-sized matrices, indicating its potential for wide-ranging applicability. Upon training completion, we obtain action-state probabilities and transition graphs, which depict transitions between different states. These outputs not only provide insights into the diagonalization process but also pave the way for cost savings pertinent to large-scale matrices. The advancements made in this research enhance the efficacy and scalability of matrix diagonalization, pushing for new possibilities for deployment in practical applications in scientific and engineering domains.


Accelerating Matrix Diagonalization through Decision Transformers with Epsilon-Greedy Optimization

arXiv.org Artificial Intelligence

This paper introduces a novel framework for matrix diagonalization, recasting it as a sequential decision-making problem and applying the power of Decision Transformers (DTs). Our approach determines optimal pivot selection during diagonalization with the Jacobi algorithm, leading to significant speedups compared to the traditional max-element Jacobi method. To bolster robustness, we integrate an epsilon-greedy strategy, enabling success in scenarios where deterministic approaches fail. This work demonstrates the effectiveness of DTs in complex computational tasks and highlights the potential of reimagining mathematical operations through a machine learning lens. Furthermore, we establish the generalizability of our method by using transfer learning to diagonalize matrices of smaller sizes than those trained.


A Posteriori Error Bounds for Joint Matrix Decomposition Problems

Neural Information Processing Systems

Joint matrix triangularization is often used for estimating the joint eigenstructure of a set M of matrices, with applications in signal processing and machine learning. We consider the problem of approximate joint matrix triangularization when the matrices in M are jointly diagonalizable and real, but we only observe a set M' of noise perturbed versions of the matrices in M. Our main result is a first-order upper bound on the distance between any approximate joint triangularizer of the matrices in M' and any exact joint triangularizer of the matrices in M. The bound depends only on the observable matrices in M' and the noise level. In particular, it does not depend on optimization specific properties of the triangularizer, such as its proximity to critical points, that are typical of existing bounds in the literature. To our knowledge, this is the first a posteriori bound for joint matrix decomposition. We demonstrate the bound on synthetic data for which the ground truth is known.


Gaussian-process-regression-based method for the localization of exceptional points in complex resonance spectra

arXiv.org Artificial Intelligence

Resonances in open quantum systems depending on at least two controllable parameters can show the phenomenon of exceptional points (EPs), where not only the eigenvalues but also the eigenvectors of two or more resonances coalesce. Their exact localization in the parameter space is challenging, in particular in systems, where the computation of the quantum spectra and resonances is numerically very expensive. We introduce an efficient machine learning algorithm to find exceptional points based on Gaussian process regression (GPR). The GPR-model is trained with an initial set of eigenvalue pairs belonging to an EP and used for a first estimation of the EP position via a numerically cheap root search. The estimate is then improved iteratively by adding selected exact eigenvalue pairs as training points to the GPR-model. The GPR-based method is developed and tested on a simple low-dimensional matrix model and then applied to a challenging real physical system, viz., the localization of EPs in the resonance spectra of excitons in cuprous oxide in external electric and magnetic fields. The precise computation of EPs, by taking into account the complete valence band structure and central-cell corrections of the crystal, can be the basis for the experimental observation of EPs in this system.


On the use of the Gram matrix for multivariate functional principal components analysis

arXiv.org Machine Learning

Dimension reduction is crucial in functional data analysis (FDA). The key tool to reduce the dimension of the data is functional principal component analysis. Existing approaches for functional principal component analysis usually involve the diagonalization of the covariance operator. With the increasing size and complexity of functional datasets, estimating the covariance operator has become more challenging. Therefore, there is a growing need for efficient methodologies to estimate the eigencomponents. Using the duality of the space of observations and the space of functional features, we propose to use the inner-product between the curves to estimate the eigenelements of multivariate and multidimensional functional datasets. The relationship between the eigenelements of the covariance operator and those of the inner-product matrix is established. We explore the application of these methodologies in several FDA settings and provide general guidance on their usability.


Enhancing variational quantum state diagonalization using reinforcement learning techniques

arXiv.org Artificial Intelligence

The development of variational quantum algorithms is crucial for the application of NISQ computers. Such algorithms require short quantum circuits, which are more amenable to implementation on near-term hardware, and many such methods have been developed. One of particular interest is the so-called the variational diagonalization method, which constitutes an important algorithmic subroutine, and it can be used directly for working with data encoded in quantum states. In particular, it can be applied to discern the features of quantum states, such as entanglement properties of a system, or in quantum machine learning algorithms. In this work, we tackle the problem of designing a very shallow quantum circuit, required in the quantum state diagonalization task, by utilizing reinforcement learning. To achieve this, we utilize a novel encoding method that can be used to tackle the problem of circuit depth optimization using a reinforcement learning approach. We demonstrate that our approach provides a solid approximation to the diagonalization task while using a small number of gates. The circuits proposed by the reinforcement learning methods are shallower than the standard variational quantum state diagonalization algorithm, and thus can be used in situations where the depth of quantum circuits is limited by the hardware capabilities.