Goto

Collaborating Authors

 submatrix




Andrei Chertkov Roman Schutski Anh-Huy Phan

Neural Information Processing Systems

For the given tolerance threshold ϵ ( ϵ 1 and close to one), we find the set I as follows: 1. The rect_maxvol algorithm can be used in this case. The following rect_maxvol algorithm will find additional R rows: 1. In Algorithm A.1 we present the details of the TTOpt implementation. The procedure of updating the TT proxy is outlined in Algorithm A.2 (function update_left, that updates core tensors of the Algorithm A.2: Function update_left to update points of interest when traverse the tensor modes The arguments for target function evaluation in Algorithm A.4 are selected as merged left and right In Section 3.1 we compared the TTOpt solver The list of functions is presented in Table 1.



Evaluating the statistical significance of biclusters

Jason D. Lee, Yuekai Sun, Jonathan E. Taylor

Neural Information Processing Systems

Biclustering (also known as submatrix localization) is a problem of high practical relevance in exploratory analysis of high-dimensional data. We develop a framework for performing statistical inference on biclusters found by score-based algorithms. Since the bicluster was selected in a data dependent manner by a biclustering or localization algorithm, this is a form of selective inference . Our framework gives exact (non-asymptotic) confidence intervals and p-values for the significance of the selected biclusters.




Spectral Estimation with Free Decompression

Ameli, Siavash, van der Heide, Chris, Hodgkinson, Liam, Mahoney, Michael W.

arXiv.org Machine Learning

Computing eigenvalues of very large matrices is a critical task in many machine learning applications, including the evaluation of log-determinants, the trace of matrix functions, and other important metrics. As datasets continue to grow in scale, the corresponding covariance and kernel matrices become increasingly large, often reaching magnitudes that make their direct formation impractical or impossible. Existing techniques typically rely on matrix-vector products, which can provide efficient approximations, if the matrix spectrum behaves well. However, in settings like distributed learning, or when the matrix is defined only indirectly, access to the full data set can be restricted to only very small sub-matrices of the original matrix. In these cases, the matrix of nominal interest is not even available as an implicit operator, meaning that even matrix-vector products may not be available. In such settings, the matrix is "impalpable," in the sense that we have access to only masked snapshots of it. We draw on principles from free probability theory to introduce a novel method of "free decompression" to estimate the spectrum of such matrices. Our method can be used to extrapolate from the empirical spectral densities of small submatrices to infer the eigenspectrum of extremely large (impalpable) matrices (that we cannot form or even evaluate with full matrix-vector products). We demonstrate the effectiveness of this approach through a series of examples, comparing its performance against known limiting distributions from random matrix theory in synthetic settings, as well as applying it to submatrices of real-world datasets, matching them with their full empirical eigenspectra.


Cross-Channel Unlabeled Sensing over a Union of Signal Subspaces

Koka, Taulant, Tsakiris, Manolis C., Haro, Benjamín Béjar, Muma, Michael

arXiv.org Artificial Intelligence

Cross-channel unlabeled sensing addresses the problem of recovering a multi-channel signal from measurements that were shuffled across channels. This work expands the cross-channel unlabeled sensing framework to signals that lie in a union of subspaces. The extension allows for handling more complex signal structures and broadens the framework to tasks like compressed sensing. These mismatches between samples and channels often arise in applications such as whole-brain calcium imaging of freely moving organisms or multi-target tracking. We improve over previous models by deriving tighter bounds on the required number of samples for unique reconstruction, while supporting more general signal types. The approach is validated through an application in whole-brain calcium imaging, where organism movements disrupt sample-to-neuron mappings. This demonstrates the utility of our framework in real-world settings with imprecise sample-channel associations, achieving accurate signal reconstruction.


Constructing Stochastic Matrices for Weighted Averaging in Gossip Networks

Bayram, Erkan, Belabbas, Mohamed-Ali

arXiv.org Artificial Intelligence

The convergence of the gossip process has been extensively studied; however, algorithms that generate a set of stochastic matrices, the infinite product of which converges to a rank-one matrix determined by a given weight vector, have been less explored. In this work, we propose an algorithm for constructing (local) stochastic matrices based on a given gossip network topology and a set of weights for averaging across different consensus clusters, ensuring that the gossip process converges to a finite limit set.