AITopics | submatrix

We note that these results concern two of the most commonly used architectural modifications for RNNs. First, the gating mechanism is ubiquitous in RNNs and is usually thought of as a heuristic for smoothing optimization [28]. Second, many effective large-scale RNNs use linear (gated) recurrences and deeper models, which is usually thought of as a heuristic for computational efficiency [5]. Our results suggest that neither is a heuristic after all; both arise from standard ways to approximate ODEs. To be more specific, we show that:

Table 6: A summary of the characteristics of popular RNN methods and their approximation mechanisms for capturing the dynamics $\dot{x}(t) = -x(t) + f(t, x(t))$ (equation (14)). The LSSL entries are for the very specific case with order $N = 1$ and $A = -1$, $B = 1$, $C = 1$, $D = 0$; LSSLs are more general.
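To make the link between gating and ODE approximation concrete, here is a minimal sketch. It assumes the 1-D linear ODE x'(t) = -x(t) + u(t) and a backward-Euler step of size dt; the ODE, the step size, and all function names are illustrative assumptions, not code from the paper. Solving the implicit step for the new state gives x_t = (1 - g) x_{t-1} + g u_t with g = dt / (1 + dt), which has the form of a gated recurrence with a gate in (0, 1).

```python
def backward_euler_step(x_prev, u, dt):
    """One backward-Euler step for the assumed ODE x'(t) = -x(t) + u(t)."""
    # Implicit update x_new = x_prev + dt * (-x_new + u), solved for x_new.
    return (x_prev + dt * u) / (1.0 + dt)


def gate_from_step_size(dt):
    """Gate value implied by a backward-Euler step of size dt > 0."""
    return dt / (1.0 + dt)


def gated_step(x_prev, u, gate):
    """Gated recurrence: convex combination of the old state and the input."""
    return (1.0 - gate) * x_prev + gate * u


def simulate(inputs, dt, x0=0.0):
    """Run the gated recurrence over a sequence of scalar inputs."""
    gate = gate_from_step_size(dt)
    x = x0
    states = []
    for u in inputs:
        x = gated_step(x, u, gate)
        states.append(x)
    return states
```

Under these assumptions the two step functions agree for any dt > 0, so a learned gate can be read as a learned step size; with a constant input the state relaxes toward that input, as the ODE predicts.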

artificial intelligence, machine learning, matrix, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

ea57fac4f3bdfbe98591f6f3acd3aae6-Paper-Conference.pdf

Neural Information Processing Systems | Feb-18-2026, 13:57:15 GMT

data mining, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Minnesota (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.92)

Industry: Information Technology > Services (0.45)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
(3 more...)

a730abbcd6cf4a371ca9545db5922442-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 04:54:13 GMT

algorithm, benchmark function, objective function, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Semi-Random Matrix Completion via Flow-Based Adaptive Reweighting

Jonathan A. Kelner, Jerry Li

Neural Information Processing Systems | Oct-10-2025, 20:19:41 GMT

Since worst-case statistical inference problems are often intractable (i.e., without distributional

algorithm, matrix completion, probability 1, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Minnesota (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.92)

Industry: Information Technology > Services (0.45)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
(3 more...)

Evaluating the statistical significance of biclusters

Jason D. Lee, Yuekai Sun, Jonathan E. Taylor

Neural Information Processing Systems | Oct-2-2025, 05:32:23 GMT

Biclustering (also known as submatrix localization) is a problem of high practical relevance in exploratory analysis of high-dimensional data. We develop a framework for performing statistical inference on biclusters found by score-based algorithms. Since the bicluster was selected in a data-dependent manner by a biclustering or localization algorithm, this is a form of selective inference. Our framework gives exact (non-asymptotic) confidence intervals and p-values for the significance of the selected biclusters.
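As a rough illustration of what a score-based selection looks like, the sketch below exhaustively searches for the submatrix of a fixed shape with the largest entry sum. The scoring rule (sum of entries), the fixed shape, and the function name are assumptions made for this example; this is not the paper's algorithm, and the search is only practical for very small matrices.

```python
from itertools import combinations


def best_bicluster(matrix, k_rows, k_cols):
    """Return (score, rows, cols) for the k_rows x k_cols submatrix with the largest sum.

    Brute-force search over all row and column subsets (hypothetical helper).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    best = None
    for rows in combinations(range(n_rows), k_rows):
        for cols in combinations(range(n_cols), k_cols):
            # Score of this candidate bicluster: the sum of its entries.
            score = sum(matrix[r][c] for r in rows for c in cols)
            if best is None or score > best[0]:
                best = (score, rows, cols)
    return best
```

Because the returned block is chosen as the maximizer over many candidates, its score is biased upward even on pure noise, which is why a naive test on the selected block is not valid and a selective-inference correction is needed.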

algorithm, selection event, submatrix, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Machine Learning (0.70)

a730abbcd6cf4a371ca9545db5922442-Supplemental-Conference.pdf

Neural Information Processing Systems | Aug-17-2025, 10:40:10 GMT

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Spectral Estimation with Free Decompression

Ameli, Siavash, van der Heide, Chris, Hodgkinson, Liam, Mahoney, Michael W.

arXiv.org Machine Learning | Jun-16-2025

Computing eigenvalues of very large matrices is a critical task in many machine learning applications, including the evaluation of log-determinants, the trace of matrix functions, and other important metrics. As datasets continue to grow in scale, the corresponding covariance and kernel matrices become increasingly large, often reaching magnitudes that make their direct formation impractical or impossible. Existing techniques typically rely on matrix-vector products, which can provide efficient approximations, if the matrix spectrum behaves well. However, in settings like distributed learning, or when the matrix is defined only indirectly, access to the full data set can be restricted to only very small sub-matrices of the original matrix. In these cases, the matrix of nominal interest is not even available as an implicit operator, meaning that even matrix-vector products may not be available. In such settings, the matrix is "impalpable," in the sense that we have access to only masked snapshots of it. We draw on principles from free probability theory to introduce a novel method of "free decompression" to estimate the spectrum of such matrices. Our method can be used to extrapolate from the empirical spectral densities of small submatrices to infer the eigenspectrum of extremely large (impalpable) matrices (that we cannot form or even evaluate with full matrix-vector products). We demonstrate the effectiveness of this approach through a series of examples, comparing its performance against known limiting distributions from random matrix theory in synthetic settings, as well as applying it to submatrices of real-world datasets, matching them with their full empirical eigenspectra.

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

2506.11994

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.70)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Cross-Channel Unlabeled Sensing over a Union of Signal Subspaces

Koka, Taulant, Tsakiris, Manolis C., Haro, Benjamín Béjar, Muma, Michael

arXiv.org Artificial Intelligence | Jun-12-2025

Cross-channel unlabeled sensing addresses the problem of recovering a multi-channel signal from measurements that were shuffled across channels. This work expands the cross-channel unlabeled sensing framework to signals that lie in a union of subspaces. The extension allows for handling more complex signal structures and broadens the framework to tasks like compressed sensing. These mismatches between samples and channels often arise in applications such as whole-brain calcium imaging of freely moving organisms or multi-target tracking. We improve over previous models by deriving tighter bounds on the required number of samples for unique reconstruction, while supporting more general signal types. The approach is validated through an application in whole-brain calcium imaging, where organism movements disrupt sample-to-neuron mappings. This demonstrates the utility of our framework in real-world settings with imprecise sample-channel associations, achieving accurate signal reconstruction.

artificial intelligence, machine learning, subspace, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICASSP49660.2025.10888212

2506.09773

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

Constructing Stochastic Matrices for Weighted Averaging in Gossip Networks

Bayram, Erkan, Belabbas, Mohamed-Ali

arXiv.org Artificial Intelligence | Feb-27-2025

The convergence of the gossip process has been extensively studied; however, algorithms that generate a set of stochastic matrices, the infinite product of which converges to a rank-one matrix determined by a given weight vector, have been less explored. In this work, we propose an algorithm for constructing (local) stochastic matrices based on a given gossip network topology and a set of weights for averaging across different consensus clusters, ensuring that the gossip process converges to a finite limit set.
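The idea of a gossip process converging to a weighted average can be sketched as follows. This is a simple pairwise scheme written for illustration, not the construction proposed in the paper: it assumes strictly positive weights and a connected graph, and the function names are hypothetical. Each pairwise update is a local row-stochastic operation that leaves the weighted sum of the values unchanged, so repeated sweeps drive every node toward the weighted mean.

```python
def weighted_gossip(values, weights, edges, sweeps=200):
    """Repeatedly replace the values at both ends of each edge by their weighted average.

    Assumes positive weights and a connected edge list (illustrative sketch).
    """
    x = list(values)
    for _ in range(sweeps):
        for i, j in edges:
            # Local update touching only nodes i and j; it preserves
            # weights[i] * x[i] + weights[j] * x[j].
            avg = (weights[i] * x[i] + weights[j] * x[j]) / (weights[i] + weights[j])
            x[i] = avg
            x[j] = avg
    return x


def weighted_mean(values, weights):
    """Consensus value the process is expected to reach."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)
```

In matrix terms, each update applies a stochastic matrix that differs from the identity only in rows i and j, and under the stated assumptions the product of these matrices approaches a rank-one matrix whose rows are all equal to the normalized weight vector.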

local stochastic matrix, matrix, stochastic matrix, (16 more...)

arXiv.org Artificial Intelligence

2502.19821

Country: North America > United States > Illinois > Champaign County > Urbana (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

AMR-Evol: Adaptive Modular Response Evolution Elicits Better Knowledge Distillation for Large Language Models in Code Generation

Luo, Ziyang, Li, Xin, Lin, Hongzhan, Ma, Jing, Bing, Lidong

arXiv.org Artificial Intelligence | Oct-1-2024

The impressive performance of proprietary LLMs like GPT4 in code generation has led to a trend to replicate these capabilities in open-source models through knowledge distillation (e.g., Code Evol-Instruct). However, these efforts often neglect the crucial aspect of response quality, relying heavily on teacher models for direct response distillation. This paradigm, especially for complex instructions, can degrade the quality of synthesized data, compromising the knowledge distillation process. To this end, our study introduces the Adaptive Modular Response Evolution (AMR-Evol) framework, which employs a two-stage process to refine response distillation. The first stage, modular decomposition, breaks down the direct response into more manageable sub-modules. The second stage, adaptive response evolution, automatically evolves the response with the related function modules. Our experiments with three popular code benchmarks (HumanEval, MBPP, and EvalPlus) attest to the superiority of the AMR-Evol framework over baseline response distillation methods. Compared with open-source Code LLMs trained on a similar scale of data, we observed performance enhancements of more than +3.0 points on HumanEval-Plus and +1.0 points on MBPP-Plus, which underscores the effectiveness of our framework. Our code is available at https://github.com/ChiYeungLaw/AMR-Evol.

determinant, distillation, matrix, (17 more...)

arXiv.org Artificial Intelligence

2410.00558

Country: