spectral
Universal Sequence Preconditioning
We study the problem of preconditioning in sequential prediction. From the theoretical lens of linear dynamical systems, we show that convolving the target sequence corresponds to applying a polynomial to the hidden transition matrix. Building on this insight, we propose a universal preconditioning method that convolves the target with coefficients from orthogonal polynomials such as Chebyshev or Legendre. We prove that this approach reduces regret for two distinct prediction algorithms and yields the first ever sublinear and hidden-dimension-independent regret bounds (up to logarithmic factors) that hold for systems with marginally stable and asymmetric transition matrices. Finally, extensive synthetic and realworld experiments show that this simple preconditioning strategy improves the performance of a diverse range of algorithms, including recurrent neural networks, and generalizes to signals beyond linear dynamical systems.
Structure-Aware Spectral Sparsification via Uniform Edge Sampling
Spectral clustering is a fundamental method for graph partitioning, but its reliance on eigenvector computation limits scalability to massive graphs. Classical sparsification methods preserve spectral properties by sampling edges proportionally to their effective resistances, but require expensive preprocessing to estimate these resistances. We study whether uniform edge sampling--a simple, structure-agnostic strategy--can suffice for spectral clustering. Our main result shows that for graphs admitting a well-separated k-clustering, characterized by a large structure ratio ฮฅ(k) = ฮปk+1/ฯG(k), uniform sampling preserves the spectral subspace used for clustering. Specifically, we prove that uniformly sampling O(ฮณ2nlogn/ฮต2) edges, where ฮณ is the Laplacian condition number, yields a sparsifier whose top (n k)dimensional eigenspace is approximately orthogonal to the cluster indicators.
Structure-Aware Spectral Sparsification via Uniform Edge Sampling
Spectral clustering is a fundamental method for graph partitioning, but its reliance on eigenvector computation limits scalability to massive graphs. Classical sparsification methods preserve spectral properties by sampling edges proportionally to their effective resistances, but require expensive preprocessing to estimate these resistances. We study whether uniform edge sampling--a simple, structure-agnostic strategy--can suffice for spectral clustering. Our main result shows that for graphs admitting a well-separated $k$-clustering, characterized by a large structure ratio $\Upsilon(k) = \lambda_{k+1} / \rho_G(k)$, uniform sampling preserves the spectral subspace used for clustering. Specifically, we prove that uniformly sampling $O(\gamma^2 n \log n / \varepsilon^2)$ edges, where $\gamma$ is the Laplacian condition number, yields a sparsifier whose top $(n-k)$-dimensional eigenspace is approximately orthogonal to the cluster indicators.
Amortized Neural Clustering of Time Series based on Statistical Features
Lรณpez-Oriona, รngel, Sun, Ying
This paper introduces an algorithm-agnostic approach to feature-based time series clustering via amortized neural inference. By training neural networks to approximate the optimal partitioning rule from simulated data, the proposed framework reduces reliance on conventional clustering methods, such as $K$-means, $K$-medoids, or hierarchical clustering, and their associated objective functions and heuristics. Leveraging statistical features, such as autocorrelations and quantile autocorrelations, the approach learns a data-driven affinity structure from which clustering partitions can be recovered, without requiring explicit prior specification of cluster shapes or structures. In addition, one version of the method can automatically determine the number of clusters, avoiding ad-hoc selection procedures. Comprehensive empirical studies show that the proposed framework achieves competitive or superior clustering accuracy relative to traditional methods, even in challenging scenarios where competing techniques are provided with the true number of clusters. An application to financial time series of stock returns illustrates its practical utility. By reducing the need for algorithm selection and calibration, the proposed framework opens new possibilities for automated, adaptive, and data-driven clustering of temporal data across scientific and industrial domains.
Boosting Spectral Clustering on Incomplete Data via Kernel Correction and Affinity Learning
Spectral clustering has gained popularity for clustering non-convex data due to its simplicity and effectiveness. It is essential to construct a similarity graph using a high-quality affinity measure that models the local neighborhood relations among the data samples. However, incomplete data can lead to inaccurate affinity measures, resulting in degraded clustering performance. To address these issues, we propose an imputation-free framework with two novel approaches to improve spectral clustering on incomplete data. Firstly, we introduce a new kernel correction method that enhances the quality of the kernel matrix estimated on incomplete data with a theoretical guarantee, benefiting classical spectral clustering on pre-defined kernels. Secondly, we develop a series of affinity learning methods that equip the selfexpressive framework with โp-norm to construct an intrinsic affinity matrix with an adaptive extension. Our methods outperform existing data imputation and distance calibration techniques on benchmark datasets, offering a promising solution to spectral clustering on incomplete data in various real-world applications.
5446f217e9504bc593ad9dcf2ec88dda-Supplemental.pdf
Python notebooks producing the figures of this paper are available at https://github.com/ Let F be the joint distribution on RTD obtained by first assigning a random vector =( 1| | T) via F and then applying the map to each of the components t, and let F 1,..., F T denote the corresponding marginal distributions on RD. Given F and t F t, define the second moment matrices = E[ >] 2 RTD TD and t = E[ t >t ] 2 RD D, and let r = rank() and rt = rank( t). Let M 2 Rr TD be a matrix whose rows form a basis of supp(F), and similarly let Nt 2 Rrt D be a matrix whose rows form a basis of supp(F t). Let N = diag(N1,..., NT), and define the Writing V =( V1| | VT), where Vt 2 Rrt d has rank dr, we can then construct the singular value decompositions Vt = Ut tW>t, with Ut 2 O(rt dt), t 2 Rdt dt and Wt 2 O(d dt), where dt = rank(Vt).
Spectral embedding for dynamic networks with stability guarantees
We consider the problem of embedding a dynamic network, to obtain time-evolving vector representations of each node, which can then be used to describe changes in behaviour of individual nodes, communities, or the entire graph. Given this open-ended remit, we argue that two types of stability in the spatio-temporal positioning of nodes are desirable: to assign the same position, up to noise, to nodes behaving similarly at a given time (cross-sectional stability) and a constant position, up to noise, to a single node behaving similarly across different times (longitudinal stability). Similarity in behaviour is defined formally using notions of exchangeability under a dynamic latent position network model. By showing how this model can be recast as a multilayer random dot product graph, we demonstrate that unfolded adjacency spectral embedding satisfies both stability conditions. We also show how two alternative methods, omnibus and independent spectral embedding, alternately lack one or the other form of stability.
Matrix factorisation and the interpretation of geodesic distance
Given a graph or similarity matrix, we consider the problem of recovering a notion of true distance between the nodes, and so their true positions. We show that this can be accomplished in two steps: matrix factorisation, followed by nonlinear dimension reduction. This combination is effective because the point cloud obtained in the first step lives close to a manifold in which latent distance is encoded as geodesic distance. Hence, a nonlinear dimension reduction tool, approximating geodesic distance, can recover the latent positions, up to a simple transformation. We give a detailed account of the case where spectral embedding is used, followed by Isomap, and provide encouraging experimental evidence for other combinations of techniques.
Supplementary Materials for the Paper " Towards Free Data Selection with General-Purpose Models " Anonymous Author(s) Affiliation Address email
In this supplementary material, we first explain the details of spectral clustering algorithm in Sec. B. We also analyze the sensitivity of FreeSel to the values of hyperparameters in3 Sec. C. Besides, FreeSel is compared with other intuitive baselines using the general-purpose model4 in Sec. D. Finally, implementation details of our experiments are explained in Sec. E. Our code will5 be made publicly available.6 In this section, we explain the spectral clustering algorithm [14, 18] in the semantic pattern extraction8 process for each image I (Sec.