 scatter matrix


On the Spectral Structure and Objective Equivalence of Orthogonal Multilabel Fisher Discriminants

arXiv.org Machine Learning

We provide a unified theoretical analysis of Linear Discriminant Analysis with simultaneous multilabel scatter matrix formulations and Stiefel orthogonality constraints. Our contributions span both algebraic structure and statistical guarantees. On the algebraic side, we characterize the rank of the multilabel between-class scatter matrix, showing that the effective discriminant dimensionality can strictly exceed the classical single-label bound of $C-1$; we establish a multilabel partition of variance and prove that all four Fisher objectives are equivalent under the $W^\top S_t^{ML} W = I_r$ constraint while characterizing their divergence under the Stiefel constraint; and we prove a two-sided label-distance preservation bound relating projected distances to Hamming distances in label space. On the statistical side, we establish a finite-sample $O(k_{\max}\sqrt{d\log d/n}/gap_r)$ bound on the subspace estimation error under sub-Gaussian noise with a matching $Ω(σ^2 d/(n\,gap_r))$ minimax lower bound, establishing a near-minimax-optimal rate (matching up to logarithmic and $k_{\max}$ factors) for multilabel discriminant subspace estimation. We further provide high-probability distance concentration, robustness guarantees under label interactions, and a regularization analysis preserving the spectral structure when $d \gg n$. All results are verified numerically on synthetic data generated from the linear label-effect model, covering both the algebraic identities and the multilabel-specific quantities ($k_{\max}$, $κ(S_t^{ML})$, $\|Γ/n\|_2$, $Δ_r$) that govern the statistical bounds. The numerical experiments are designed as a sanity check for the theorems rather than as an empirical benchmark; evaluation on real multilabel datasets is left to future work targeting application-oriented venues.
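For reference, the classical single-label facts that the abstract above generalizes can be checked numerically: the partition of scatter $S_t = S_w + S_b$, and the bound $\mathrm{rank}(S_b) \le C-1$. The following is a minimal sketch in plain Python on a made-up two-class toy dataset. The data, the helper names, and the single-label setting are assumptions for illustration; this is not the paper's multilabel construction.

```python
# Classical single-label scatter matrices on a toy dataset (illustrative only).

def mean(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def zeros(d):
    return [[0.0] * d for _ in range(d)]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, c):
    return [[c * a for a in row] for row in A]

def scatter(rows, center):
    """Sum of outer products (x - center)(x - center)^T over all rows."""
    S = zeros(len(center))
    for x in rows:
        diff = [xi - ci for xi, ci in zip(x, center)]
        S = add(S, outer(diff, diff))
    return S

# Hypothetical toy data: six 2-D points, two classes (C = 2).
X = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [6.0, 5.0], [5.0, 6.0], [5.5, 6.5]]
y = [0, 0, 0, 1, 1, 1]

mu = mean(X)
S_t = scatter(X, mu)          # total scatter
S_w = zeros(2)                # within-class scatter
S_b = zeros(2)                # between-class scatter
for c in sorted(set(y)):
    Xc = [x for x, label in zip(X, y) if label == c]
    mu_c = mean(Xc)
    S_w = add(S_w, scatter(Xc, mu_c))
    diff = [a - b for a, b in zip(mu_c, mu)]
    S_b = add(S_b, scale(outer(diff, diff), len(Xc)))

# Largest entrywise deviation from S_t = S_w + S_b (should be ~0).
max_gap = max(
    abs(t - (w + b))
    for rt, rw, rb in zip(S_t, S_w, S_b)
    for t, w, b in zip(rt, rw, rb)
)
# With C = 2 classes, S_b has rank at most C - 1 = 1, so its 2x2 determinant is ~0.
det_S_b = S_b[0][0] * S_b[1][1] - S_b[0][1] * S_b[1][0]
print("max |S_t - (S_w + S_b)| =", max_gap)
print("det(S_b) =", det_S_b)
```

In the single-label case each sample belongs to exactly one class, which is what forces the class-mean deviations to be linearly dependent and caps the rank at $C-1$. The abstract's claim is that this cap can be exceeded once samples carry several labels at once.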




Generalized Laplacian Eigenmaps

Neural Information Processing Systems

Graph contrastive learning attracts/disperses node representations for similar/dissimilar node pairs under some notion of similarity. It may be combined with a low-dimensional embedding of nodes to preserve intrinsic and structural properties of a graph. COLES, a recent graph contrastive method, combines traditional graph embedding and negative sampling into one framework. COLES in fact minimizes the trace difference between the within-class scatter matrix encapsulating the graph connectivity and the total scatter matrix encapsulating negative sampling. In this paper, we propose a more essential framework for graph embedding, called Generalized Laplacian EigeNmaps (GLEN), which learns a graph representation by maximizing the rank difference between the total scatter matrix and the within-class scatter matrix, resulting in the minimum class separation guarantee. However, the rank difference minimization is an NP-hard problem. Thus, we replace the trace difference that corresponds to the difference of nuclear norms by the difference of LogDet expressions, which we argue is a more accurate surrogate for the NP-hard rank difference than the trace difference. While enjoying a lesser computational cost, the difference of LogDet terms is lower-bounded by the Affine-invariant Riemannian metric (AIRM) and the Jensen-Bregman LogDet Divergence (JBLD), and upper-bounded by AIRM scaled by the factor of $\sqrt{m}$. We show that GLEN offers favourable accuracy/scalability compared to state-of-the-art baselines.
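To illustrate why a LogDet expression can track rank more closely than a trace, the sketch below compares the two on the eigenvalues of two hypothetical positive semidefinite matrices. It uses the smoothed form $\log\det(I + S/\epsilon) = \sum_i \log(1 + \lambda_i/\epsilon)$, a commonly used rank surrogate, normalized so that an eigenvalue of 1 counts as 1. This form, the value of $\epsilon$, the function names, and the spectra are assumptions for illustration, not necessarily the exact expressions used in GLEN.

```python
import math

def exact_rank(eigvals, tol=1e-12):
    """Number of eigenvalues above a small tolerance."""
    return sum(1 for lam in eigvals if lam > tol)

def trace_surrogate(eigvals):
    """Trace of a PSD matrix, i.e. its nuclear norm, from its eigenvalues."""
    return sum(eigvals)

def logdet_surrogate(eigvals, eps=1e-3):
    """log det(I + S/eps), normalized so that an eigenvalue of 1 contributes 1."""
    total = sum(math.log1p(lam / eps) for lam in eigvals)
    return total / math.log1p(1.0 / eps)

# Two hypothetical spectra with the same trace but different rank.
low_rank = [1.0, 1.0, 0.0, 0.0]    # rank 2
full_rank = [0.5, 0.5, 0.5, 0.5]   # rank 4

for name, spectrum in [("low_rank", low_rank), ("full_rank", full_rank)]:
    print(
        name,
        "rank =", exact_rank(spectrum),
        "trace =", trace_surrogate(spectrum),
        "logdet =", round(logdet_surrogate(spectrum), 3),
    )
```

The trace is identical for both spectra, so it cannot tell the rank-2 matrix from the rank-4 one, while the normalized LogDet value is larger for the higher-rank spectrum. Smaller values of `eps` push the surrogate closer to the exact rank, at the cost of a less smooth objective.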


In-Process Monitoring of Gear Power Honing Using Vibration Signal Analysis and Machine Learning

arXiv.org Artificial Intelligence

In modern gear manufacturing, stringent Noise, Vibration, and Harshness (NVH) requirements demand high-precision finishing operations such as power honing. Conventional quality control strategies rely on post-process inspections and Statistical Process Control (SPC), which fail to capture transient machining anomalies and cannot ensure real-time defect detection. This study proposes a novel, data-driven framework for in-process monitoring of gear power honing using vibration signal analysis and machine learning. Our proposed methodology involves continuous data acquisition via accelerometers, followed by time-frequency signal analysis. We investigate and compare the efficacy of three subspace learning methods for feature extraction: (1) Principal Component Analysis (PCA) for dimensionality reduction; (2) a two-stage framework combining PCA with Linear Discriminant Analysis (LDA) for enhanced class separation; and (3) Uncorrelated Multilinear Discriminant Analysis with Regularization (R-UMLDA), adapted for tensor data, which enforces feature decorrelation and includes regularization for small sample sizes. These extracted features are then fed into a Support Vector Machine (SVM) classifier to predict four distinct gear quality categories, established through rigorous geometrical inspections and test bench results of assembled gearboxes. The models are trained and validated on an experimental dataset collected in an industrial context during gear power-honing operations, with gears classified into four different quality categories. The proposed framework achieves high classification accuracy (up to 100%) in an industrial setting. The approach offers interpretable spectral features that correlate with process dynamics, enabling practical integration into real-time monitoring and predictive maintenance systems.



parts of the proposed method might not be explained enough, which might make it difficult to appreciate some of the

Neural Information Processing Systems

We thank all the reviewers for the responses and detailed comments. The first difference between the earlier SRM versus our SSTL lies in defining the shared space. Empirical studies in [3] also showed that the original forms of SRM and HA (i.e., the Yes, this subject ordering can matter, but this is fairly standard -- i.e., The revised version will explicitly summarize the entire training and performance processes. Reviewer 1: Thank you for your insightful comments. Instead, we said that 'scatter matrices The revision will address all of those comments.


Efficient Estimation of Regularized Tyler's M-Estimator Using Approximate LOOCV

arXiv.org Machine Learning

We consider the problem of estimating a regularization parameter, or a shrinkage coefficient $α\in (0,1)$, for Regularized Tyler's M-estimator (RTME). In particular, we propose to estimate an optimal shrinkage coefficient by setting $α$ as the solution to a suitably chosen objective function, namely the leave-one-out cross-validated (LOOCV) log-likelihood loss. Since LOOCV is computationally prohibitive even for moderate sample size $n$, we propose a computationally efficient approximation for the LOOCV log-likelihood loss that eliminates the need for invoking the RTME procedure $n$ times for each sample left out during the LOOCV procedure. This approximation yields an $O(n)$ reduction in the running time complexity for the LOOCV procedure, which results in a significant speedup for computing the LOOCV estimate. We demonstrate the efficiency and accuracy of the proposed approach on synthetic high-dimensional data sampled from heavy-tailed elliptical distributions, as well as on real high-dimensional datasets for object recognition, face recognition, and handwritten digit recognition. Our experiments show that the proposed approach is efficient and consistently more accurate than other methods in the literature for shrinkage coefficient estimation.


Document Author Classification Using Parsed Language Structure

arXiv.org Artificial Intelligence

Over the years there has been ongoing interest in detecting authorship of a text based on statistical properties of the text, such as by using occurrence rates of noncontextual words. In previous work, these techniques have been used, for example, to determine authorship of all of The Federalist Papers. Such methods may be useful in more modern times to detect fake or AI authorship. Progress in statistical natural language parsers introduces the possibility of using grammatical structure to detect authorship. In this paper we explore a new possibility for detecting authorship using grammatical structural information extracted using a statistical natural language parser. This paper provides a proof of concept, testing author classification based on grammatical structure on a set of "proof texts," The Federalist Papers and Sanditon, which have been used as test cases in previous authorship detection studies. Several features extracted from the statistical natural language parser were explored: all subtrees of some depth from any level; rooted subtrees of some depth; part of speech; and part of speech by level in the parse tree. It was found to be helpful to project the features into a lower dimensional space. Statistical experiments on these documents demonstrate that information from a statistical parser can, in fact, assist in distinguishing authors.