Goto

Collaborating Authors

 Mathematical & Statistical Methods


Robust Detection of Planted Subgraphs in Semi-Random Models

arXiv.org Artificial Intelligence

Detection of planted subgraphs in Erdรถs-Rรฉnyi random graphs has been extensively studied, leading to a rich body of results characterizing both statistical and computational thresholds. However, most prior work assumes a purely random generative model, making the resulting algorithms potentially fragile in the face of real-world perturbations. In this work, we initiate the study of semi-random models for the planted subgraph detection problem, wherein an adversary is allowed to remove edges outside the planted subgraph before the graph is revealed to the statistician. Crucially, the statistician remains unaware of which edges have been removed, introducing fundamental challenges to the inference task. We establish fundamental statistical limits for detection under this semi-random model, revealing a sharp dichotomy. Specifically, for planted subgraphs with strongly sub-logarithmic maximum density detection becomes information-theoretically impossible in the presence of an adversary, despite being possible in the classical random model. In stark contrast, for subgraphs with super-logarithmic density, the statistical limits remain essentially unchanged; we prove that the optimal (albeit computationally intractable) likelihood ratio test remains robust. Beyond these statistical boundaries, we design a new computationally efficient and robust detection algorithm, and provide rigorous statistical guarantees for its performance. Our results establish the first robust framework for planted subgraph detection and open new directions in the study of semi-random models, computational-statistical trade-offs, and robustness in graph inference problems.


FedLAD: A Linear Algebra Based Data Poisoning Defence for Federated Learning

arXiv.org Artificial Intelligence

Sybil attacks pose a significant threat to federated learning, as malicious nodes can collaborate and gain a majority, thereby overwhelming the system. Therefore, it is essential to develop countermeasures that ensure the security of federated learning environments. We present a novel defence method against targeted data poisoning, which is one of the types of Sybil attacks, called Linear Algebra-based Detection (FedLAD). Unlike existing approaches, such as clustering and robust training, which struggle in situations where malicious nodes dominate, FedLAD models the federated learning aggregation process as a linear problem, transforming it into a linear algebra optimisation challenge. This method identifies potential attacks by extracting the independent linear combinations from the original linear combinations, effectively filtering out redundant and malicious elements. Extensive experimental evaluations demonstrate the effectiveness of FedLAD compared to five well-established defence methods: Sherpa, CONTRA, Median, Trimmed Mean, and Krum. Using tasks from both image classification and natural language processing, our experiments confirm that FedLAD is robust and not dependent on specific application settings. The results indicate that FedLAD effectively protects federated learning systems across a broad spectrum of malicious node ratios. Compared to baseline defence methods, FedLAD maintains a low attack success rate for malicious nodes when their ratio ranges from 0.2 to 0.8. Additionally, it preserves high model accuracy when the malicious node ratio is between 0.2 and 0.5. These findings underscore FedLAD's potential to enhance both the reliability and performance of federated learning systems in the face of data poisoning attacks.


A Smoothing Newton Method for Rank-one Matrix Recovery

arXiv.org Machine Learning

--We consider the phase retrieval problem, which involves recovering a rank-one positive semidefinite matrix from rank-one measurements. A recently proposed algorithm based on Bures-Wasserstein gradient descent (BWGD) exhibits superlinear convergence, but it is unstable, and existing theory can only prove local linear convergence for higher rank matrix recovery. We resolve this gap by revealing that BWGD implements Newton's method with a nonsmooth and nonconvex objective. We develop a smoothing framework that regularizes the objective, enabling a stable method with rigorous superlinear convergence guarantees. Experiments on synthetic data demonstrate this superior stability while maintaining fast convergence. Phase retrieval--the problem of recovering a real or complex signal from magnitude-only measurements--is a fundamental problem in signal processing. Its applications range from X-ray crystallography to astronomical imaging, where measurement systems capture a form of intensity [Harrison, 1993, Fienup, 1982, Fienup and Dainty, 1987, Miao et al., 1998]. The seemingly simple constraint of measuring magnitudes transforms what would be a linear problem into a challenging nonlinear and nonconvex optimization problem. More critically, the direct optimization formulation using least squares yields a nonconvex objective function, making it difficult to solve effectively. The phase retrieval problem is a specific instance of a broader class of low-rank matrix sensing problems that arise throughout signal processing [Recht et al., 2010].


Typing Tensor Calculus in 2-Categories (I)

arXiv.org Artificial Intelligence

To formalize calculations in linear algebra for the development of efficient algorithms and a framework suitable for functional programming languages and faster parallelized computations, we adopt an approach that treats elements of linear algebra, such as matrices, as morphisms in the category of matrices, $\mathbf{Mat_{k}}$. This framework is further extended by generalizing the results to arbitrary monoidal semiadditive categories. To enrich this perspective and accommodate higher-rank matrices (tensors), we define semiadditive 2-categories, where matrices $T_{ij}$ are represented as 1-morphisms, and tensors with four indices $T_{ijkl}$ as 2-morphisms. This formalization provides an index-free, typed linear algebra framework that includes matrices and tensors with up to four indices. Furthermore, we extend the framework to monoidal semiadditive 2-categories and demonstrate detailed operations and vectorization within the 2-category of 2Vec introduced by Kapranov and Voevodsky.


Thermodynamics-Inspired Computing with Oscillatory Neural Networks for Inverse Matrix Computation

arXiv.org Artificial Intelligence

We describe a thermodynamic-inspired computing paradigm based on oscillatory neural networks (ONNs). While ONNs have been widely studied as Ising machines for tackling complex combinatorial optimization problems, this work investigates their feasibility in solving linear algebra problems, specifically the inverse matrix. Grounded in thermodynamic principles, we analytically demonstrate that the linear approximation of the coupled Kuramoto oscillator model leads to the inverse matrix solution. Numerical simulations validate the theoretical framework, and we examine the parameter regimes that computation has the highest accuracy.


Feed-anywhere ANN (I) Steady Discrete $\to$ Diffusing on Graph Hidden States

arXiv.org Machine Learning

We propose a novel framework for learning hidden graph structures from data using geometric analysis and nonlinear dynamics. Our approach: (1) Defines discrete Sobolev spaces on graphs for scalar/vector fields, establishing key functional properties; (2) Introduces gauge-equivalent nonlinear Schrรถdinger and Landau--Lifshitz dynamics with provable stable stationary solutions smoothly dependent on input data and graph weights; (3) Develops a stochastic gradient algorithm over graph moduli spaces with sparsity regularization. Theoretically, we guarantee: topological correctness (homology recovery), metric convergence (Gromov--Hausdorff), and efficient search space utilization. Our dynamics-based model achieves stronger generalization bounds than standard neural networks, with complexity dependent on the data manifold's topology.


Federated Calculation of the Free-Support Transportation Barycenter by Single-Loop Dual Decomposition

arXiv.org Machine Learning

We propose an efficient federated dual decomposition algorithm for calculating the Wasserstein barycenter of several distributions, including choosing the support of the solution. The algorithm does not access local data and uses only highly aggregated information. It also does not require repeated solutions to mass transportation problems. Because of the absence of any matrix-vector operations, the algorithm exhibits a very low complexity of each iteration and significant scalability. We illustrate its virtues and compare it to the state-of-the-art methods on several examples of mixture models.


Query Efficient Structured Matrix Learning

arXiv.org Artificial Intelligence

We study the problem of learning a structured approximation (low-rank, sparse, banded, etc.) to an unknown matrix $A$ given access to matrix-vector product (matvec) queries of the form $x \rightarrow Ax$ and $x \rightarrow A^Tx$. This problem is of central importance to algorithms across scientific computing and machine learning, with applications to fast multiplication and inversion for structured matrices, building preconditioners for first-order optimization, and as a model for differential operator learning. Prior work focuses on obtaining query complexity upper and lower bounds for learning specific structured matrix families that commonly arise in applications. We initiate the study of the problem in greater generality, aiming to understand the query complexity of learning approximations from general matrix families. Our main result focuses on finding a near-optimal approximation to $A$ from any finite-sized family of matrices, $\mathcal{F}$. Standard results from matrix sketching show that $O(\log|\mathcal{F}|)$ matvec queries suffice in this setting. This bound can also be achieved, and is optimal, for vector-matrix-vector queries of the form $x,y\rightarrow x^TAy$, which have been widely studied in work on rank-$1$ matrix sensing. Surprisingly, we show that, in the matvec model, it is possible to obtain a nearly quadratic improvement in complexity, to $\tilde{O}(\sqrt{\log|\mathcal{F}|})$. Further, we prove that this bound is tight up to log-log factors.Via covering number arguments, our result extends to well-studied infinite families. As an example, we establish that a near-optimal approximation from any \emph{linear matrix family} of dimension $q$ can be learned with $\tilde{O}(\sqrt{q})$ matvec queries, improving on an $O(q)$ bound achievable via sketching techniques and vector-matrix-vector queries.


A 'Grand Unified Theory' of Math Just Got a Little Bit Closer

WIRED

The original version of this story appeared in Quanta Magazine. In 1994, an earthquake of a proof shook up the mathematical world. The mathematician Andrew Wiles had finally settled Fermat's Last Theorem, a central problem in number theory that had remained open for over three centuries. The proof didn't just enthral mathematicians--it made the front page of The New York Times. But to accomplish it, Wiles (with help from the mathematician Richard Taylor) first had to prove a more subtle intermediate statement--one with implications that extended beyond Fermat's puzzle.


Trek-Based Parameter Identification for Linear Causal Models With Arbitrarily Structured Latent Variables

arXiv.org Machine Learning

We develop a criterion to certify whether causal effects are identifiable in linear structural equation models with latent variables. Linear structural equation models correspond to directed graphs whose nodes represent the random variables of interest and whose edges are weighted with linear coefficients that correspond to direct causal effects. In contrast to previous identification methods, we do not restrict ourselves to settings where the latent variables constitute independent latent factors (i.e., to source nodes in the graphical representation of the model). Our novel latent-subgraph criterion is a purely graphical condition that is sufficient for identifiability of causal effects by rational formulas in the covariance matrix. To check the latent-subgraph criterion, we provide a sound and complete algorithm that operates by solving an integer linear program. While it targets effects involving observed variables, our new criterion is also useful for identifying effects between latent variables, as it allows one to transform the given model into a simpler measurement model for which other existing tools become applicable.