Goto

Collaborating Authors

 maxse


Learning Rate Scheduling with Matrix Factorization for Private Training

arXiv.org Machine Learning

We study differentially private model training with stochastic gradient descent under learning rate scheduling and correlated noise. Although correlated noise, in particular via matrix factorizations, has been shown to improve accuracy, prior theoretical work focused primarily on the prefix-sum workload. That workload assumes a constant learning rate, whereas in practice learning rate schedules are widely used to accelerate training and improve convergence. We close this gap by deriving general upper and lower bounds for a broad class of learning rate schedules in both single- and multi-epoch settings. Building on these results, we propose a learning-rate-aware factorization that achieves improvements over prefix-sum factorizations under both MaxSE and MeanSE error metrics. Our theoretical analysis yields memory-efficient constructions suitable for practical deployment, and experiments on CIFAR-10 and IMDB datasets confirm that schedule-aware factorizations improve accuracy in private training.


Normalized Square Root: Sharper Matrix Factorization Bounds for Differentially Private Continual Counting

arXiv.org Artificial Intelligence

The factorization norms of the lower-triangular all-ones $n \times n$ matrix, $γ_2(M_{count})$ and $γ_{F}(M_{count})$, play a central role in differential privacy as they are used to give theoretical justification of the accuracy of the only known production-level private training algorithm of deep neural networks by Google. Prior to this work, the best known upper bound on $γ_2(M_{count})$ was $1 + \frac{\log n}π$ by Mathias (Linear Algebra and Applications, 1993), and the best known lower bound was $\frac{1}π(2 + \log(\frac{2n+1}{3})) \approx 0.507 + \frac{\log n}π$ (Matoušek, Nikolov, Talwar, IMRN 2020), where $\log$ denotes the natural logarithm. Recently, Henzinger and Upadhyay (SODA 2025) gave the first explicit factorization that meets the bound of Mathias (1993) and asked whether there exists an explicit factorization that improves on Mathias' bound. We answer this question in the affirmative. Additionally, we improve the lower bound significantly. More specifically, we show that $$ 0.701 + \frac{\log n}π + o(1) \;\leq\; γ_2(M_{count}) \;\leq\; 0.846 + \frac{\log n}π + o(1). $$ That is, we reduce the gap between the upper and lower bound to $0.14 + o(1)$. We also show that our factors achieve a better upper bound for $γ_{F}(M_{count})$ compared to prior work, and we establish an improved lower bound: $$ 0.701 + \frac{\log n}π + o(1) \;\leq\; γ_{F}(M_{count}) \;\leq\; 0.748 + \frac{\log n}π + o(1). $$ That is, the gap between the lower and upper bound provided by our explicit factorization is $0.047 + o(1)$.


Streaming Private Continual Counting via Binning

arXiv.org Artificial Intelligence

In differential privacy, $\textit{continual observation}$ refers to problems in which we wish to continuously release a function of a dataset that is revealed one element at a time. The challenge is to maintain a good approximation while keeping the combined output over all time steps differentially private. In the special case of $\textit{continual counting}$ we seek to approximate a sum of binary input elements. This problem has received considerable attention lately, in part due to its relevance in implementations of differentially private stochastic gradient descent. $\textit{Factorization mechanisms}$ are the leading approach to continual counting, but the best such mechanisms do not work well in $\textit{streaming}$ settings since they require space proportional to the size of the input. In this paper, we present a simple approach to approximating factorization mechanisms in low space via $\textit{binning}$, where adjacent matrix entries with similar values are changed to be identical in such a way that a matrix-vector product can be maintained in sublinear space. Our approach has provable sublinear space guarantees for a class of lower triangular matrices whose entries are monotonically decreasing away from the diagonal. We show empirically that even with very low space usage we are able to closely match, and sometimes surpass, the performance of asymptotically optimal factorization mechanisms. Recently, and independently of our work, Dvijotham et al. have also suggested an approach to implementing factorization mechanisms in a streaming setting. Their work differs from ours in several respects: It only addresses factorization into $\textit{Toeplitz}$ matrices, only considers $\textit{maximum}$ error, and uses a different technique based on rational function approximation that seems less versatile than our binning approach.