Polynomial Time Cryptanalytic Extraction of Deep Neural Networks in the Hard-Label Setting

Carlini, Nicholas, Chávez-Saab, Jorge, Hambitzer, Anna, Rodríguez-Henríquez, Francisco, Shamir, Adi

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) are valuable assets, yet their public accessibility raises security concerns about parameter extraction by malicious actors. Recent work by Carlini et al. (Crypto'20) and Canales-Martínez et al. (Eurocrypt'24) has drawn parallels between this issue and block cipher key extraction via chosen plaintext attacks. Leveraging differential cryptanalysis, they demonstrated that all the weights and biases of black-box ReLU-based DNNs could be inferred using a polynomial number of queries and computational time. However, their attacks relied on the availability of the exact numeric values of the output logits, which allowed the calculation of their derivatives. To overcome this limitation, Chen et al. (Asiacrypt'24) tackled the more realistic hard-label scenario, where only the final classification label (e.g., "dog" or "car") is accessible to the attacker. They proposed an extraction method requiring a polynomial number of queries but exponential execution time. In addition, their approach was applicable only to a restricted set of architectures, could handle only binary classifiers, and was demonstrated only on tiny neural networks with up to four neurons split among up to two hidden layers. This paper introduces new techniques that, for the first time, achieve cryptanalytic extraction of DNN parameters in the most challenging hard-label setting, using both a polynomial number of queries and polynomial time. We validate our approach by extracting nearly one million parameters from a DNN trained on the CIFAR-10 dataset, comprising 832 neurons in four hidden layers. Our results reveal the surprising fact that all the weights of a ReLU-based DNN can be efficiently determined by analyzing only the geometric shape of its decision boundaries.
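For intuition about what hard-label access permits, here is a minimal numpy sketch (not the paper's attack) of the basic primitive such attacks build on: locating a point on a decision boundary by binary search between two inputs that receive different labels. The oracle `query_label` and the toy linear classifier are hypothetical stand-ins for the black-box DNN.

```python
import numpy as np

def find_boundary_point(query_label, x0, x1, tol=1e-9):
    """Binary-search the segment [x0, x1] for a decision-boundary point,
    using only hard labels. Requires query_label(x0) != query_label(x1)."""
    l0 = query_label(x0)
    assert query_label(x1) != l0, "endpoints must receive different labels"
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if query_label((1 - mid) * x0 + mid * x1) == l0:
            lo = mid   # still on x0's side of the boundary
        else:
            hi = mid   # crossed the boundary
    t = 0.5 * (lo + hi)
    return (1 - t) * x0 + t * x1

# Toy oracle: a hypothetical linear binary classifier standing in for the DNN.
w, b = np.array([1.0, -2.0]), 0.5
oracle = lambda x: int(x @ w + b > 0)
p = find_boundary_point(oracle, np.array([-3.0, 0.0]), np.array([3.0, 0.0]))
print(p, p @ w + b)  # second value is ~0: p lies on the decision boundary
```

Collecting many such boundary points traces out the piecewise-linear decision surface of a ReLU network, whose geometry is what the paper shows suffices to recover all the weights.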


Dual Extrapolation for Sparse Generalized Linear Models

Massias, Mathurin, Vaiter, Samuel, Gramfort, Alexandre, Salmon, Joseph

arXiv.org Machine Learning

Generalized Linear Models (GLMs) form a wide class of regression and classification models, where prediction is a function of a linear combination of the input variables. For statistical inference in high dimensions, sparsity-inducing regularizations have proven useful while offering statistical guarantees. However, solving the resulting optimization problems can be challenging: even for popular iterative algorithms such as coordinate descent, one needs to loop over a large number of variables. To mitigate this, techniques known as screening rules and working sets diminish the size of the optimization problem at hand, either by progressively removing variables or by solving a growing sequence of smaller problems. For both techniques, significant variables are identified thanks to convex duality arguments. In this paper, we show that the dual iterates of a GLM solver exhibit Vector AutoRegressive (VAR) behavior after sign identification, when the primal problem is solved with proximal gradient descent or cyclic coordinate descent. Exploiting this regularity, one can construct dual points that offer tighter certificates of optimality, enhancing the performance of screening rules and helping to design competitive working set algorithms.
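As a concrete instance of that construction, here is a minimal numpy sketch for the quadratic datafit (Lasso) special case: the last few residual iterates, which follow the VAR recursion after sign identification, are combined by Anderson-style extrapolation and then rescaled into a feasible dual point. The function names are illustrative, and for a general GLM the residual would be replaced by the gradient of the datafit.

```python
import numpy as np

def lasso_dual_point(r, X, lam):
    """Rescale a residual r = y - X @ w into a dual-feasible point:
    theta = r / max(lam, ||X.T r||_inf) satisfies ||X.T theta||_inf <= 1."""
    return r / max(lam, np.abs(X.T @ r).max())

def extrapolated_dual_point(residuals, X, lam):
    """Extrapolate the last K+1 residual iterates (a VAR sequence after
    sign identification) into a better residual, then rescale to feasibility."""
    if len(residuals) < 2:               # not enough iterates to extrapolate
        return lasso_dual_point(residuals[-1], X, lam)
    R = np.column_stack(residuals)       # (n, K+1), oldest to newest
    U = np.diff(R, axis=1)               # successive differences, (n, K)
    try:
        z = np.linalg.solve(U.T @ U, np.ones(U.shape[1]))
    except np.linalg.LinAlgError:        # degenerate system: fall back
        return lasso_dual_point(residuals[-1], X, lam)
    c = z / z.sum()                      # affine combination weights
    return lasso_dual_point(R[:, 1:] @ c, X, lam)
```

The extrapolated point typically yields a smaller duality gap than the plain rescaled residual, i.e. a tighter certificate of optimality, at the cost of one small K-by-K linear solve.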


Dual Extrapolation for Faster Lasso Solvers

Massias, Mathurin, Gramfort, Alexandre, Salmon, Joseph

arXiv.org Machine Learning

Convex sparsity-inducing regularizations are ubiquitous in high-dimensional machine learning, but their non-differentiability requires the use of iterative solvers. To accelerate such solvers, state-of-the-art approaches reduce the size of the optimization problem at hand. In the context of regression, this can be achieved either by discarding irrelevant features (screening techniques) or by prioritizing features likely to be included in the support of the solution (working set techniques). Duality comes into play at several steps in these techniques. Here, we propose an extrapolation technique that starts from a sequence of iterates in the dual and leads to the construction of an improved dual point. This enables tighter control of optimality, as used in stopping criteria, as well as better screening performance for Gap Safe rules. Finally, we propose a working set strategy based on an aggressive use of Gap Safe rules and our new dual point construction, which improves state-of-the-art running times on Lasso problems.
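To make the screening step concrete, below is a minimal numpy sketch of the Gap Safe sphere test for the Lasso (the rule of Fercoq et al. that the improved dual point strengthens): a smaller duality gap, such as the one obtained from an extrapolated dual point, shrinks the safe radius and lets more features be discarded. This is an illustration under the standard Lasso formulation, not the paper's exact code.

```python
import numpy as np

def gap_safe_screen(X, y, w, theta, lam):
    """Gap Safe sphere test for min_w 0.5*||y - Xw||^2 + lam*||w||_1.
    Feature j is provably inactive at the optimum whenever
    |x_j.T theta| + r * ||x_j|| < 1, with radius r = sqrt(2*gap)/lam."""
    res = y - X @ w
    primal = 0.5 * res @ res + lam * np.abs(w).sum()
    dual = 0.5 * (y @ y - ((y - lam * theta) ** 2).sum())
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2.0 * gap) / lam
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0   # mask of features that can safely be discarded
```

An aggressive working set strategy then runs the inner solver only on the features that survive this test.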


Subspace-Sparse Representation

You, C., Vidal, R.

arXiv.org Machine Learning

Given an overcomplete dictionary $A$ and a signal $b$ that is a linear combination of a few linearly independent columns of $A$, classical sparse recovery theory deals with the problem of recovering the unique sparse representation $x$ such that $b = Ax$. It is known that under certain conditions on $A$, $x$ can be recovered by the Basis Pursuit (BP) and Orthogonal Matching Pursuit (OMP) algorithms. In this work, we consider the more general case where $b$ lies in a low-dimensional subspace spanned by some columns of $A$, which are possibly linearly dependent. In this case, the sparsest solution $x$ is generally not unique, and we study whether the representation $x$ identifies the subspace, i.e., whether the nonzero entries of $x$ correspond to dictionary atoms that lie in the subspace. Such a representation $x$ is called subspace-sparse. We present sufficient conditions for guaranteeing subspace-sparse recovery, which have clear geometric interpretations and explain properties of subspace-sparse recovery. We also show that these sufficient conditions can be satisfied under a randomized model. Our results apply to the traditional sparse recovery problem, where they yield conditions for sparse recovery that are less restrictive than the canonical mutual coherence condition. We also use the results to analyze the sparse representation-based classification (SRC) method, for which we derive conditions guaranteeing its correctness.
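For reference, here is a minimal numpy sketch of Orthogonal Matching Pursuit, one of the two algorithms whose subspace-sparse recovery the paper analyzes. In this setting the relevant success criterion is that every selected atom lies in the subspace containing $b$, rather than recovery of a unique $x$.

```python
import numpy as np

def omp(A, b, k, tol=1e-10):
    """Orthogonal Matching Pursuit: greedily add the column of A most
    correlated with the residual, then refit b on the selected columns
    by least squares. Returns a (at most) k-sparse coefficient vector x."""
    m = A.shape[1]
    support, coef = [], np.zeros(0)
    residual = b.astype(float)
    for _ in range(k):
        if np.linalg.norm(residual) <= tol:
            break                         # b is already well represented
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        residual = b - A[:, support] @ coef
    x = np.zeros(m)
    x[support] = coef
    return x
```

A representation returned this way is subspace-sparse precisely when `support` indexes only atoms from the subspace spanned by the columns generating $b$; the paper's sufficient conditions guarantee that this happens.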