Beyond Slow Signs in High-fidelity Model Extraction

Neural Information Processing Systems 

Deep neural networks, costly to train and rich in intellectual property value, areincreasingly threatened by model extraction attacks that compromise their confiden-tiality. Previous attacks have succeeded in reverse-engineering model parametersup to a precision of float64 for models trained on random data with at most threehidden layers using cryptanalytical techniques. However, the process was identifiedto be very time consuming and not feasible for larger and deeper models trained onstandard benchmarks. Our study evaluates the feasibility of parameter extractionmethods of Carlini et al. [1] further enhanced by Canales-Martínez et al. [2] formodels trained on standard benchmarks. We introduce a unified codebase thatintegrates previous methods and reveal that computational tools can significantlyinfluence performance.