Goto

Collaborating Authors

 deepclean


DeepClean: Machine Unlearning on the Cheap by Resetting Privacy Sensitive Weights using the Fisher Diagonal

Shi, Jiaeli, Ghalyan, Najah, Gourgoulias, Kostis, Buford, John, Moran, Sean

arXiv.org Artificial Intelligence

Machine learning models trained on sensitive or private data can inadvertently memorize and leak that information. Machine unlearning seeks to retroactively remove such details from model weights to protect privacy. We contribute a lightweight unlearning algorithm that leverages the Fisher Information Matrix (FIM) for selective forgetting. Prior work in this area requires full retraining or large matrix inversions, which are computationally expensive. Our key insight is that the diagonal elements of the FIM, which measure the sensitivity of log-likelihood to changes in weights, contain sufficient information for effective forgetting. Specifically, we compute the FIM diagonal over two subsets -- the data to retain and forget -- for all trainable weights. This diagonal representation approximates the complete FIM while dramatically reducing computation. We then use it to selectively update weights to maximize forgetting of the sensitive subset while minimizing impact on the retained subset. Experiments show that our algorithm can successfully forget any randomly selected subsets of training data across neural network architectures. By leveraging the FIM diagonal, our approach provides an interpretable, lightweight, and efficient solution for machine unlearning with practical privacy benefits.


DeepClean -- self-supervised artefact rejection for intensive care waveform data using generative deep learning

Edinburgh, Tom, Smielewski, Peter, Czosnyka, Marek, Eglen, Stephen J., Ercole, Ari

arXiv.org Machine Learning

Waveform physiological data is important in the treatment of critically ill patients in the intensive care unit. Such recordings are susceptible to artefacts, which must be removed before the data can be re-used for alerting or reprocessed for other clinical or research purposes. The current gold-standard is human annotation, which is painstaking when recordings span many days and has question marks over its reproducibility. In this work, we present DeepClean; a prototype self-supervised artefact detection system using a convolutional variational autoencoder deep neural network that avoids costly manual annotation, requiring only easily-obtained `good' data for training. For a test case with invasive arterial blood pressure, we demonstrate that our algorithm can detect the presence of an artefact within a 10-second sample of data with sensitivity and specificity around 90\%. Furthermore, DeepClean was able to identify regions of artefact within such samples with high accuracy and we show that it significantly outperforms a baseline principle component analysis approach in both signal reconstruction and artefact detection. DeepClean learns a generative model and therefore may also be used for imputation of missing data. Accurate removal of artefacts reduces both bias and uncertainty in clinical assessment and the false negative rate of intensive care unit alarms, and is therefore a key component in providing optimal clinical care.