Collaborating Authors

 Chowdhury, Amrita Roy


FairProof: Confidential and Certifiable Fairness for Neural Networks

arXiv.org Artificial Intelligence

Machine learning models are increasingly used in societal applications, yet legal and privacy concerns often demand that they be kept confidential. Consequently, consumers, who are often at the receiving end of model predictions, increasingly distrust the fairness properties of these models. To this end, we propose FairProof, a system that uses Zero-Knowledge Proofs (ZKPs), a cryptographic primitive, to publicly verify the fairness of a model while maintaining confidentiality. We also propose a fairness certification algorithm for fully-connected neural networks that is well suited to ZKPs and is used in this system. We implement FairProof in Gnark and demonstrate empirically that our system is practically feasible.

Recent usage of ML models in high-stakes societal applications (Khandani et al., 2010; Brennan et al., 2009; Datta et al., 2014) has raised serious concerns about their fairness (Angwin et al., 2016; Vigdor, November 2019; Dastin, October 2018; Wall & Schellmann, June 2021). As a result, there is growing distrust among consumers at the receiving end of ML-based decisions (Dwork & Minow, 2022). To increase consumer trust, we need technology that enables public verification of the fairness properties of these models. A major barrier to such verification is that legal and privacy concerns demand that organizations keep their models confidential. The resulting lack of verifiability can lead to misbehavior such as model swapping, wherein a malicious entity uses different models for different customers, leading to unfair behavior. What is needed, therefore, is a solution that allows public verification of the fairness of a model and ensures that the same model is used for every prediction (model uniformity), while maintaining model confidentiality. The canonical approach to evaluating fairness is a statistics-based third-party audit (Yadav et al., 2022; Yan & Zhang, 2022; Pentyala et al., 2022).


Identifying and Mitigating the Security Risks of Generative AI

arXiv.org Artificial Intelligence

Every major technical invention resurfaces the dual-use dilemma: the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code completion, and text-to-image generation and editing). However, attackers can use GenAI just as well to generate new attacks and to increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic and interesting problems that the research community can work to address.


ShadowNet: A Secure and Efficient On-device Model Inference System for Convolutional Neural Networks

arXiv.org Artificial Intelligence

With the increased usage of AI accelerators on mobile and edge devices, on-device machine learning (ML) is gaining popularity. Thousands of proprietary ML models are being deployed today on billions of untrusted devices. This raises serious security concerns about model privacy. However, protecting model privacy without losing access to the untrusted AI accelerators is a challenging problem. In this paper, we present a novel on-device model inference system, ShadowNet. ShadowNet protects model privacy with a Trusted Execution Environment (TEE) while securely outsourcing the heavy linear layers of the model to the untrusted hardware accelerators. ShadowNet achieves this by transforming the weights of the linear layers before outsourcing them and restoring the results inside the TEE. The non-linear layers are also kept secure inside the TEE. ShadowNet's design ensures efficient transformation of the weights and the subsequent restoration of the results. We build a ShadowNet prototype based on TensorFlow Lite and evaluate it on five popular CNNs, namely, MobileNet, ResNet-44, MiniVGG, ResNet-404, and YOLOv4-tiny. Our evaluation shows that ShadowNet achieves strong security guarantees with reasonable performance, offering a practical solution for secure on-device model inference.
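
The transform-then-restore idea described above can be illustrated with a toy sketch. The abstract does not specify ShadowNet's actual transformation, so the scheme below (a secret row shuffle plus per-row scaling) is an assumption chosen only for illustration, and all function names are hypothetical. It shows the general pattern: the trusted side masks the weights, the untrusted side does the heavy matrix-vector product, and the trusted side undoes the mask on the result.

```python
import random

def transform_weights(W, rng):
    """Trusted side: mask W by shuffling its rows and scaling each row
    by a secret nonzero factor (illustrative scheme, not ShadowNet's)."""
    n = len(W)
    perm = list(range(n))
    rng.shuffle(perm)
    scales = [rng.uniform(0.5, 2.0) for _ in range(n)]
    W_masked = [[scales[i] * w for w in W[perm[i]]] for i in range(n)]
    return W_masked, perm, scales

def untrusted_matvec(W_masked, x):
    """Untrusted accelerator: sees only the masked weights."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W_masked]

def restore_output(y_masked, perm, scales):
    """Trusted side: undo the scaling and the row shuffle."""
    y = [0.0] * len(y_masked)
    for i, value in enumerate(y_masked):
        y[perm[i]] = value / scales[i]
    return y
```

Because the layer is linear, masking each row by a scalar and reordering rows changes the outsourced output in a way the trusted side can invert exactly, up to floating-point error. This toy sketch makes no claim about the security of such a mask.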


Can Membership Inferencing be Refuted?

arXiv.org Artificial Intelligence

The membership inference (MI) attack is currently the most popular test for measuring privacy leakage in machine learning models. Given a machine learning model, a data point and some auxiliary information, the goal of an MI attack is to determine whether the data point was used to train the model. In this work, we study the reliability of membership inference attacks in practice. Specifically, we show that a model owner can plausibly refute the result of a membership inference test on a data point $x$ by constructing a proof of repudiation that proves that the model was trained without $x$. We design efficient algorithms to construct proofs of repudiation for all data points of the training dataset. Our empirical evaluation demonstrates the practical feasibility of our algorithm by constructing proofs of repudiation for popular machine learning models on MNIST and CIFAR-10. Consequently, our results call for a re-evaluation of the implications of membership inference attacks in practice.
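
To make the MI test described above concrete, here is a minimal sketch of one common baseline, a loss-threshold attack: guess "member" when the model's loss on the point is unusually low. This is not the paper's repudiation algorithm. The toy model, the threshold, and all names are hypothetical and exist only to make the example self-contained.

```python
import math

def cross_entropy(probs, label):
    """Negative log-probability the model assigns to the true label."""
    return -math.log(max(probs[label], 1e-12))

def loss_threshold_attack(predict_proba, x, label, threshold):
    """Guess that (x, label) was a training point if its loss is below threshold."""
    return cross_entropy(predict_proba(x), label) < threshold

# Hypothetical stand-in for a trained classifier: very confident on one
# memorized training point, less confident everywhere else.
TRAINING_POINT = (0.0, 1.0)

def toy_predict_proba(x):
    return [0.99, 0.01] if tuple(x) == TRAINING_POINT else [0.6, 0.4]
```

In this sketch the attacker's auxiliary information is the threshold, which in practice would have to be calibrated, for example on data known to be outside the training set.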


Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models

arXiv.org Machine Learning

Directed graphical models (DGMs) are a class of probabilistic models that are widely used for predictive analysis in sensitive domains, such as medical diagnostics. In this paper we present an algorithm for differentially private learning of the parameters of a DGM with a publicly known graph structure over fully observed data. Our solution optimizes for the utility of inference queries over the DGM and adds noise that is customized to the properties of the private input dataset and the graph structure of the DGM. To the best of our knowledge, this is the first explicit data-dependent privacy budget allocation algorithm for DGMs. We compare our algorithm with a standard data-independent approach over a diverse suite of DGM benchmarks and demonstrate that our solution requires a privacy budget that is $3\times$ smaller to obtain the same or higher utility.
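
For orientation, the sketch below shows what a data-independent baseline of the kind mentioned above might look like for a single node: Laplace noise is added to every cell of the node's count table, and the noisy counts are normalized into a conditional probability table. It is not the paper's data-dependent algorithm. It assumes add-or-remove-one-record neighboring datasets (so each node's count table has L1 sensitivity 1), and the function names and the even budget split are illustrative assumptions.

```python
import itertools
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_cpt(records, child, parents, domain, epsilon, rng):
    """Noisy estimate of P(child | parents) using budget epsilon for this node."""
    counts = {}
    for r in records:
        key = (tuple(r[p] for p in parents), r[child])
        counts[key] = counts.get(key, 0) + 1
    cpt = {}
    # Noise every cell of the public domain, including unobserved ones,
    # so the set of released cells does not depend on the data.
    for cfg in itertools.product(*(domain[p] for p in parents)):
        noisy = [max(counts.get((cfg, v), 0) + laplace_noise(1.0 / epsilon, rng), 0.0)
                 for v in domain[child]]
        total = sum(noisy)
        for v, c in zip(domain[child], noisy):
            # Fall back to a uniform distribution if all noisy counts were clipped to 0.
            cpt[(cfg, v)] = c / total if total > 0 else 1.0 / len(domain[child])
    return cpt

def even_budget_split(total_epsilon, num_nodes):
    """Data-independent allocation: every node gets the same share."""
    return [total_epsilon / num_nodes] * num_nodes
```

By sequential composition, running `private_cpt` once per node with the shares from `even_budget_split` spends `total_epsilon` overall. A data-dependent scheme would instead give different nodes different shares.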