Imtiaz, Hafiz
Hybrid Deepfake Image Detection: A Comprehensive Dataset-Driven Approach Integrating Convolutional and Attention Mechanisms with Frequency Domain Features
Anan, Kafi, Bhattacharjee, Anindya, Intesher, Ashir, Islam, Kaidul, Fuad, Abrar Assaeem, Saha, Utsab, Imtiaz, Hafiz
Effective deepfake detection tools have become increasingly essential in recent years due to the growing use of deepfakes in unethical practices. The diverse range of deepfake generation techniques makes it challenging to develop an accurate, universal detection mechanism. The 2025 Signal Processing Cup (DFWild-Cup competition) provided a diverse dataset of deepfake images, generated by multiple deepfake image generators, for training machine learning models, with an emphasis on the generalization of deepfake detection. To this end, we propose an ensemble-based approach that employs three different neural network architectures: a ResNet-34-based architecture, a data-efficient image transformer (DeiT), and an XceptionNet with Wavelet Transform, to capture both local and global features of deepfakes. We visualize the specific regions that these models focus on for classification using Grad-CAM, and empirically demonstrate the effectiveness of these models in grouping real and fake images into cohesive clusters using t-SNE plots. Individually, the ResNet-34 architecture achieves 88.9% accuracy, whereas the Xception network and the DeiT architecture achieve 87.76% and 89.32% accuracy, respectively. With these networks, our weighted ensemble model achieves 93.23% accuracy on the validation dataset of the SP Cup 2025 competition. Finally, the confusion matrix and an area under the ROC curve of 97.44% further confirm the stability of our proposed method.
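For illustration, a minimal sketch of the weighted soft-voting step described above; the model list, the weights, and the real/fake class convention are assumptions for the example, not the paper's exact configuration:

```python
import torch
import torch.nn.functional as F

def ensemble_predict(models, weights, x):
    """Weighted average of per-model softmax probabilities (binary: real vs. fake)."""
    probs = torch.zeros(x.size(0), 2, device=x.device)
    with torch.no_grad():
        for model, w in zip(models, weights):
            model.eval()
            probs += w * F.softmax(model(x), dim=1)  # model(x): logits of shape (B, 2)
    return probs.argmax(dim=1)  # 0 = real, 1 = fake (a convention assumed here)

# Hypothetical usage with three trained backbones and illustrative weights:
# preds = ensemble_predict([resnet34, deit, xception_wavelet], [0.3, 0.4, 0.3], batch)
```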
DP-CDA: An Algorithm for Enhanced Privacy Preservation in Dataset Synthesis Through Randomized Mixing
Saha, Utsab, Tonoy, Tanvir Muntakim, Imtiaz, Hafiz
In recent years, the growth of data across various sectors, including healthcare, security, finance, and education, has created significant opportunities for analysis and informed decision-making. However, these datasets often contain sensitive personal information, which raises serious privacy concerns. Protecting individual privacy is crucial, yet many existing machine learning and data publishing algorithms struggle with high-dimensional data, facing challenges related to computational efficiency and privacy preservation. To address these challenges, we introduce an effective data publishing algorithm, \emph{DP-CDA}. Our proposed algorithm generates synthetic datasets by randomly mixing data in a class-specific manner and injecting carefully tuned randomness to ensure formal privacy guarantees. Our comprehensive privacy accounting shows that DP-CDA provides a stronger privacy guarantee than existing methods, allowing for better utility while maintaining a strict level of privacy. To evaluate the effectiveness of DP-CDA, we examine the accuracy of predictive models trained on the synthetic data, which serves as a measure of dataset utility. Importantly, we identify an optimal order of mixing that balances the privacy guarantee with predictive accuracy. Our results indicate that synthetic datasets produced using DP-CDA can achieve superior utility compared to those generated by traditional data publishing algorithms, even when subject to the same privacy requirements.
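A hedged sketch of the class-specific random-mixing idea: the mixing order k, the noise scale sigma, and the Gaussian noise model below are illustrative placeholders, since the paper's privacy accounting dictates the actual calibration:

```python
import numpy as np

def dp_cda_like_mix(X, y, k=4, sigma=0.1, n_out=1000, rng=None):
    """Generate synthetic samples by averaging k same-class records plus noise."""
    rng = rng or np.random.default_rng(0)
    X_syn, y_syn = [], []
    for _ in range(n_out):
        c = rng.choice(np.unique(y))                          # pick a class
        idx = rng.choice(np.where(y == c)[0], size=k, replace=False)
        mixed = X[idx].mean(axis=0)                           # class-specific mixing
        X_syn.append(mixed + rng.normal(0.0, sigma, size=mixed.shape))
        y_syn.append(c)
    return np.array(X_syn), np.array(y_syn)
```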
Contextual Checkerboard Denoise -- A Novel Neural Network-Based Approach for Classification-Aware OCT Image Denoising
Islam, Md. Touhidul, Chowdhury, Md. Abtahi M., Salekin, Sumaiya, Maung, Aye T., Taki, Akil A., Imtiaz, Hafiz
In contrast to non-medical image denoising, where enhancing image clarity is the primary goal, medical image denoising requires the preservation of crucial features without the introduction of new artifacts. However, many denoising methods that improve image clarity inadvertently alter critical information in the denoised images, potentially compromising classification performance and diagnostic quality. Additionally, supervised denoising methods are not very practical in the medical imaging domain, since a \emph{ground truth} denoised version of a noisy medical image is often extremely difficult to obtain. In this paper, we tackle both of these problems by introducing a novel neural network-based method -- \emph{Contextual Checkerboard Denoising} -- that can learn denoising from a dataset of noisy images alone, while preserving crucial anatomical details necessary for image classification and analysis. We evaluate our method on real Optical Coherence Tomography (OCT) images, and empirically demonstrate that it significantly improves image quality, providing clearer and more detailed OCT images while enhancing diagnostic accuracy.
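One plausible reading of the checkerboard idea, sketched below: hide one checkerboard phase of pixels and train the network to predict them from the surrounding context, so only noisy images are needed. The paper's actual architecture, loss weighting, and classification-aware term are not reproduced here; everything in this sketch is an assumption for illustration:

```python
import torch

def checkerboard_mask(h, w, phase=0):
    """Binary mask that is 1 on one checkerboard phase of an h-by-w grid."""
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    return ((yy + xx) % 2 == phase).float()

def masked_denoise_loss(model, noisy):
    """Self-supervised loss on noisy images of shape (B, 1, H, W)."""
    mask = checkerboard_mask(*noisy.shape[-2:]).to(noisy.device)
    inp = noisy * (1 - mask)                       # hide one checkerboard phase
    pred = model(inp)
    # supervise only the hidden pixels, using the noisy values as targets
    return (((pred - noisy) * mask) ** 2).sum() / mask.sum()
```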
Privacy-Preserving Matrix Factorization for Recommendation Systems using Gaussian Mechanism
Mugdho, Sohan Salahuddin, Imtiaz, Hafiz
Building a recommendation system involves analyzing user data, which can potentially leak sensitive information about users. Anonymizing user data is often not sufficient for preserving user privacy. Motivated by this, we propose a privacy-preserving recommendation system based on the differential privacy framework and matrix factorization, one of the most popular algorithms for recommendation systems. Since differential privacy is a powerful and robust mathematical framework for designing privacy-preserving machine learning algorithms, it can prevent adversaries from extracting sensitive user information even if they possess the users' publicly available (auxiliary) information. We implement differential privacy via the Gaussian mechanism in the form of output perturbation, and release user profiles that satisfy formal privacy definitions. We employ R\'enyi Differential Privacy for a tight characterization of the overall privacy loss. We perform extensive experiments on real data to demonstrate that our proposed algorithm can offer excellent utility for some parameter choices while guaranteeing strict privacy.
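A minimal sketch of output perturbation with the Gaussian mechanism, together with the standard closed-form Rényi DP bound for the Gaussian mechanism; the sensitivity bound and noise multiplier are placeholders, not the paper's calibrated values:

```python
import numpy as np

def gaussian_output_perturbation(V, l2_sens, sigma_mult, rng=None):
    """Release factor matrix V with Gaussian noise of std sigma_mult * l2_sens."""
    rng = rng or np.random.default_rng(0)
    return V + rng.normal(0.0, sigma_mult * l2_sens, size=V.shape)

def rdp_gaussian(alpha, sigma_mult):
    """RDP of the Gaussian mechanism at order alpha: epsilon(alpha) = alpha / (2 sigma^2)."""
    return alpha / (2.0 * sigma_mult ** 2)
```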
Human Activity Recognition from Wi-Fi CSI Data Using Principal Component-Based Wavelet CNN
Showmik, Ishtiaque Ahmed, Sanam, Tahsina Farah, Imtiaz, Hafiz
Human Activity Recognition (HAR) is an emerging technology with several applications in the surveillance, security, and healthcare sectors. Noninvasive HAR systems based on Wi-Fi Channel State Information (CSI) signals can be developed by leveraging the rapid growth of ubiquitous Wi-Fi technologies and the correlation between CSI dynamics and body motions. In this paper, we propose the Principal Component-based Wavelet Convolutional Neural Network (or PCWCNN) -- a novel approach that offers robustness and efficiency for practical real-time applications. Our proposed method incorporates two efficient preprocessing algorithms -- Principal Component Analysis (PCA) and the Discrete Wavelet Transform (DWT). We employ an adaptive activity segmentation algorithm that is accurate and computationally light. Additionally, we use the Wavelet CNN for classification, a deep convolutional network analogous to the well-studied ResNet and DenseNet networks. We empirically show that our proposed PCWCNN model performs very well on a real dataset, outperforming existing approaches.
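A small sketch of what a PCA-plus-DWT preprocessing stage could look like on a CSI amplitude window; the component count, wavelet choice ('db4'), and decomposition level are assumptions for the example, not the paper's settings:

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def preprocess_csi(csi_window, n_components=5, wavelet="db4", level=3):
    """csi_window: (time, subcarriers) amplitude matrix -> (n_components, coeffs) features."""
    pcs = PCA(n_components=n_components).fit_transform(csi_window)
    # DWT each principal-component stream; keep approximation and detail coefficients
    feats = [np.concatenate(pywt.wavedec(pcs[:, i], wavelet, level=level))
             for i in range(n_components)]
    return np.stack(feats)  # feature map fed to the CNN classifier
```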
Privacy-preserving Non-negative Matrix Factorization with Outliers
Saha, Swapnil, Imtiaz, Hafiz
Non-negative matrix factorization is a popular unsupervised machine learning algorithm for extracting meaningful features from data which are inherently non-negative. However, such data sets may often contain privacy-sensitive user data, and therefore, we may need to take necessary steps to ensure the privacy of the users while analyzing the data. In this work, we focus on developing a Non-negative matrix factorization algorithm in the privacy-preserving framework. More specifically, we propose a novel privacy-preserving algorithm for non-negative matrix factorisation capable of operating on private data, while achieving results comparable to those of the non-private algorithm. We design the framework such that one has the control to select the degree of privacy grantee based on the utility gap. We show our proposed framework's performance in six real data sets. The experimental results show that our proposed method can achieve very close performance with the non-private algorithm under some parameter regime, while ensuring strict privacy.
Improved Differentially Private Decentralized Source Separation for fMRI Data
Imtiaz, Hafiz, Mohammadi, Jafar, Silva, Rogers, Baker, Bradley, Plis, Sergey M., Sarwate, Anand D., Calhoun, Vince
Blind source separation algorithms such as independent component analysis (ICA) are widely used in the analysis of neuroimaging data. In order to leverage larger sample sizes, different data holders/sites may wish to collaboratively learn feature representations. However, such datasets are often privacy-sensitive, precluding centralized analyses that pool the data at a single site. A recently proposed algorithm uses message-passing between sites and a central aggregator to perform a decentralized joint ICA (djICA) without sharing the data. However, this method does not satisfy formal privacy guarantees. We propose a differentially private algorithm for performing ICA in a decentralized data setting. Differential privacy provides a formal and mathematically rigorous privacy guarantee, which we achieve by introducing noise into the messages. Conventional approaches to decentralized differentially private algorithms may require too much noise due to the typically small sample sizes at each site. We leverage a recently proposed correlated noise protocol to remedy the excessive noise problem of the conventional schemes. We investigate the performance of the proposed algorithm on synthetic and real fMRI datasets to show that our algorithm outperforms existing approaches and can sometimes reach the same level of utility as the corresponding non-private algorithm. This indicates that it is possible to have meaningful utility while preserving privacy.
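A minimal sketch of the correlated-noise idea: per-site noise terms are drawn so that they sum to zero across sites and cancel at the aggregator, leaving only a small residual noise. The secure generation of the correlated terms and the variance calibration are omitted; all scales below are illustrative:

```python
import numpy as np

def correlated_site_noise(n_sites, shape, sigma_corr, rng):
    """Draw per-site noise terms that sum to zero across sites."""
    e = rng.normal(0.0, sigma_corr, size=(n_sites,) + shape)
    return e - e.mean(axis=0)  # zero-sum: cancels exactly when averaged

rng = np.random.default_rng(0)
n_sites, shape = 5, (10,)
local_stats = rng.normal(size=(n_sites,) + shape)   # each site's local statistic
e = correlated_site_noise(n_sites, shape, sigma_corr=1.0, rng=rng)
g = rng.normal(0.0, 0.1, size=(n_sites,) + shape)   # small independent site noise
aggregate = (local_stats + e + g).mean(axis=0)      # only the small noise remains
```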
Distributed Differentially Private Computation of Functions with Correlated Noise
Imtiaz, Hafiz, Mohammadi, Jafar, Sarwate, Anand D.
Many applications of machine learning, such as human health research, involve processing private or sensitive information. Privacy concerns may impose significant hurdles to collaboration in scenarios where there are multiple sites holding data and the goal is to estimate properties jointly across all datasets. Differentially private decentralized algorithms can provide strong privacy guarantees. However, the accuracy of the joint estimates may be poor when the datasets at each site are small. This paper proposes a new framework, Correlation Assisted Private Estimation (CAPE), for designing privacy-preserving decentralized algorithms with better accuracy guarantees in an honest-but-curious model. CAPE can be used in conjunction with the functional mechanism for statistical and machine learning optimization problems. We provide a tighter characterization of the functional mechanism, which allows CAPE to achieve the same performance in the decentralized setting as a centralized algorithm operating on all of the datasets. Empirical results on regression and neural network problems for both synthetic and real datasets show that differentially private methods can be competitive with non-private algorithms in many scenarios of interest.
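To make the functional-mechanism idea concrete, a sketch for least squares: the quadratic loss is determined by the statistics X^T X and X^T y, so perturbing those coefficients privatizes the entire objective. The noise scale and ridge term below are placeholders, not the paper's tighter characterization:

```python
import numpy as np

def functional_mechanism_ridge(X, y, sigma, lam=1e-3, rng=None):
    """Minimize a noisy quadratic objective built from perturbed loss coefficients."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    A = X.T @ X + rng.normal(0.0, sigma, size=(d, d))
    A = (A + A.T) / 2                                   # keep the quadratic term symmetric
    b = X.T @ y + rng.normal(0.0, sigma, size=d)
    return np.linalg.solve(A + lam * np.eye(d), b)      # minimizer of the noisy loss
```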
Distributed Differentially-Private Algorithms for Matrix and Tensor Factorization
Imtiaz, Hafiz, Sarwate, Anand D.
In many signal processing and machine learning applications, datasets containing private information are held at different locations, requiring the development of distributed privacy-preserving algorithms. Tensor and matrix factorizations are key components of many processing pipelines. In the distributed setting, differentially private algorithms suffer a loss in utility because of the noise they must introduce to guarantee privacy. This paper designs new and improved distributed and differentially private algorithms for two popular matrix and tensor factorization methods: principal component analysis (PCA) and orthogonal tensor decomposition (OTD). The new algorithms employ a correlated noise design scheme to alleviate the effects of noise and can achieve the same noise level as the centralized scenario. Experiments on synthetic and real data illustrate the regimes in which the correlated noise allows performance matching with the centralized setting, outperforming previous methods and demonstrating that meaningful utility is possible while guaranteeing differential privacy.
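A single-site sketch of differentially private PCA via a symmetrized Gaussian perturbation of the second-moment matrix; in the paper's distributed setting this noise would instead be correlated across sites, and sigma here is an illustrative placeholder:

```python
import numpy as np

def dp_pca(X, k, sigma, rng=None):
    """Top-k principal directions from a noisy sample second-moment matrix."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    C = X.T @ X / X.shape[0]
    E = rng.normal(0.0, sigma, size=(d, d))
    C_noisy = C + (E + E.T) / 2                  # symmetric noise keeps eigenvalues real
    vals, vecs = np.linalg.eigh(C_noisy)
    return vecs[:, np.argsort(vals)[-k:]]        # columns: top-k principal directions
```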