privacy preference
Dynamic User-controllable Privacy-preserving Few-shot Sensing Framework
Chathoth, Ajesh Koyatan, Yu, Shuhao, Lee, Stephen
University of Pittsburgh, Pittsburgh, PA, USA

User-controllable privacy is important in modern sensing systems, as privacy preferences can vary significantly from person to person and may evolve over time. This is especially relevant in devices equipped with Inertial Measurement Unit (IMU) sensors, such as smartphones and wearables, which continuously collect rich time-series data that can inadvertently expose sensitive user behaviors. While prior work has proposed privacy-preserving methods for sensor data, most rely on static, predefined privacy labels or require large quantities of private training data, limiting their adaptability and user agency. In this work, we introduce PrivCLIP, a dynamic, user-controllable, few-shot privacy-preserving sensing framework. PrivCLIP allows users to specify and modify their privacy preferences by categorizing activities as sensitive (blacklisted), non-sensitive (whitelisted), or neutral (graylisted). Leveraging a multimodal contrastive learning approach, PrivCLIP aligns IMU sensor data with natural language activity descriptions in a shared embedding space, enabling few-shot detection of sensitive activities. When a privacy-sensitive activity is identified, the system uses a language-guided activity sanitizer and a motion generation module (IMU-GPT) to transform the original data into a privacy-compliant version that semantically resembles a non-sensitive activity. We evaluate PrivCLIP on multiple human activity recognition datasets and demonstrate that it significantly outperforms baseline methods in terms of both privacy protection and data utility.

A growing number of smart devices, including wearables and smartphones, are equipped with sensors that enable applications in health monitoring, fitness tracking, and human activity recognition (HAR). Among these, inertial measurement units (IMUs) are particularly useful, as they capture fine-grained motion data that can be used to infer user behavior, physical condition, and mobility patterns. Typically, this sensor data is collected and transmitted to third-party cloud services for large-scale sensing and analytics. In many applications, online data transmission is desirable: it facilitates data sharing with peers and enhances user engagement through timely feedback and positive reinforcement, which can be critical for sustained participation. However, outsourcing data processing to third-party providers raises significant privacy concerns.
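The few-shot detection step can be pictured with a small sketch. Everything below is illustrative, not the paper's implementation: the two encoders stand in for PrivCLIP's contrastively trained IMU and text encoders (faked here with random projections), and the preference lists are hypothetical examples.

```python
import numpy as np

EMBED_DIM = 64
WINDOW_SHAPE = (128, 6)  # 128 samples x 6 accel/gyro channels

# Fixed random projection as a stand-in for a trained IMU encoder.
PROJ = np.random.default_rng(42).standard_normal((128 * 6, EMBED_DIM))

def encode_imu(window: np.ndarray) -> np.ndarray:
    """Map an IMU window to a unit vector in the shared space (placeholder)."""
    v = window.ravel() @ PROJ
    return v / np.linalg.norm(v)

def encode_text(description: str) -> np.ndarray:
    """Map an activity description to a unit vector (deterministic fake)."""
    local = np.random.default_rng(sum(description.encode()))
    v = local.standard_normal(EMBED_DIM)
    return v / np.linalg.norm(v)

# User-controllable preference lists, expressed in natural language.
blacklist = ["smoking", "drinking"]   # sensitive: route to the sanitizer
whitelist = ["walking", "sitting"]    # safe to share as-is
graylist = ["climbing stairs"]        # neutral

def classify(window: np.ndarray) -> str:
    """Few-shot classification by nearest text embedding (cosine similarity)."""
    z = encode_imu(window)
    labels = blacklist + whitelist + graylist
    sims = [float(z @ encode_text(t)) for t in labels]
    best = labels[int(np.argmax(sims))]
    status = "sensitive" if best in blacklist else "shareable"
    return f"{status}:{best}"

print(classify(np.random.default_rng(0).standard_normal(WINDOW_SHAPE)))
```

Because the activity lists are plain text, a user can move an activity between lists at runtime without retraining, which is what makes the preferences dynamic.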
Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences
Ramírez, Guillem, Birch, Alexandra, Titov, Ivan
Large language models (LLMs) are primarily accessed via commercial APIs, but this often requires users to expose their data to service providers. In this paper, we explore how users can stay in control of their data by using privacy profiles: simple natural language instructions that say what should and should not be revealed. We build a framework where a local model uses these instructions to rewrite queries, hiding only the details the user deems sensitive, before sending the rewritten queries to an external model, thus balancing privacy with performance. To support this research, we introduce PEEP, a multilingual dataset of real user queries annotated to mark private content and paired with synthetic privacy profiles. Experiments with lightweight local LLMs show that, after fine-tuning, they not only achieve markedly better privacy preservation but also match or exceed the performance of much larger zero-shot models. At the same time, the system still faces challenges in fully adhering to user instructions, underscoring the need for models with a better understanding of user-defined privacy preferences.
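A minimal sketch of the rewrite-then-forward loop described above. The `local_rewrite` function stands in for the fine-tuned local model; its hard-coded substitutions are hypothetical stand-ins for what the model would infer from the profile, and `ask_external_model` is a stub for the commercial API call.

```python
import re

# A toy privacy profile: plain-language rules about what must stay local.
PRIVACY_PROFILE = """\
Never reveal my employer or my home city.
Medical details may be shared if no name is attached."""

def local_rewrite(query: str, profile: str) -> str:
    """Stand-in for the local LLM: redact only what the profile marks
    sensitive. The real system prompts a small local model with the
    profile and query; here we fake that with fixed substitutions."""
    redactions = {
        r"\bAcme Corp\b": "[EMPLOYER]",
        r"\bEdinburgh\b": "[CITY]",
    }
    for pattern, placeholder in redactions.items():
        query = re.sub(pattern, placeholder, query)
    return query

def ask_external_model(query: str) -> str:
    """Placeholder for the external API; it only ever sees redacted text."""
    return f"(external model answers: {query!r})"

user_query = "I work at Acme Corp in Edinburgh; draft a sick-leave email."
safe_query = local_rewrite(user_query, PRIVACY_PROFILE)
print(ask_external_model(safe_query))
```

The design point is that the redaction decision never leaves the device: only the rewritten query crosses the trust boundary.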
FedRE: Robust and Effective Federated Learning with Privacy Preference
Xiao, Tianzhe, Li, Yichen, Zhou, Yu, Qi, Yining, Liu, Yi, Wang, Wei, Wang, Haozhao, Wang, Yi, Li, Ruixuan
Despite Federated Learning (FL) employing gradient aggregation at the server for distributed training to prevent the privacy leakage of raw data, private information can still be divulged through analysis of the gradients uploaded by clients. Substantial efforts have been made to integrate local differential privacy (LDP) into such systems to achieve a strict privacy guarantee. However, existing methods overlook a practical issue: they perturb every sample with the same mechanism, even though each client may have their own privacy preferences over privacy-sensitive information (PSI), which is not uniformly distributed across the raw data. In such cases, excessive protection of privacy-insensitive information introduces unnecessary noise, which may degrade model performance. In this work, we study the PSI within data and develop FedRE, which can simultaneously achieve robustness and effectiveness under LDP protection. More specifically, we first define PSI with regard to the privacy preferences of each client. Then, we optimize the LDP mechanism by allocating less privacy budget to gradients with higher PSI in a layer-wise manner, thus providing a stricter privacy guarantee for PSI. Furthermore, to mitigate the performance degradation caused by LDP, we design a parameter aggregation mechanism based on the distribution of the perturbed information. We conducted experiments on text tamper detection with the T-SROIE and DocTamper datasets, and FedRE achieves competitive performance compared to state-of-the-art methods.
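The budget-allocation idea can be sketched as follows. The inverse-PSI allocation rule and the Laplace mechanism below are illustrative choices under our reading of the abstract, not FedRE's exact design.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_layerwise(grads, psi_scores, total_eps=4.0, clip=1.0):
    """Layer-wise LDP sketch: layers whose gradients carry more
    privacy-sensitive information (higher PSI) receive a smaller slice
    of the total budget, i.e. more Laplace noise, hence stricter
    protection. Each layer is clipped in L1 norm, so Laplace noise with
    scale 2*clip/eps gives eps-LDP for that layer; the per-layer budgets
    add up across layers by simple composition."""
    psi = np.asarray(psi_scores, dtype=float)
    weights = 1.0 / (psi + 1e-8)
    eps = total_eps * weights / weights.sum()   # high PSI -> small eps
    noisy = []
    for g, e in zip(grads, eps):
        g = g * min(1.0, clip / (np.abs(g).sum() + 1e-12))  # L1 clipping
        noisy.append(g + rng.laplace(0.0, 2.0 * clip / e, size=g.shape))
    return noisy, eps

# Three layers; the last is most privacy-sensitive and gets the least budget.
grads = [rng.standard_normal(4) for _ in range(3)]
_, eps = perturb_layerwise(grads, psi_scores=[0.1, 0.3, 0.9])
print(np.round(eps, 3))   # e.g. [2.769 0.923 0.308]
```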
SpinML: Customized Synthetic Data Generation for Private Training of Specialized ML Models
Zhang, Jiang, Sequeira, Rohan Xavier, Psounis, Konstantinos
Specialized machine learning (ML) models tailored to users' needs and requests are increasingly being deployed on smart devices with cameras to provide personalized intelligent services that take advantage of camera data. However, two primary challenges hinder the training of such models: the lack of publicly available labeled data suitable for specialized tasks and the inaccessibility of labeled private data due to concerns about user privacy. To address these challenges, we propose a novel system, SpinML, where the server generates customized Synthetic image data to Privately traIN a specialized ML model tailored to the user request, using only a few sanitized reference images from the user. SpinML offers users fine-grained, object-level control over the reference images, which allows them to trade off the privacy and utility of the generated synthetic data according to their privacy preferences. Through experiments on three specialized model training tasks, we demonstrate that our proposed system can enhance the performance of specialized models without compromising users' privacy preferences.
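A sketch of the object-level control, under our reading of the abstract: each reference object carries a user-chosen sanitization level that shapes what a server-side generator is allowed to see. The class names, levels, and prompt construction below are hypothetical, and the generator itself is out of scope here.

```python
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str      # the real object, e.g. "my daughter's face"
    generic: str   # privacy-safe substitute, e.g. "a child"
    level: str     # "keep" | "generalize" | "remove"

def sanitized_prompt(objects: list[SceneObject]) -> str:
    """Build a text description of the scene honoring per-object privacy
    levels: 'keep' trades privacy for utility, 'remove' the opposite,
    'generalize' sits in between. Only this sanitized description would
    leave the device."""
    parts = []
    for obj in objects:
        if obj.level == "keep":
            parts.append(obj.name)
        elif obj.level == "generalize":
            parts.append(obj.generic)
        # "remove": contributes nothing to the prompt
    return "a photo of " + ", ".join(parts)

scene = [
    SceneObject("a golden retriever on a sofa", "a dog indoors", "keep"),
    SceneObject("my daughter's face", "a child", "generalize"),
    SceneObject("a letter showing our home address", "", "remove"),
]
print(sanitized_prompt(scene))
# -> "a photo of a golden retriever on a sofa, a child"
```

Server-side, a description like this would condition a text-to-image generator to synthesize labeled training data for the specialized task.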
Can Humans Oversee Agents to Prevent Privacy Leakage? A Study on Privacy Awareness, Preferences, and Trust in Language Model Agents
Zhang, Zhiping, Guo, Bingcan, Li, Tianshi
Language model (LM) agents that act on users' behalf for personal tasks can boost productivity, but they are also susceptible to unintended privacy leakage risks. We present the first study of people's capacity to oversee the privacy implications of LM agents. Through a task-based survey (N=300), we investigate how people react to and assess responses generated by LM agents for asynchronous interpersonal communication tasks, compared with responses they wrote themselves. We found that people may favor an agent response with more privacy leakage over the response they drafted, or consider both good, increasing harmful disclosures from 15.7% to 55.0%. We further uncovered distinct patterns of privacy behaviors, attitudes, and preferences, as well as nuanced interactions between privacy considerations and other factors. Our findings shed light on designing agentic systems that enable privacy-preserving interactions and achieve bidirectional alignment on privacy preferences to help users calibrate trust.
Locally Private Estimation with Public Features
Ma, Yuheng, Jia, Ke, Yang, Hanfang
We initiate the study of locally differentially private (LDP) learning with public features. We define semi-feature LDP, where some features are publicly available while the remaining ones, along with the label, require protection under local differential privacy. Under semi-feature LDP, we demonstrate that the minimax convergence rate for non-parametric regression is significantly reduced compared to that of classical LDP. Then we propose HistOfTree, an estimator that fully leverages the information contained in both public and private features. Theoretically, HistOfTree attains the minimax optimal convergence rate. Empirically, it achieves superior performance on both synthetic and real data. We also explore scenarios where users have the flexibility to manually select which features to protect. In such cases, we propose an estimator and a data-driven parameter tuning strategy, leading to analogous theoretical and empirical results.
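The core idea can be illustrated with a single-histogram simplification (HistOfTree itself grows trees over the public features; this toy keeps one public 1-D feature and protects only the label):

```python
import numpy as np

rng = np.random.default_rng(0)

def semi_feature_ldp_regression(x_pub, y, eps=1.0, bins=10, y_range=(-1.5, 1.5)):
    """Toy semi-feature LDP estimator: the public feature is used in the
    clear to form histogram cells, while each user's label is released
    only through an eps-LDP Laplace perturbation (clipped to y_range, so
    the sensitivity is the range width). Cell-wise averaging then cancels
    much of the injected noise."""
    lo, hi = y_range
    y_priv = np.clip(y, lo, hi) + rng.laplace(0.0, (hi - lo) / eps, size=y.shape)
    edges = np.linspace(x_pub.min(), x_pub.max(), bins + 1)
    cell = np.clip(np.digitize(x_pub, edges) - 1, 0, bins - 1)
    est = np.array([y_priv[cell == b].mean() if np.any(cell == b) else 0.0
                    for b in range(bins)])
    return edges, est

# Example: y = sin(2*pi*x) + noise; averaging recovers the signal shape.
x = rng.uniform(0, 1, 5000)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, x.shape)
edges, est = semi_feature_ldp_regression(x, y, eps=2.0)
print(np.round(est, 2))
```

Partitioning on public features is what buys the improved rate: no privacy budget is spent localizing a point, only on its response.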
Cross-silo Federated Learning with Record-level Personalized Differential Privacy
Liu, Junxu, Lou, Jian, Xiong, Li, Liu, Jinfei, Meng, Xiaofeng
Federated learning (FL) enhanced by differential privacy has emerged as a popular approach to better safeguard the privacy of client-side data by protecting clients' contributions during the training process. Existing solutions typically assume a uniform privacy budget for all records and provide one-size-fits-all solutions that may not be adequate to meet each record's privacy requirement. In this paper, we explore the uncharted territory of cross-silo FL with record-level personalized differential privacy. We devise a novel framework named rPDP-FL, employing a two-stage hybrid sampling scheme with both client-level sampling and non-uniform record-level sampling to accommodate varying privacy requirements. A critical and non-trivial problem is to select the ideal per-record sampling probability q given the personalized privacy budget ε. We introduce a versatile solution named Simulation-CurveFitting, which allows us to uncover a significant insight into the nonlinear correlation between q and ε and to derive an elegant mathematical model to tackle the problem. Our evaluation demonstrates that our solution can provide significant performance gains over baselines that do not consider personalized privacy preservation.
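A sketch of the two-stage hybrid sampling. The monotone map below is only a placeholder for the Simulation-CurveFitting step that the paper uses to derive q from ε; the assumption we rely on is just that a larger personal budget tolerates a larger sampling probability.

```python
import numpy as np

rng = np.random.default_rng(0)

def q_from_eps(eps, q_max=0.5):
    """Placeholder for Simulation-CurveFitting: any monotone map from a
    record's privacy budget eps to its sampling probability q."""
    return q_max * (1.0 - np.exp(-np.asarray(eps, dtype=float)))

def sample_round(client_eps, client_rate=0.3):
    """Two-stage hybrid sampling as in rPDP-FL: uniform client-level
    sampling first, then non-uniform record-level Poisson sampling with
    a per-record probability matched to that record's budget."""
    selected = {}
    for cid, eps_list in client_eps.items():
        if rng.random() > client_rate:           # stage 1: client sampling
            continue
        q = q_from_eps(eps_list)                 # stage 2: per-record q
        mask = rng.random(len(q)) < q            # Poisson record sampling
        selected[cid] = np.flatnonzero(mask)
    return selected

# Per-record budgets: client 2's first record (eps=0.1) is rarely sampled.
clients = {0: [0.5, 0.5, 4.0], 1: [1.0] * 5, 2: [0.1, 8.0]}
print(sample_round(clients, client_rate=1.0))
```

Records with tight budgets thus contribute to fewer rounds, which is what converts a single training pipeline into per-record privacy guarantees.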
User Consented Federated Recommender System Against Personalized Attribute Inference Attack
Recommender systems can be privacy-sensitive. To protect users' private historical interactions, federated learning has been proposed as a way to learn user representations in a distributed manner. Using federated recommender (FedRec) systems, users can train a shared recommendation model on local devices, preventing raw data transmission and collection. However, the recommendation model learned by a common FedRec may still be vulnerable to private information leakage, particularly attribute inference attacks, in which an attacker can easily infer users' personal attributes from the learned model. Additionally, traditional FedRecs seldom consider the diverse privacy preferences of users, leading to difficulties in balancing recommendation utility and privacy preservation. Consequently, FedRecs may simultaneously suffer unnecessary recommendation performance loss from over-protection and private information leakage. In this work, we propose a novel user-consented federated recommendation system (UC-FedRec) that flexibly satisfies users' different privacy needs at a minimal cost in recommendation accuracy. UC-FedRec allows users to self-define their privacy preferences to meet various demands and makes recommendations with user consent. Experiments conducted on different real-world datasets demonstrate that our framework is more efficient and flexible than baselines.
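A minimal sketch of the consent mechanism as we read it from the abstract (not the authors' code): an adversarial "privacy price" is paid only for the attributes a user actually chooses to protect, so users who protect nothing keep full recommendation utility. Attribute names and the loss weighting are hypothetical.

```python
ALL_ATTRIBUTES = ("gender", "age", "occupation")

def local_loss(rec_loss: float, attacker_losses: dict[str, float],
               protected: set[str], lam: float = 0.5) -> float:
    """Combine the recommendation loss with attribute-eraser terms.

    attacker_losses[a] is the loss of an adversary predicting attribute a
    from the user representation; subtracting it (i.e. maximizing it)
    pushes that attribute's signal out of the representation. Only
    user-consented attributes contribute to the penalty."""
    privacy_term = sum(attacker_losses[a] for a in protected)
    return rec_loss - lam * privacy_term

# User A hides gender and age; user B consents to share everything.
losses = {"gender": 0.69, "age": 1.10, "occupation": 0.45}
print(local_loss(0.90, losses, protected={"gender", "age"}))  # stricter
print(local_loss(0.90, losses, protected=set()))              # utility-first
```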