TOHAN: A One-step Approach towards Few-shot Hypothesis Adaptation
In few-shot domain adaptation (FDA), classifiers for the target domain (TD) are trained with \emph{accessible} labeled data in the source domain (SD) and few labeled data in the TD. However, data often contain private information in the current era, e.g., data distributed on personal phones. Thus, private data will be leaked if we directly access data in the SD to train a target-domain classifier, as FDA methods require. In this paper, to prevent privacy leakage in the SD, we consider a very challenging problem setting, where the classifier for the TD has to be trained using few labeled target data and a well-trained SD classifier, named few-shot hypothesis adaptation (FHA). In FHA, we cannot access data in the SD, so the private information in the SD is well protected. To this end, we propose a target-oriented hypothesis adaptation network (TOHAN) to solve the FHA problem, where we generate highly compatible unlabeled data (i.e., an intermediate domain) to help train a target-domain classifier. TOHAN maintains two deep networks simultaneously: one focuses on learning an intermediate domain, and the other takes care of the intermediate-to-target distributional adaptation and the target-risk minimization. Experimental results show that TOHAN significantly outperforms competitive baselines.
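As a rough illustration of the training procedure sketched in the abstract, here is a minimal PyTorch-style training step, assuming a generator G for the intermediate domain, a frozen source classifier h_s, and a target classifier h_t. The noise dimension, the confidence and moment-matching losses, and all interfaces below are our own assumptions, not the authors' exact design:

import torch
import torch.nn.functional as F

def fha_step(G, h_s, h_t, x_t, y_t, opt_g, opt_t, lam=1.0):
    """One FHA training step. G generates intermediate-domain data,
    h_s is the well-trained (frozen) source classifier, h_t is the
    target classifier, and (x_t, y_t) are the few labeled target data."""
    # (1) Learn an intermediate domain: generated data should be
    # confidently classified by h_s (low prediction entropy) while
    # staying close to the few-shot target data.
    z = torch.randn(x_t.size(0), 64)                 # 64-dim noise (assumed)
    x_i = G(z)
    log_p = F.log_softmax(h_s(x_i), dim=1)
    loss_conf = -(log_p.exp() * log_p).sum(dim=1).mean()   # prediction entropy
    loss_close = (x_i.mean(0) - x_t.mean(0)).pow(2).sum()  # crude moment match
    opt_g.zero_grad()
    (loss_conf + lam * loss_close).backward()
    opt_g.step()

    # (2) Intermediate-to-target adaptation plus target-risk minimization:
    # pseudo-label the generated data with h_s, and also fit the few
    # labeled target examples directly.
    with torch.no_grad():
        x_i = G(z)
        pseudo = h_s(x_i).argmax(dim=1)
    loss_t = F.cross_entropy(h_t(x_i), pseudo) + F.cross_entropy(h_t(x_t), y_t)
    opt_t.zero_grad()
    loss_t.backward()
    opt_t.step()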
Partially Encrypted Machine Learning using Functional Encryption
We graciously thank the reviewers for their helpful comments; we clarify some details of the article below. In fact, this article shows that even though FE is not yet as mature as homomorphic encryption, it can already be used for partially encrypted machine learning. We do detail and reference many notions from cryptology: the ML community may not be familiar with these concepts, and we sought to introduce them carefully and rigorously. In return, classical notions of ML do not need to be referenced as much, because they are well established.
We thank the reviewers for their detailed comments and suggestions; we will address all of them in the revision. Prior work (AIStats'19) considered the problem of learning an optimal action but ignored contextual information. In this work, we incorporate the contextual information, which is readily available in many applications. The idea might look incremental. We will also discuss the open questions pointed out by the reviewers, such as lower bounds, private information, and real-valued feedback.
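As a hedged illustration of what incorporating contextual information can mean in this setting, below is a minimal LinUCB-style contextual bandit; the rebuttal does not specify the authors' algorithm, so the linear reward model, the per-action parameterization, and the exploration parameter alpha are our own assumptions:

import numpy as np

class LinUCB:
    """Minimal disjoint LinUCB: chooses actions using context features,
    unlike a context-free bandit that learns one value per action."""
    def __init__(self, n_actions, dim, alpha=1.0):
        self.alpha = alpha                                  # exploration strength
        self.A = [np.eye(dim) for _ in range(n_actions)]    # per-action Gram matrices
        self.b = [np.zeros(dim) for _ in range(n_actions)]  # per-action reward sums

    def act(self, x):
        """x: context feature vector; returns the UCB-maximizing action."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                               # ridge estimate
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, action, x, reward):
        self.A[action] += np.outer(x, x)
        self.b[action] += reward * x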
Instance-Adaptive Hypothesis Tests with Heterogeneous Agents
Shi, Flora C., Wainwright, Martin J., Bates, Stephen
We study hypothesis testing over a heterogeneous population of strategic agents with private information. Any single test applied uniformly across the population yields statistical error that is sub-optimal relative to the performance of an oracle given access to the private information. We show how it is possible to design menus of statistical contracts that pair type-optimal tests with payoff structures, inducing agents to self-select according to their private information. This separating menu elicits agent types and enables the principal to match the oracle performance even without a priori knowledge of the agent type. Our main result fully characterizes the collection of all separating menus that are instance-adaptive, matching oracle performance for an arbitrary population of heterogeneous agents. We identify designs where information elicitation is essentially costless, requiring negligible additional expense relative to a single-test benchmark, while improving statistical performance. Our work establishes a connection between proper scoring rules and menu design, showing how the structure of the hypothesis test constrains the elicitable information. Numerical examples illustrate the geometry of separating menus and the improvements they deliver in error trade-offs. Overall, our results connect statistical decision theory with mechanism design, demonstrating how heterogeneity and strategic participation can be harnessed to improve efficiency in hypothesis testing.
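The following toy computation, with our own assumed instance (a Gaussian mean-shift test, two agent types whose private priors on the alternative are 0.2 and 0.8, and a Brier-score payment; none of these numbers come from the paper), illustrates the scoring-rule/menu connection: each contract pairs the Bayes-optimal test for a candidate type with a proper-scoring-rule payment, so each type maximizes its expected payment by selecting its own contract, and the resulting risk matches the oracle rather than a single uniform test:

import numpy as np
from scipy.stats import norm

mu = 2.0                 # mean shift under H1 (assumed)
types = [0.2, 0.8]       # agents' private priors on H1 (assumed)

def bayes_threshold(p):
    """Type-optimal threshold for X ~ N(0,1) vs N(mu,1): reject for X > thr."""
    return mu / 2 + np.log((1 - p) / p) / mu

def bayes_risk(p, thr):
    """Expected 0-1 error when the agent's true prior on H1 is p."""
    fpr = 1 - norm.cdf(thr)          # P(reject | H0)
    fnr = norm.cdf(thr - mu)         # P(accept | H1)
    return (1 - p) * fpr + p * fnr

def brier(q, theta):
    """Quadratic (Brier) score, a proper scoring rule; higher is better."""
    return 1 - (theta - q) ** 2

def expected_pay(p_agent, p_contract):
    """Expected payment from the contract designed for prior p_contract."""
    return p_agent * brier(p_contract, 1) + (1 - p_agent) * brier(p_contract, 0)

for p in types:
    chosen = max(types, key=lambda c: expected_pay(p, c))
    print(f"type {p}: selects contract for {chosen} (self-selects: {chosen == p}); "
          f"own-test risk {bayes_risk(p, bayes_threshold(p)):.3f} vs "
          f"uniform-test risk {bayes_risk(p, bayes_threshold(0.5)):.3f}")

Because the Brier score is proper, a type-p agent's expected payment is maximized by the contract built for p, so the menu separates types without the principal knowing them in advance.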
Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences
Ramírez, Guillem, Birch, Alexandra, Titov, Ivan
Large language models (LLMs) are primarily accessed via commercial APIs, but this often requires users to expose their data to service providers. In this paper, we explore how users can stay in control of their data by using privacy profiles: simple natural language instructions that say what should and should not be revealed. We build a framework where a local model uses these instructions to rewrite queries, only hiding details deemed sensitive by the user, before sending them to an external model, thus balancing privacy with performance. To support this research, we introduce PEEP, a multilingual dataset of real user queries annotated to mark private content and paired with synthetic privacy profiles. Experiments with lightweight local LLMs show that, after fine-tuning, they not only achieve markedly better privacy preservation but also match or exceed the performance of much larger zero-shot models. At the same time, the system still faces challenges in fully adhering to user instructions, underscoring the need for models with a better understanding of user-defined privacy preferences.
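A minimal sketch of the rewrite-then-forward workflow described above; local_generate and remote_generate are hypothetical stand-ins for a small on-device model and a commercial API, and the prompt is illustrative rather than the paper's actual framework:

REWRITE_PROMPT = """You are a privacy filter. Rewrite the user's query so it
keeps its meaning but removes or generalizes anything the privacy profile
marks as sensitive. Privacy profile:
{profile}

Query:
{query}

Rewritten query:"""

def answer_with_privacy(query: str, profile: str,
                        local_generate, remote_generate) -> str:
    """Rewrite `query` locally according to the user's natural-language
    privacy profile, then send only the sanitized version to the
    external model."""
    sanitized = local_generate(REWRITE_PROMPT.format(profile=profile,
                                                     query=query))
    return remote_generate(sanitized)

# Example usage with a profile like the ones the paper pairs with queries:
# profile = "Never reveal my employer or my home city."
# query = "I work at Acme Corp in Vienna; draft an email to my landlord..."
# print(answer_with_privacy(query, profile, local_llm, api_llm))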