Aliakbarpour, Maryam
Better Private Distribution Testing by Leveraging Unverified Auxiliary Data
Aliakbarpour, Maryam, Burudgunte, Arnav, Canonne, Clément, Rubinfeld, Ronitt
Accurately analyzing data while preserving individual privacy is a fundamental challenge in statistical inference. Since its formulation nearly two decades ago, Differential Privacy (DP) [DMNS06] has emerged as the leading framework for privacy-preserving data analysis, providing strong mathematical privacy guarantees and gaining adoption by major entities such as the U.S. Census Bureau, Amazon [Ama24], Google [EPK14], Microsoft [DKY17], and Apple [Dif17; TVVKFSD17]. Unfortunately, DP guarantees often come at the cost of increased data requirements or computational resources, which has limited the widespread adoption of differential privacy in spite of its theoretical appeal. To address this issue, a recent line of work has investigated whether access to even small amounts of additional public data could help mitigate this loss of performance. Promising results for various tasks have been shown, both experimentally [KST20; LLHR24; BZHZK24; DORKSF24] and theoretically [BKS22; BBCKS23]. The use of additional auxiliary information is very enticing, as such access is available in many real-world applications: for example, hospitals handling sensitive patient data might leverage public datasets, records from different periods or locations, or synthetic data generated by machine learning models to improve analysis. Similarly, medical or socio-economic studies focusing on a minority or protected group can leverage statistical data from the overall population. However, integrating public data introduces its own challenges, as it often lacks guarantees regarding its accuracy or relevance to private datasets.
Privacy in Metalearning and Multitask Learning: Modeling and Separations
Aliakbarpour, Maryam, Bairaktari, Konstantina, Smith, Adam, Swanberg, Marika, Ullman, Jonathan
Model personalization allows a set of individuals, each facing a different learning task, to train models that are more accurate for each person than those they could develop individually. For example, consider a set of people, each of whom holds a relatively small dataset of photographs labeled with the names of their loved ones that appear in each picture. Each person would like to build a classifier that labels future pictures with the names of people in the picture, but training such an image classifier would take more data than any individual person has. Even though the tasks they want to carry out are different--their photos have different subjects--those tasks share a lot of common structure. By pooling their data, a large group of people could learn the shared components of a good set of classifiers. Each individual could then train the subject-specific components on their own, requiring only a few examples for each subject. Other applications of personalization include next-word prediction on a mobile keyboard, speech recognition, and recommendation systems. The goals of personalization are captured in a variety of formal frameworks, such as multitask learning and metalearning.
Optimal Algorithms for Augmented Testing of Discrete Distributions
Aliakbarpour, Maryam, Indyk, Piotr, Rubinfeld, Ronitt, Silwal, Sandeep
We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution $p$, extensive research has established optimal bounds for uniformity testing, identity testing (goodness of fit), and closeness testing (equivalence or two-sample testing). We explore these problems in a setting where a predicted data distribution, possibly derived from historical data or predictive machine learning models, is available. We demonstrate that such a predictor can indeed reduce the number of samples required for all three property testing tasks. The reduction in sample complexity depends directly on the predictor's quality, measured by its total variation distance from $p$. A key advantage of our algorithms is their adaptability to the precision of the prediction. Specifically, our algorithms can self-adjust their sample complexity based on the accuracy of the available prediction, operating without any prior knowledge of the prediction's accuracy (i.e., they are consistent). Additionally, we never use more samples than the standard approaches require, even if the predictions provide no meaningful information (i.e., they are also robust). We provide lower bounds indicating that the improvements in sample complexity achieved by our algorithms are information-theoretically optimal. Furthermore, experimental results show that the performance of our algorithms on real data significantly exceeds our worst-case guarantees for sample complexity, demonstrating the practicality of our approach.
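For context, the baselines that these augmented testers never exceed are the known optimal sample complexities in the standard, prediction-free model: over a domain of size $n$ and with distance parameter $\epsilon$,
\[
\Theta\!\left(\sqrt{n}/\epsilon^2\right) \text{ samples for uniformity and identity testing, and } \Theta\!\left(\max\!\left(n^{2/3}/\epsilon^{4/3},\ \sqrt{n}/\epsilon^2\right)\right) \text{ samples for closeness testing.}
\]
These bounds come from the prior work the abstract refers to, and are stated here only as the point of comparison.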
Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy
Aliakbarpour, Maryam, Chaudhuri, Syomantak, Courtade, Thomas A., Fallah, Alireza, Jordan, Michael I.
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties. However, LDP applies uniform protection to all data features, including less sensitive ones, which degrades performance of downstream tasks. To overcome this limitation, we propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification. This more nuanced approach complements LDP by adjusting privacy protection according to the sensitivity of each feature, enabling improved performance of downstream tasks without compromising privacy. We characterize the properties of BCDP and articulate its connections with standard non-Bayesian privacy frameworks. We further apply our BCDP framework to the problems of private mean estimation and ordinary least-squares regression. The BCDP-based approach obtains improved accuracy compared to a purely LDP-based approach, without compromising on privacy.
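As a simple illustration (a standard per-feature baseline, not the BCDP mechanism itself), feature-specific protection can be realized by giving each coordinate its own privacy budget and adding independently scaled Laplace noise:
\[
M(x)_j = x_j + \mathrm{Lap}\!\left(\Delta_j/\epsilon_j\right), \qquad j = 1, \dots, d,
\]
where $\Delta_j$ is the range of feature $j$ and a larger budget $\epsilon_j$ (weaker protection) is assigned to a less sensitive feature. BCDP refines this kind of uniform-per-coordinate guarantee with a Bayesian, sensitivity-aware quantification.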
Metalearning with Very Few Samples Per Task
Aliakbarpour, Maryam, Bairaktari, Konstantina, Brown, Gavin, Smith, Adam, Ullman, Jonathan
Metalearning and multitask learning are two frameworks for solving a group of related learning tasks more efficiently than we could hope to solve each of the individual tasks on their own. In multitask learning, we are given a fixed set of related learning tasks and need to output one accurate model per task, whereas in metalearning we are given tasks that are drawn i.i.d. from a metadistribution and need to output some common information that can be easily specialized to new, previously unseen tasks from the metadistribution. In this work, we consider a binary classification setting where tasks are related by a shared representation, that is, every task $P$ of interest can be solved by a classifier of the form $f_{P} \circ h$ where $h \in H$ is a map from features to some representation space that is shared across tasks, and $f_{P} \in F$ is a task-specific classifier from the representation space to labels. The main question we ask in this work is how much data do we need to metalearn a good representation? Here, the amount of data is measured in terms of both the number of tasks $t$ that we need to see and the number of samples $n$ per task. We focus on the regime where the number of samples per task is extremely small. Our main result shows that, in a distribution-free setting where the feature vectors are in $\mathbb{R}^d$, the representation is a linear map from $\mathbb{R}^d \to \mathbb{R}^k$, and the task-specific classifiers are halfspaces in $\mathbb{R}^k$, we can metalearn a representation with error $\varepsilon$ using just $n = k+2$ samples per task, and $d \cdot (1/\varepsilon)^{O(k)}$ tasks. Learning with so few samples per task is remarkable because metalearning would be impossible with $k+1$ samples per task, and because we cannot even hope to learn an accurate task-specific classifier with just $k+2$ samples per task.
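Concretely, in the linear-representation setting of the main result, writing the shared map as a matrix $B \in \mathbb{R}^{k \times d}$ and the task-specific halfspace as $w_P \in \mathbb{R}^k$ (notation introduced here for illustration, and omitting any possible threshold term), each task's classifier takes the form
\[
x \mapsto f_P(h(x)) = \mathrm{sign}\!\left(\langle w_P, Bx \rangle\right),
\]
with $B$ learned jointly from all tasks and $w_P$ fit from the $n = k+2$ samples of task $P$.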
Differentially Private Medians and Interior Points for Non-Pathological Data
Aliakbarpour, Maryam, Silver, Rose, Steinke, Thomas, Ullman, Jonathan
A statistical estimator is an algorithm that takes data drawn from an unknown distribution as input and tries to learn something about that distribution. While the input data is only a conduit for learning about the distribution, many statistical estimators also reveal a lot of information that is specific to the input data, which raises concerns about the privacy of people who contributed their data. In response, we can try to design estimators that are differentially private (DP) [DMNS06], which ensure that no attacker can infer much more about any person in the input data than they could have inferred in a hypothetical world where that person's data had never been collected. Differential privacy is a strong constraint that imposes significant costs even for very simple statistical estimation tasks. In this paper we focus on two such tasks: interior point estimation and median estimation. In the interior point problem, we have a distribution $P$ over $\mathbb{R}$, and our goal is simply to output some point $x$ with $\inf \operatorname{supp}(P) \le x \le \sup \operatorname{supp}(P)$.
Local Differential Privacy Is Equivalent to Contraction of $E_\gamma$-Divergence
Asoodeh, Shahab, Aliakbarpour, Maryam, Calmon, Flavio P.
We investigate the local differential privacy (LDP) guarantees of a randomized privacy mechanism via its contraction properties. We first show that LDP constraints can be equivalently cast in terms of the contraction coefficient of the $E_\gamma$-divergence. We then use this equivalent formulation to express LDP guarantees of privacy mechanisms in terms of contraction coefficients of arbitrary $f$-divergences. When combined with standard estimation-theoretic tools (such as Le Cam's and Fano's converse methods), this result allows us to study the trade-off between privacy and utility in several hypothesis testing problems and in minimax and Bayesian estimation problems.
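For reference, the $E_\gamma$-divergence (also known as the hockey-stick divergence) between distributions $P$ and $Q$ is
\[
E_\gamma(P \,\|\, Q) = \sup_{A} \big(P(A) - \gamma\, Q(A)\big), \qquad \gamma \ge 1,
\]
and a mechanism $K$ satisfies $\epsilon$-LDP exactly when $E_{e^\epsilon}(K(\cdot \mid x) \,\|\, K(\cdot \mid x')) = 0$ for every pair of inputs $x, x'$; the paper's result recasts this pointwise characterization in terms of the contraction coefficient of $E_\gamma$.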
Testing Tail Weight of a Distribution Via Hazard Rate
Aliakbarpour, Maryam, Biswas, Amartya Shankha, Ravichandran, Kavya, Rubinfeld, Ronitt
For many computational problems, the distribution of data affects the algorithms we choose to solve them. One important attribute of a distribution is the shape of its "tail", that is, the behavior of the distribution as it moves away from the mean. In some distributions, the frequencies of elements far from the mean drop very quickly ("light-tailed"), while in others, the frequencies of large elements drop more slowly ("heavy-tailed"); consequently, in such "heavy-tailed" distributions, more samples are needed to see important elements. For example, Harchol-Balter [HB99] has shown that the performance of policies for scheduling computing jobs varies dramatically depending on whether the distribution of job workloads is heavy-tailed or light-tailed. The shape of the distribution has also influenced the design and analysis of learning algorithms; three recent examples include the classification algorithm of [WRH17], the generalization bound of [Fel20], and the frequency estimation result of [HIKV18] (where the latter two are for long-tailed and Zipfian distributions, respectively, both subclasses of heavy-tailed distributions).
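The hazard rate in the title is the standard notion: for a distribution with density $f$ and CDF $F$,
\[
h(x) = \frac{f(x)}{1 - F(x)}
\]
(with the analogous ratio of probability mass to tail mass in the discrete case); roughly speaking, an eventually increasing hazard rate indicates a light tail, while a decaying hazard rate indicates a heavy one.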
Testing Determinantal Point Processes
Gatmiry, Khashayar, Aliakbarpour, Maryam, Jegelka, Stefanie
Determinantal point processes (DPPs) are a rich class of discrete probability distributions that were first studied in the context of quantum physics [54] and random matrix theory [30]. Initiated by the seminal work of Kulesza and Taskar [46], DPPs have gained a lot of attention in machine learning due to their ability to naturally capture notions of diversity and repulsion. Moreover, they are easy to define via a similarity (kernel) matrix, and, as opposed to many other probabilistic models, offer tractable exact algorithms for marginalization, conditioning, and sampling [5, 42, 46, 51]. Therefore, DPPs have been explored in a wide range of applications, including video summarization [38, 39], image search [2, 45], document and timeline summarization [53], recommendation [69], feature selection in bioinformatics [9], modeling neurons [63], and matrix approximation [22, 23, 50]. A determinantal point process is a distribution over the subsets of a ground set $[n] = \{1, 2, \ldots, n\}$, parameterized by a marginal kernel matrix $K \in \mathbb{R}^{n \times n}$.
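Concretely, if $Y \subseteq [n]$ is a random subset drawn from the DPP with marginal kernel $K$, then for every fixed $S \subseteq [n]$,
\[
\Pr[S \subseteq Y] = \det(K_S),
\]
where $K_S$ denotes the principal submatrix of $K$ indexed by $S$.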
Testing Mixtures of Discrete Distributions
Aliakbarpour, Maryam, Kumar, Ravi, Rubinfeld, Ronitt
There has been significant study of the sample complexity of testing properties of distributions over large domains. For many properties, it is known that the sample complexity can be substantially smaller than the domain size. For example, over a domain of size $n$, distinguishing the uniform distribution from distributions that are far from uniform in $\ell_1$-distance uses only $O(\sqrt{n})$ samples. However, the picture is very different in the presence of arbitrary noise, even when the amount of noise is quite small. In this case, one must distinguish whether samples are coming from a distribution that is $\epsilon$-close to uniform or from a distribution that is $(1-\epsilon)$-far from uniform. The latter task requires a number of samples nearly linear in $n$ [Valiant 2008, Valiant and Valiant 2011]. In this work, we present a noise model that on one hand is more tractable for the testing problem, and on the other hand represents a rich class of noise families. In our model, the noisy distribution is a mixture of the original distribution and noise, where the latter is known to the tester either explicitly or via sample access; the form of the noise is also known a priori. Focusing on the identity and closeness testing problems leads to the following mixture testing question: given samples from distributions $p, q_1, q_2$, can we test whether $p$ is a mixture of $q_1$ and $q_2$? We consider this general question in various scenarios that differ in terms of how the tester can access the distributions, and show that this problem is indeed more tractable. Our results show that the sample complexity of our testers is exactly the same as in the classical non-mixture case.
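In this notation, the mixture testing question asks the tester to distinguish whether there exists $\alpha \in [0,1]$ with $p = \alpha q_1 + (1-\alpha) q_2$, or whether
\[
\min_{\alpha \in [0,1]} d_{\mathrm{TV}}\big(p,\ \alpha q_1 + (1-\alpha) q_2\big) > \epsilon,
\]
i.e., $p$ is $\epsilon$-far from every mixture of $q_1$ and $q_2$ (a natural formalization stated here for illustration; the paper's exact farness measure may differ in normalization).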