Jha, Somesh
Efficient Symbolic Reasoning for Neural-Network Verification
Wang, Zi, Jha, Somesh, Dvijotham, Krishnamurthy
Neural networks have become an integral part of modern software systems. However, they still suffer from various problems, in particular, vulnerability to adversarial attacks. In this work, we present a novel program reasoning framework for neural-network verification, which we refer to as symbolic reasoning. The key components of our framework are the use of the symbolic domain and the quadratic relation. The symbolic domain has very flexible semantics, and the quadratic relation is quite expressive. They allow us to encode many verification problems for neural networks as quadratic programs. Our scheme then relaxes the quadratic programs to semidefinite programs, which can be efficiently solved. This framework allows us to verify various neural-network properties under different scenarios, especially those that appear challenging for non-symbolic domains. Moreover, it introduces new representations and perspectives for the verification tasks. We believe that our framework can bring new theoretical insights and practical tools to verification problems for neural networks.
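To make the encoding-and-relaxation step concrete, the sketch below applies the standard semidefinite relaxation of the quadratic ReLU constraints ($y \ge 0$, $y \ge Wx+b$, $y_i(y_i - (Wx+b)_i) = 0$) to bound a linear function of a one-hidden-layer network's output over a box of inputs. It is a minimal CVXPY illustration with hypothetical inputs W, b, c, lo, hi, not the paper's symbolic-domain implementation.

```python
# Illustrative sketch (not the paper's implementation): the classic SDP
# relaxation of the quadratic ReLU encoding, lifted into a PSD moment matrix.
import numpy as np
import cvxpy as cp

def sdp_relu_bound(W, b, c, lo, hi):
    """Upper-bound c^T ReLU(W x + b) over the box lo <= x <= hi."""
    m, d = W.shape
    n = 1 + d + m                      # lifted vector v = [1, x, y]
    P = cp.Variable((n, n), PSD=True)  # moment matrix P ~ v v^T
    one, X, Y = 0, slice(1, 1 + d), slice(1 + d, n)
    x, y = P[one, X], P[one, Y]
    cons = [P[one, one] == 1]
    # Box constraint (x_i - lo_i)(x_i - hi_i) <= 0, written on lifted entries.
    cons += [cp.diag(P[X, X]) - cp.multiply(lo + hi, x) + lo * hi <= 0]
    # Linear part of the ReLU encoding.
    cons += [y >= 0, y >= W @ x + b]
    # Quadratic complementarity y_i (y_i - (Wx + b)_i) = 0, lifted.
    cons += [cp.diag(P[Y, Y]) == cp.diag(W @ P[X, Y]) + cp.multiply(b, y)]
    prob = cp.Problem(cp.Maximize(c @ y), cons)
    prob.solve()
    return prob.value  # the property holds if this bound is below the threshold
```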
The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
Shi, Zhenmei, Chen, Jiefeng, Li, Kunyang, Raghuram, Jayaram, Wu, Xi, Liang, Yingyu, Jha, Somesh
Pre-training representations (a.k.a. foundation models) has recently become a prevalent learning paradigm, where one first pre-trains a representation using large-scale unlabeled data, and then learns simple predictors on top of the representation using small labeled data from the downstream tasks. There are two key desiderata for the representation: label efficiency (the ability to learn an accurate classifier on top of the representation with a small amount of labeled data) and universality (usefulness across a wide range of downstream tasks). In this paper, we focus on one of the most popular instantiations of this paradigm: contrastive learning with linear probing, i.e., learning a linear predictor on the representation pre-trained by contrastive learning. We show that there exists a trade-off between the two desiderata so that one may not be able to achieve both simultaneously. Specifically, we provide analysis using a theoretical data model and show that, while more diverse pre-training data result in more diverse features for different tasks (improving universality), they put less emphasis on task-specific features, giving rise to larger sample complexity for downstream supervised tasks, and thus worse prediction performance. Guided by this analysis, we propose a contrastive regularization method to improve the trade-off. We validate our analysis and method empirically with systematic experiments using real-world datasets and foundation models.
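As a rough illustration of the kind of method the last two sentences describe, the sketch below adds a generic InfoNCE-style contrastive term to the downstream supervised loss. The names encoder, head, and augment, the weight lam, and the exact form of the regularizer are assumptions for illustration, not the paper's objective.

```python
# Hedged sketch: a downstream training step combining a linear head's
# supervised loss with a generic contrastive term on two augmented views.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.5):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                       # cross-view similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)           # matching pairs are positives

def train_step(encoder, head, optimizer, x, y, augment, lam=0.1):
    loss = F.cross_entropy(head(encoder(x)), y)      # supervised task loss
    z1, z2 = encoder(augment(x)), encoder(augment(x))
    loss = loss + lam * info_nce(z1, z2)             # contrastive regularization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```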
Learning Modulo Theories
Fredrikson, Matt, Lu, Kaiji, Vijayakumar, Saranya, Jha, Somesh, Ganesh, Vijay, Wang, Zifan
Recent techniques that integrate \emph{solver layers} into Deep Neural Networks (DNNs) have shown promise in bridging a long-standing gap between inductive learning and symbolic reasoning techniques. In this paper, we present a set of techniques for integrating \emph{Satisfiability Modulo Theories} (SMT) solvers into the forward and backward passes of a deep network layer, called SMTLayer. Using this approach, one can encode rich domain knowledge into the network in the form of mathematical formulas. In the forward pass, the solver uses symbols produced by prior layers, along with these formulas, to construct inferences; in the backward pass, the solver informs updates to the network, driving it towards representations that are compatible with the solver's theory. Notably, the solver need not be differentiable. We implement SMTLayer as a PyTorch module, and our empirical results show that it leads to models that \emph{1)} require fewer training samples than conventional models, \emph{2)} are robust to certain types of covariate shift, and \emph{3)} ultimately learn representations that are consistent with symbolic knowledge, and thus naturally interpretable.
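The sketch below conveys the general mechanism of a solver layer: a toy XOR theory wired into a PyTorch autograd.Function via Z3, with a crude straight-through-style backward pass. It is an illustrative approximation under these assumptions, not the SMTLayer algorithm.

```python
# Hedged sketch of a solver-in-the-loop layer (toy theory, not SMTLayer).
import torch
import z3

def solve_xor(bits):
    """Toy theory: the output bit is the XOR of the two input symbols."""
    a, b, out = z3.Bools("a b out")
    s = z3.Solver()
    s.add(out == z3.Xor(a, b), a == bool(bits[0]), b == bool(bits[1]))
    assert s.check() == z3.sat
    return float(z3.is_true(s.model()[out]))

class SMTForward(torch.autograd.Function):
    @staticmethod
    def forward(ctx, logits):
        bits = (logits > 0).float()              # discretize symbol logits
        y = torch.tensor([solve_xor(b) for b in bits], device=logits.device)
        ctx.save_for_backward(logits, bits)
        return y

    @staticmethod
    def backward(ctx, grad_out):
        logits, bits = ctx.saved_tensors
        # Straight-through-style proxy: push the logits toward the discrete
        # symbols that the solver found consistent with its theory.
        return grad_out.unsqueeze(-1) * (bits - torch.sigmoid(logits))

# usage (hypothetical): y = SMTForward.apply(symbol_logits)
```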
Private Multi-Winner Voting for Machine Learning
Dziedzic, Adam, Choquette-Choo, Christopher A, Dullerud, Natalie, Suriyakumar, Vinith Menon, Shamsabadi, Ali Shahin, Kaleem, Muhammad Ahmad, Jha, Somesh, Papernot, Nicolas, Wang, Xiao
Private multi-winner voting is the task of revealing $k$-hot binary vectors satisfying a bounded differential privacy (DP) guarantee. This task has been understudied in the machine learning literature despite its prevalence in many domains such as healthcare. We propose three new DP multi-winner mechanisms: Binary, $\tau$, and Powerset voting. Binary voting operates independently per label through composition. $\tau$ voting bounds votes optimally in their $\ell_2$ norm for tight data-independent guarantees. Powerset voting operates over the entire binary vector by viewing the possible outcomes as a power set. Our theoretical and empirical analysis shows that Binary voting can be a competitive mechanism on many tasks unless there are strong correlations between labels, in which case Powerset voting outperforms it. We use our mechanisms to enable privacy-preserving multi-label learning in the central setting by extending the canonical single-label technique: PATE. We find that our techniques outperform current state-of-the-art approaches on large, real-world healthcare data and standard multi-label benchmarks. We further enable multi-label confidential and private collaborative (CaPC) learning and show that model performance can be significantly improved in the multi-site setting.
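The sketch below conveys the flavor of the first two mechanisms: per-label noisy majority voting and voting with each teacher's vote vector clipped in $\ell_2$ norm. The noise scale sigma, the clipping bound tau, and the decision thresholds are illustrative placeholders, not the privacy-calibrated values from the paper.

```python
# Hedged sketch of two noisy aggregation schemes for k-hot teacher votes;
# parameters are illustrative, not calibrated to a DP guarantee.
import numpy as np

def binary_voting(teacher_preds, sigma=1.0):
    """teacher_preds: (n_teachers, n_labels) binary matrix of k-hot votes."""
    votes = teacher_preds.sum(axis=0).astype(float)           # per-label counts
    noisy = votes + np.random.normal(0.0, sigma, votes.shape)
    return (noisy > teacher_preds.shape[0] / 2).astype(int)   # noisy majority

def tau_voting(teacher_preds, tau=2.0, sigma=1.0):
    """Clip each teacher's vote vector to l2 norm tau before aggregating."""
    norms = np.linalg.norm(teacher_preds, axis=1, keepdims=True)
    clipped = teacher_preds * np.minimum(1.0, tau / np.maximum(norms, 1e-12))
    noisy = clipped.sum(axis=0) + np.random.normal(0.0, sigma,
                                                   teacher_preds.shape[1])
    return (noisy > teacher_preds.shape[0] / 2).astype(int)   # illustrative threshold
```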
Fairness Properties of Face Recognition and Obfuscation Systems
Rosenberg, Harrison, Tang, Brian, Fawaz, Kassem, Jha, Somesh
The proliferation of automated facial recognition in various commercial and government sectors has caused significant privacy concerns for individuals. A recent and popular approach to address these privacy concerns is to employ evasion attacks against the metric embedding networks powering facial recognition systems. Face obfuscation systems generate imperceptible perturbations that, when added to an image, cause the facial recognition system to misidentify the user. The key to these approaches is the generation of perturbations using a pre-trained metric embedding network followed by their application to an online system, whose model might be proprietary. This dependence of face obfuscation on metric embedding networks, which are known to be unfair in the context of facial recognition, surfaces the question of demographic fairness -- \textit{are there demographic disparities in the performance of face obfuscation systems?} To address this question, we perform an analytical and empirical exploration of the performance of recent face obfuscation systems that rely on deep embedding networks. We find that metric embedding networks are demographically aware; they cluster faces in the embedding space based on their demographic attributes. We observe that this effect carries through to the face obfuscation systems: faces belonging to minority groups incur reduced utility compared to those from majority groups. For example, the disparity in average obfuscation success rate on the online Face++ API can reach up to 20 percentage points. Further, for some demographic groups, the average perturbation size increases by up to 17\% when choosing a target identity belonging to a different demographic group versus the same demographic group. Finally, we present a simple analytical model to provide insights into these phenomena.
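For concreteness, the sketch below shows a generic targeted, PGD-style perturbation computed on a surrogate metric embedding network, which is the style of evasion attack the study considers. The names embed_net and target_emb and the budget parameters are assumptions for illustration; this is not the code of any particular obfuscation system.

```python
# Hedged sketch of a targeted embedding-space evasion attack (generic PGD).
import torch

def obfuscate(embed_net, image, target_emb, eps=8/255, alpha=2/255, steps=40):
    """Perturb `image` so its surrogate embedding moves toward target_emb."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        loss = torch.norm(embed_net(image + delta) - target_emb, p=2)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()                 # targeted step
            delta.clamp_(-eps, eps)                            # l_inf budget
            delta.copy_((image + delta).clamp(0, 1) - image)   # valid pixel range
        delta.grad.zero_()
    return (image + delta).detach()
```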
Domain Adaptation for Autoencoder-Based End-to-End Communication Over Wireless Channels
Raghuram, Jayaram, Zeng, Yijing, Martí, Dolores García, Jha, Somesh, Banerjee, Suman, Widmer, Joerg, Ortiz, Rafael Ruiz
The problem of domain adaptation conventionally considers the setting where a source domain has plenty of labeled data, and a target domain (with a different data distribution) has plenty of unlabeled data but none or very limited labeled data. In this paper, we address the setting where the target domain has only limited labeled data from a distribution that is expected to change frequently. We first propose a fast and lightweight method for adapting a Gaussian mixture density network (MDN) using only a small set of target domain samples. This method is well-suited for the setting where the distribution of target data changes rapidly (e.g., a wireless channel), making it challenging to collect a large number of samples and retrain. We then apply the proposed MDN adaptation method to the problem of end-to-end learning of a wireless communication autoencoder. A communication autoencoder models the encoder, decoder, and the channel using neural networks, and learns them jointly to minimize the overall decoding error rate. However, the error rate of an autoencoder trained on a particular (source) channel distribution can degrade as the channel distribution changes frequently, not allowing enough time for data collection and retraining of the autoencoder to the target channel distribution. We propose a method for adapting the autoencoder without modifying the encoder and decoder neural networks, adapting only the MDN model of the channel. The method utilizes feature transformations at the decoder to compensate for changes in the channel distribution, effectively presenting the decoder with samples close to the source distribution. Experimental evaluation on simulated datasets and real mmWave wireless channels demonstrates that the proposed methods can quickly adapt the MDN model and improve or maintain the error rate of the autoencoder under changing channel conditions.
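A minimal sketch of the adaptation idea follows: the encoder and decoder stay frozen, and only a small affine correction to the MDN's component means is fit on a handful of target-domain samples. The interface mdn(x) -> (pi, mu, sigma), the AffineMDNAdapter parameterization, and the training loop are simplifying assumptions, not the paper's exact method.

```python
# Hedged sketch: light-weight adaptation of a Gaussian MDN channel model
# from a few target-domain (input, output) pairs, with the autoencoder frozen.
import torch
import torch.nn as nn

class AffineMDNAdapter(nn.Module):
    """Affine correction of MDN component means: mu' = A mu + c."""
    def __init__(self, dim):
        super().__init__()
        self.A = nn.Parameter(torch.eye(dim))
        self.c = nn.Parameter(torch.zeros(dim))

    def forward(self, mu):
        return mu @ self.A.t() + self.c

def adapt(mdn, adapter, x_target, y_target, steps=200, lr=1e-2):
    opt = torch.optim.Adam(adapter.parameters(), lr=lr)   # MDN itself frozen
    for _ in range(steps):
        pi, mu, sigma = mdn(x_target)                      # assumed interface
        mu = adapter(mu)                                   # adapted means
        comp = torch.distributions.Normal(mu, sigma)
        log_prob = comp.log_prob(y_target.unsqueeze(1)).sum(-1)  # per component
        nll = -torch.logsumexp(torch.log(pi) + log_prob, dim=1).mean()
        opt.zero_grad()
        nll.backward()
        opt.step()
    return adapter
```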
Exploring Adversarial Robustness of Deep Metric Learning
Panum, Thomas Kobber, Wang, Zi, Kan, Pengyu, Fernandes, Earlence, Jha, Somesh
Deep Metric Learning (DML), a widely-used technique, involves learning a distance metric between pairs of samples. DML uses deep neural architectures to learn semantic embeddings of the input, where the distance between similar examples is small while dissimilar ones are far apart. Although the underlying neural networks produce good accuracy on naturally occurring samples, they are vulnerable to adversarially perturbed samples that reduce performance. Traditional deep learning classifiers are known to be vulnerable to adversarial examples (Szegedy et al., 2014; Biggio et al., 2013) -- inconspicuous input changes that can cause the model to output attacker-desired values -- but few studies have addressed whether DML models are similarly susceptible to these attacks, and the results are contradictory (Abdelnabi et al., 2020; Panum et al., 2020). Given the wide usage of DML models in diverse ML tasks, including security-oriented ones, it is important to clarify their susceptibility to attacks and ultimately address their lack of robustness. We investigate the vulnerability of DML to these attacks and take a first step towards training DML models with robust optimization techniques, tackling the primary challenge that metric losses depend on the samples in a mini-batch, unlike standard losses that only depend on the specific input-output pair.
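The sketch below illustrates the basic robust-optimization recipe for a mini-batch metric loss (an FGSM-perturbed triplet loss). The single-step inner maximization and the parameters are illustrative assumptions rather than the training procedure proposed in the paper.

```python
# Hedged sketch: adversarial training with a mini-batch triplet loss.
import torch
import torch.nn.functional as F

def triplet_loss(embed, anchor, pos, neg, margin=0.2):
    d_ap = (embed(anchor) - embed(pos)).pow(2).sum(dim=1)
    d_an = (embed(anchor) - embed(neg)).pow(2).sum(dim=1)
    return F.relu(d_ap - d_an + margin).mean()

def robust_step(embed, optimizer, anchor, pos, neg, eps=4/255):
    # Inner maximization: perturb the anchors to increase the batch loss.
    delta = torch.zeros_like(anchor, requires_grad=True)
    triplet_loss(embed, anchor + delta, pos, neg).backward()
    adv_anchor = (anchor + eps * delta.grad.sign()).clamp(0, 1).detach()
    # Outer minimization: update the embedding on the perturbed mini-batch.
    optimizer.zero_grad()
    loss = triplet_loss(embed, adv_anchor, pos, neg)
    loss.backward()
    optimizer.step()
    return loss.item()
```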
Interval Universal Approximation for Neural Networks
Wang, Zi, Albarghouthi, Aws, Prakriya, Gautam, Jha, Somesh
To certify safety and robustness of neural networks, researchers have successfully applied abstract interpretation, primarily using interval bound propagation (IBP). IBP is an incomplete calculus that over-approximates the set of possible predictions of a neural network. In this paper, we introduce the interval universal approximation (IUA) theorem, which sheds light on the power and limits of IBP. First, IUA shows that neural networks can not only approximate any continuous function $f$ (universal approximation), as has been known for decades, but that we can also find a neural network, using any well-behaved activation function, whose interval bounds are an arbitrarily close approximation of the set semantics of $f$ (the result of applying $f$ to a set of inputs). We call this notion of approximation interval approximation. Our result (1) extends the recent result of Baader et al. (2020) from ReLUs to a rich class of activation functions that we call squashable functions, and (2) implies that we can construct certifiably robust neural networks under $\ell_\infty$-norm using almost any practical activation function. Our construction and that of Baader et al. (2020) are exponential in the size of the function's domain. The IUA theorem additionally establishes a limit on the capabilities of IBP. Specifically, we show that there is no efficient construction of a neural network that interval-approximates any $f$, unless P=NP. To do so, we present a novel reduction from 3SAT to interval-approximation of neural networks. It implies that it is hard to construct an IBP-certifiably robust network, even if we have a robust network to start with.
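For readers unfamiliar with the calculus being analyzed, the following is a standard IBP computation (background, not a contribution of the paper): intervals are pushed through affine layers exactly and through a monotone activation elementwise, yielding an over-approximation of the network's image on a box. The layer list and activation below are assumed inputs.

```python
# Worked example of interval bound propagation through a feed-forward network.
import numpy as np

def affine_interval(W, b, lo, hi):
    """Exact interval image of x -> Wx + b over the box [lo, hi]."""
    W_pos, W_neg = np.clip(W, 0, None), np.clip(W, None, 0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def ibp(layers, lo, hi, act=np.tanh):
    """Propagate [lo, hi] through (W, b) layers with a monotone activation."""
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_interval(W, b, lo, hi)
        if i < len(layers) - 1:            # activation on hidden layers only
            lo, hi = act(lo), act(hi)      # monotonicity preserves ordering
    return lo, hi                          # over-approximates the true image
```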
Sample Complexity of Adversarially Robust Linear Classification on Separated Data
Bhattacharjee, Robi, Jha, Somesh, Chaudhuri, Kamalika
We consider the sample complexity of learning with adversarial robustness. Most prior theoretical results for this problem have considered a setting where different classes in the data are close together or overlapping. Motivated by some real applications, we consider, in contrast, the well-separated case where there exists a classifier with perfect accuracy and robustness, and show that the sample complexity tells an entirely different story. Specifically, for linear classifiers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least $\Omega(\frac{d}{n})$, whereas the max margin algorithm has expected standard loss $O(\frac{1}{n})$. This shows a gap between the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss $O(\frac{1}{n})$. This shows that for very well-separated data, convergence rates of $O(\frac{1}{n})$ are achievable, which is not the case otherwise. Our results apply to robustness measured in any $\ell_p$ norm with $p > 1$ (including $p = \infty$).
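As background for the gap described above (a standard fact, not a result of this paper): for a linear classifier $\mathrm{sign}(w^\top x + b)$ and perturbations bounded by $\|\delta\|_p \le r$, the prediction at $(x, y)$ is robust exactly when $y(w^\top x + b) \ge r\|w\|_q$ with $\frac{1}{p} + \frac{1}{q} = 1$. The robust loss therefore also penalizes correctly classified points whose margin falls below $r\|w\|_q$, which is what separates it from the standard loss.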
Robustness against Relational Adversary
Wang, Yizhen, Meng, Xiaozhu, Wang, Ke, Christodorescu, Mihai, Jha, Somesh
Test-time adversarial attacks have posed serious challenges to the robustness of machine-learning models, and in many settings the adversarial perturbation need not be bounded by small $\ell_p$-norms. Motivated by semantics-preserving attacks in the vision and security domains, we investigate $\textit{relational adversaries}$, a broad class of attackers who create adversarial examples that lie in the reflexive-transitive closure of a logical relation. We analyze the conditions for robustness and propose $\textit{normalize-and-predict}$ -- a learning framework with a provable robustness guarantee. We compare our approach with adversarial training and derive a unified framework that provides the benefits of both approaches. Guided by our theoretical findings, we apply our framework to image classification and malware detection. Results on both tasks show that attacks using relational adversaries frequently fool existing models, but our unified framework can significantly enhance their robustness.
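A toy illustration of the normalize-and-predict idea, under an assumed relation (insertion of no-op tokens into a token sequence): every input is mapped to a canonical representative of its equivalence class before classification, so related inputs receive the same prediction by construction. The tokens, relation, and model below are hypothetical.

```python
# Hedged toy sketch of normalize-and-predict for a relational adversary that
# may insert semantics-preserving no-op tokens into the input.
NOOP_TOKENS = {b"\x90", b";;"}   # hypothetical semantics-preserving insertions

def normalize(tokens):
    """Canonical form: drop tokens the relation allows an adversary to insert."""
    return tuple(t for t in tokens if t not in NOOP_TOKENS)

def normalize_and_predict(model, tokens):
    # Inputs related by inserting/removing no-op tokens collapse to the same
    # canonical form, so the prediction is invariant to such attacks.
    return model(normalize(tokens))
```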