Nguyen, Phuong Ha
Generalizing DP-SGD with Shuffling and Batch Clipping
van Dijk, Marten, Nguyen, Phuong Ha, Nguyen, Toan N., Nguyen, Lam M.
Classical differentially private DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach. We provide a general differentially private algorithmic framework that goes beyond DP-SGD and allows any first-order optimizer (e.g., classical SGD and momentum-based SGD approaches) in combination with batch clipping, which clips an aggregate of computed gradients rather than summing clipped gradients (as is done in individual clipping). The framework also admits sampling techniques beyond random subsampling, such as shuffling. Our DP analysis follows the $f$-DP approach and introduces a new proof technique which allows us to derive simple closed-form expressions and to also analyse group privacy. In particular, for $E$ epochs of work and groups of size $g$, we show a $\sqrt{gE}$ DP dependency for batch clipping with shuffling.
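To make the distinction between individual clipping and batch clipping concrete, the NumPy sketch below contrasts the two update rules on a toy squared-loss model; the model, noise calibration, and all hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def clip(v, C):
    """Scale v so that its L2 norm is at most C."""
    norm = np.linalg.norm(v)
    return v if norm <= C else v * (C / norm)

def grad(w, x, y):
    """Gradient of the squared loss 0.5*(w.x - y)^2 for one example (toy model)."""
    return (w @ x - y) * x

def dp_step_individual_clipping(w, X, Y, C, sigma, eta):
    """DP-SGD style step: clip every per-example gradient, sum, then add noise."""
    g = sum(clip(grad(w, x, y), C) for x, y in zip(X, Y))
    g = g + sigma * C * np.random.randn(*w.shape)
    return w - eta * g / len(X)

def dp_step_batch_clipping(w, X, Y, C, sigma, eta):
    """Batch clipping: average the batch gradient first, then clip the aggregate."""
    g = np.mean([grad(w, x, y) for x, y in zip(X, Y)], axis=0)
    g = clip(g, C) + sigma * C * np.random.randn(*w.shape)
    return w - eta * g

# Shuffling: one epoch visits each example exactly once, in a random order,
# split into disjoint batches (no random subsampling).
np.random.seed(0)
X, Y = np.random.randn(32, 5), np.random.randn(32)
w = np.zeros(5)
for batch in np.array_split(np.random.permutation(len(X)), 8):
    w = dp_step_batch_clipping(w, X[batch], Y[batch], C=1.0, sigma=1.0, eta=0.1)
```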
Batch Clipping and Adaptive Layerwise Clipping for Differential Private Stochastic Gradient Descent
Nguyen, Toan N., Nguyen, Phuong Ha, Nguyen, Lam M., van Dijk, Marten
Each round in Differential Private Stochastic Gradient Descent (DPSGD) transmits a sum of clipped gradients obfuscated with Gaussian noise to a central server, which uses it to update a global model that often represents a deep neural network. Since the clipped gradients are computed separately, which we call Individual Clipping (IC), deep neural networks like ResNet-18 cannot use Batch Normalization Layers (BNL), which are a crucial component in deep neural networks for achieving high accuracy. To utilize BNL, we introduce Batch Clipping (BC) where, instead of clipping single gradients as in the original DPSGD, we average and clip batches of gradients. Moreover, the model entries of different layers have different sensitivities to the added Gaussian noise. Therefore, Adaptive Layerwise Clipping methods (ALC), where each layer has its own adaptively fine-tuned clipping constant, have been introduced and studied, but so far without rigorous DP proofs. In this paper, we propose {\em a new ALC and provide rigorous DP proofs for both BC and ALC}. Experiments show that our modified DPSGD with BC and ALC for CIFAR-10 with ResNet-18 converges while DPSGD with IC and ALC does not.
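As a rough illustration of how batch clipping and per-layer clipping constants could interact, consider the sketch below; the adaptation rule (an exponential moving average of observed layer gradient norms) is a placeholder assumption and not the ALC rule proposed in the paper.

```python
import numpy as np

def clip(v, C):
    n = np.linalg.norm(v)
    return v if n <= C else v * (C / n)

def alc_batch_step(layers, layer_grads, C_layers, sigma, eta, beta=0.9):
    """Batch Clipping + Adaptive Layerwise Clipping (sketch).

    layers      : list of per-layer parameter arrays
    layer_grads : averaged batch gradient for each layer (batch clipping, so
                  batch statistics such as BatchNorm can be used)
    C_layers    : per-layer clipping constants, adapted in place
    The adaptation rule below is an illustrative placeholder, not the paper's ALC rule.
    """
    new_layers = []
    for i, (w, g) in enumerate(zip(layers, layer_grads)):
        # move this layer's clipping constant toward its typical gradient norm
        C_layers[i] = beta * C_layers[i] + (1 - beta) * np.linalg.norm(g)
        g = clip(g, C_layers[i]) + sigma * C_layers[i] * np.random.randn(*w.shape)
        new_layers.append(w - eta * g)
    return new_layers

# toy two-"layer" model: each layer is just a weight vector here
np.random.seed(0)
layers = [np.zeros(4), np.zeros(3)]
C_layers = [1.0, 1.0]
for _ in range(5):
    grads = [np.random.randn(4), np.random.randn(3)]   # stand-in batch gradients
    layers = alc_batch_step(layers, grads, C_layers, sigma=1.0, eta=0.1)
```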
Considerations on the Theory of Training Models with Differential Privacy
van Dijk, Marten, Nguyen, Phuong Ha
Privacy leakage is a big problem in the big-data era. Solving a learning task based on big data intrinsically means that sufficient data for training a global model with sufficient clean accuracy (utility) is only available through a collaborative effort. Federated learning is a framework where a learning task is solved by a loose federation of participating devices/clients which are coordinated by a central server [42, 8, 3, 33, 40, 9, 57, 29, 59, 10, 34, 36, 37, 39, 12, 30]. Clients, who use their own local data to participate in a learning task by training a global model, want to have privacy guarantees for their local proprietary data. For this reason, DP-SGD [1] was introduced: it adapts distributed Stochastic Gradient Descent (SGD) [55] with Differential Privacy (DP) [19, 15, 21, 18].
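For context, the DP-SGD recipe of [1] mentioned here clips each example's gradient, sums the clipped gradients, and perturbs the sum with Gaussian noise before the model update; a minimal NumPy sketch with a toy quadratic loss (all hyperparameters illustrative) follows.

```python
import numpy as np

def dp_sgd_step(w, batch_X, batch_Y, C, sigma, eta):
    """One DP-SGD step: per-example clipping, Gaussian noise, then an SGD update."""
    summed = np.zeros_like(w)
    for x, y in zip(batch_X, batch_Y):
        g = (w @ x - y) * x                      # gradient of 0.5*(w.x - y)^2
        norm = np.linalg.norm(g)
        summed += g / max(1.0, norm / C)         # clip each gradient to norm at most C
    noisy = summed + sigma * C * np.random.randn(*w.shape)
    return w - eta * noisy / len(batch_X)

np.random.seed(1)
X, Y = np.random.randn(64, 3), np.random.randn(64)
w = np.zeros(3)
for _ in range(10):
    idx = np.random.choice(len(X), size=16, replace=False)   # random subsampling
    w = dp_sgd_step(w, X[idx], Y[idx], C=1.0, sigma=1.0, eta=0.05)
```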
Differential Private Hogwild! over Distributed Local Data Sets
van Dijk, Marten, Nguyen, Nhuong V., Nguyen, Toan N., Nguyen, Lam M., Nguyen, Phuong Ha
The objective is to minimize a loss function with respect to model parameters $w$. This problem is known as empirical risk minimization, and it covers a wide range of convex and non-convex problems from the ML domain, including, but not limited to, logistic regression, multi-kernel learning, conditional random fields, and neural networks. We want to solve this problem, equation (1) in the paper, in a distributed setting where many clients have their own local data sets and the finite-sum minimization is over the collection of all local data sets.
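Equation (1) itself is not reproduced in this excerpt; a reconstruction of the standard finite-sum empirical risk objective over the union of local data sets, consistent with the abstract, is

$$\min_{w \in \mathbb{R}^d} F(w) = \frac{1}{N}\sum_{c=1}^{m} \sum_{\xi \in \mathcal{D}_c} f(w;\xi), \qquad N = \sum_{c=1}^{m} |\mathcal{D}_c|,$$

where $\mathcal{D}_1, \dots, \mathcal{D}_m$ denote the local data sets of the $m$ clients and $f(w;\xi)$ is the loss of model $w$ on sample $\xi$.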
Hogwild! over Distributed Local Data Sets with Linearly Increasing Mini-Batch Sizes
van Dijk, Marten, Nguyen, Nhuong V., Nguyen, Toan N., Nguyen, Lam M., Tran-Dinh, Quoc, Nguyen, Phuong Ha
Hogwild! implements asynchronous Stochastic Gradient Descent (SGD) where multiple threads in parallel access a common repository containing training data, perform SGD iterations and update shared state that represents a jointly learned (global) model. We consider big data analysis where training data is distributed among local data sets -- and we wish to move SGD computations to local compute nodes where local data resides. The results of these local SGD computations are aggregated by a central "aggregator" which mimics Hogwild!. We show how local compute nodes can start with small mini-batch sizes which increase to larger ones in order to reduce communication cost (the number of interaction rounds with the aggregator). We prove a novel, tight convergence analysis for strongly convex problems which does not use the bounded-gradient assumption found in many existing publications. The tightness is a consequence of our proofs for lower and upper bounds of the convergence rate, which differ only by a constant factor. We show experimental results for plain convex and non-convex problems on both biased and unbiased local data sets.
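A minimal simulation of the round structure described above might look as follows; the linear batch-size schedule, the toy quadratic loss, and the synchronous averaging used as a stand-in for Hogwild!-style aggregation are all illustrative assumptions.

```python
import numpy as np

def local_sgd_round(w, X, Y, batch_size, local_steps, eta):
    """One round of local SGD on a client before reporting back to the aggregator."""
    for _ in range(local_steps):
        idx = np.random.choice(len(X), size=min(batch_size, len(X)), replace=False)
        g = np.mean([(w @ x - y) * x for x, y in zip(X[idx], Y[idx])], axis=0)
        w = w - eta * g
    return w

np.random.seed(2)
clients = [(np.random.randn(50, 3), np.random.randn(50)) for _ in range(4)]
w_global = np.zeros(3)
for r in range(1, 6):                                  # communication rounds
    batch_size = 4 * r                                 # linearly increasing mini-batch size
    updates = [local_sgd_round(w_global.copy(), X, Y, batch_size, local_steps=5, eta=0.05)
               for X, Y in clients]
    w_global = np.mean(updates, axis=0)                # simplified synchronous aggregation
```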
Asynchronous Federated Learning with Reduced Number of Rounds and with Differential Privacy from Less Aggregated Gaussian Noise
van Dijk, Marten, Nguyen, Nhuong V., Nguyen, Toan N., Nguyen, Lam M., Tran-Dinh, Quoc, Nguyen, Phuong Ha
The feasibility of federated learning is highly constrained by the server-client infrastructure in terms of network communication. Most newly launched smartphones and IoT devices are equipped with GPUs or sufficient computing hardware to run powerful AI models. However, with the original synchronous federated learning, client devices suffer waiting times, and regular communication between clients and the server is required. This implies greater sensitivity to local model training times and to irregular or missed updates; hence, scalability to large numbers of clients is limited and convergence rates measured in real time suffer. We propose a new algorithm for asynchronous federated learning which eliminates waiting times and reduces overall network communication; we provide rigorous theoretical analysis for strongly convex objective functions and provide simulation results. By adding Gaussian noise, we show how our algorithm can be made differentially private; new theorems show how the aggregated added Gaussian noise is significantly reduced.
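The sketch below simulates the asynchronous flavor of such a scheme: clients send noisy model differences whenever they finish, and the server applies them immediately. The clipping, noise placement, scheduling, and all constants here are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np
import random

def noisy_local_update(w, X, Y, steps, eta, sigma, C):
    """Client runs local SGD and adds Gaussian noise to its model difference."""
    w_local = w.copy()
    for _ in range(steps):
        i = np.random.randint(len(X))
        g = (w_local @ X[i] - Y[i]) * X[i]
        norm = np.linalg.norm(g)
        w_local -= eta * g / max(1.0, norm / C)           # clipped local gradient step
    diff = w_local - w
    return diff + sigma * C * np.random.randn(*w.shape)   # noise added once per round

np.random.seed(3)
clients = [(np.random.randn(40, 3), np.random.randn(40)) for _ in range(4)]
w_server = np.zeros(3)
# Asynchrony: updates arrive in an arbitrary client order and are applied immediately,
# possibly computed from a slightly stale server model.
for _ in range(20):
    c = random.randrange(len(clients))
    X, Y = clients[c]
    update = noisy_local_update(w_server, X, Y, steps=10, eta=0.05, sigma=0.5, C=1.0)
    w_server = w_server + update / len(clients)
```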
Beware the Black-Box: on the Robustness of Recent Defenses to Adversarial Examples
Mahmood, Kaleel, Gurevin, Deniz, van Dijk, Marten, Nguyen, Phuong Ha
Recent defenses published at venues like NIPS, ICML, ICLR and CVPR are mainly focused on mitigating white-box attacks. These defenses do not properly consider adaptive adversaries. In this paper, we expand the scope of these defenses to include adaptive black-box adversaries. Our evaluation is done on nine defenses including Barrage of Random Transforms, ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error Correcting Codes, Distribution Classifier Defense, K-Winner Take All and Buffer Zones. Our investigation is done using two black-box adversarial models and six widely studied adversarial attacks on the CIFAR-10 and Fashion-MNIST datasets. Our analyses show that most recent defenses provide only marginal improvements in security compared to undefended networks. Based on these results, we propose new standards for properly evaluating defenses against black-box adversaries. We provide this security framework to assist researchers in developing future black-box resistant models.
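The kind of robustness measurement underlying such an evaluation can be sketched generically; the toy model and the bounded-noise "attack" below are hypothetical placeholders standing in for a defended network and a real adaptive black-box attack.

```python
import numpy as np

def robust_accuracy(defended_model, attack, X, Y):
    """Fraction of inputs still classified correctly after the attack is applied."""
    X_adv = np.stack([attack(defended_model, x, y) for x, y in zip(X, Y)])
    preds = np.array([defended_model(x) for x in X_adv])
    return float(np.mean(preds == Y))

# Hypothetical stand-ins: a trivial "model" and a query-free bounded perturbation.
def toy_model(x):
    return int(x.sum() > 0)

def toy_blackbox_attack(model, x, y, eps=0.1):
    # perturbation confined to an L_inf ball of radius eps (illustrative only)
    return x + eps * np.sign(np.random.randn(*x.shape))

np.random.seed(4)
X = np.random.randn(100, 8)
Y = np.array([toy_model(x) for x in X])
print(robust_accuracy(toy_model, toy_blackbox_attack, X, Y))
```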
BUZz: BUffer Zones for defending adversarial examples in image classification
Nguyen, Phuong Ha, Mahmood, Kaleel, Nguyen, Lam M., Nguyen, Thanh, van Dijk, Marten
We propose a novel defense against all existing gradient-based adversarial attacks on deep neural networks for image classification problems. Our defense is based on a combination of deep neural networks and simple image transformations. While straightforward in implementation, this defense yields a unique security property which we term buffer zones. We argue that our defense based on buffer zones is secure against state-of-the-art black-box attacks. We are able to achieve this security even when the adversary has access to the entire ...
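Reading the abstract, a buffer-zone style decision rule can be sketched as an ensemble of networks, each behind a different fixed image transformation, that abstains (flags the input) on disagreement; the unanimity vote, the stand-in networks, and the transformations below are assumptions for illustration only, not the paper's exact construction.

```python
import numpy as np

def buffer_zone_predict(classifiers, transforms, x):
    """Vote over transformed copies; abstain (flag as adversarial) on disagreement.

    Each classifier sees the input through its own fixed image transformation.
    The region of inputs on which the members disagree acts as a 'buffer zone';
    the unanimity rule used here is an illustrative choice.
    """
    votes = [clf(t(x)) for clf, t in zip(classifiers, transforms)]
    return votes[0] if len(set(votes)) == 1 else "adversarial"

# Hypothetical stand-ins for trained networks and image transformations.
def net_a(img): return int(img.mean() > 0)
def net_b(img): return int(img.sum() > 0)

transforms = [lambda img: np.flip(img, axis=1),          # horizontal flip
              lambda img: np.clip(img + 0.05, -1, 1)]    # brightness shift

np.random.seed(5)
image = np.random.randn(8, 8)
print(buffer_zone_predict([net_a, net_b], transforms, image))
```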
DTN: A Learning Rate Scheme with Convergence Rate of $\mathcal{O}(1/t)$ for SGD
Nguyen, Lam M., Nguyen, Phuong Ha, Phan, Dzung T., Kalagnanam, Jayant R., van Dijk, Marten
We propose a novel diminishing learning rate scheme, coined Decreasing-Trend-Nature (DTN), which allows us to prove fast convergence of the Stochastic Gradient Descent (SGD) algorithm to a first-order stationary point for smooth general convex problems and for some classes of nonconvex problems, including neural network applications for classification. We are the first to prove that SGD with a diminishing learning rate achieves a convergence rate of $\mathcal{O}(1/t)$ for these problems. Our theory applies to neural network applications for classification problems in a straightforward way.
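Since the abstract does not spell out the DTN schedule, the sketch below shows plain SGD with a generic $1/t$-type diminishing stepsize as a stand-in, simply to fix the setting in which the $\mathcal{O}(1/t)$ rate is claimed; the schedule and constants are illustrative assumptions, not the DTN scheme itself.

```python
import numpy as np

def sgd_diminishing(X, Y, epochs, eta0=0.5, lam=0.1):
    """SGD with a diminishing stepsize eta_t = eta0 / (1 + lam * t).

    This 1/t-type schedule is a generic placeholder; the paper's DTN scheme
    has its own specific form tied to the decreasing trend of the objective.
    """
    w, t = np.zeros(X.shape[1]), 0
    for _ in range(epochs):
        for i in np.random.permutation(len(X)):
            eta_t = eta0 / (1 + lam * t)
            g = (w @ X[i] - Y[i]) * X[i]          # gradient of 0.5*(w.x - y)^2
            w, t = w - eta_t * g, t + 1
    return w

np.random.seed(6)
X = np.random.randn(200, 5)
Y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.01 * np.random.randn(200)
print(sgd_diminishing(X, Y, epochs=5)[:3])
```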
Optimal Finite-Sum Smooth Non-Convex Optimization with SARAH
Nguyen, Lam M., van Dijk, Marten, Phan, Dzung T., Nguyen, Phuong Ha, Weng, Tsui-Wei, Kalagnanam, Jayant R.
The total complexity (measured as the total number of gradient computations) of a stochastic first-order optimization algorithm that finds a first-order stationary point of a finite-sum smooth nonconvex objective function $F(w)=\frac{1}{n} \sum_{i=1}^n f_i(w)$ has been proven to be at least $\Omega(\sqrt{n}/\epsilon)$, where $\epsilon$ denotes the attained accuracy $\mathbb{E}[ \|\nabla F(\tilde{w})\|^2] \leq \epsilon$ for the output approximation $\tilde{w}$ (Fang et al., 2018). This paper is the first to show that this lower bound is tight for the class of variance reduction methods which rely only on the Lipschitz continuous gradient assumption. We prove this complexity result for a slightly modified version of the SARAH algorithm in (Nguyen et al., 2017a;b), showing that SARAH is optimal and dominates all existing results. For convex optimization, we propose SARAH++ with sublinear convergence for general convex and linear convergence for strongly convex problems; and we provide a practical version for which numerical experiments on various datasets show improved performance.
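For readers unfamiliar with SARAH, its defining ingredient is the recursive gradient estimator $v_t = \nabla f_{i_t}(w_t) - \nabla f_{i_t}(w_{t-1}) + v_{t-1}$, refreshed with a full gradient at the start of each outer loop. The NumPy sketch below applies it to a toy least-squares finite sum; the step size and loop lengths are illustrative, and this is not the paper's modified SARAH or SARAH++.

```python
import numpy as np

def grad_i(w, X, Y, i):
    """Gradient of the i-th component f_i(w) = 0.5*(w.x_i - y_i)^2 (toy finite sum)."""
    return (w @ X[i] - Y[i]) * X[i]

def sarah(X, Y, outer_iters, inner_iters, eta):
    """SARAH's recursive gradient estimator (variance-reduction sketch)."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(outer_iters):
        v = np.mean([grad_i(w, X, Y, i) for i in range(n)], axis=0)   # full gradient
        w_prev, w = w, w - eta * v
        for _ in range(inner_iters):
            i = np.random.randint(n)
            v = grad_i(w, X, Y, i) - grad_i(w_prev, X, Y, i) + v      # recursive estimator
            w_prev, w = w, w - eta * v
    return w

np.random.seed(7)
X = np.random.randn(100, 4)
Y = X @ np.array([2.0, -1.0, 0.0, 1.5])
print(sarah(X, Y, outer_iters=5, inner_iters=50, eta=0.05))
```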