Pham, Tung
Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization
Luo, Haocheng, Truong, Tuan, Pham, Tung, Harandi, Mehrtash, Phung, Dinh, Le, Trung
Sharpness-Aware Minimization (SAM) has attracted significant attention for its effectiveness in improving generalization across various tasks. However, its underlying principles remain poorly understood. In this work, we analyze SAM's training dynamics using the maximum eigenvalue of the Hessian as a measure of sharpness, and propose a third-order stochastic differential equation (SDE), which reveals that the dynamics are driven by a complex mixture of second- and third-order terms. We show that alignment between the perturbation vector and the top eigenvector is crucial for SAM's effectiveness in regularizing sharpness, but find that this alignment is often inadequate in practice, limiting SAM's efficiency. Building on these insights, we introduce Eigen-SAM, an algorithm that explicitly aims to regularize the top Hessian eigenvalue by aligning the perturbation vector with the leading eigenvector. We validate the effectiveness of our theory and the practical advantages of our proposed approach through comprehensive experiments. Code is available at https://github.com/RitianLuo/EigenSAM.
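As a rough illustration of the mechanism described above, the PyTorch sketch below estimates the leading Hessian eigenvector by power iteration over Hessian-vector products and aligns a SAM-style perturbation with it. This is a minimal sketch under our own assumptions (the function names, the projection-based alignment rule, and the number of power iterations are illustrative), not the released Eigen-SAM implementation.

    import torch

    def hessian_vector_product(loss, params, vec):
        # Hessian-vector product via double backprop (Pearlmutter's trick).
        grads = torch.autograd.grad(loss, params, create_graph=True)
        dot = sum((g * v).sum() for g, v in zip(grads, vec))
        return torch.autograd.grad(dot, params, retain_graph=True)

    def top_eigenvector(loss, params, iters=10):
        # Power iteration on Hessian-vector products to estimate the
        # leading Hessian eigenvector (iteration count is illustrative).
        v = [torch.randn_like(p) for p in params]
        for _ in range(iters):
            hv = hessian_vector_product(loss, params, v)
            norm = torch.sqrt(sum((h ** 2).sum() for h in hv)) + 1e-12
            v = [h / norm for h in hv]
        return v

    def aligned_perturbation(grads, eigvec, rho=0.05):
        # Illustrative alignment rule (an assumption, not necessarily the
        # exact Eigen-SAM update): project the SAM ascent direction onto
        # the leading eigenvector, then rescale to the radius rho.
        coef = sum((g * v).sum() for g, v in zip(grads, eigvec))
        pert = [coef * v for v in eigvec]
        norm = torch.sqrt(sum((p ** 2).sum() for p in pert)) + 1e-12
        return [rho * p / norm for p in pert]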
On Barycenter Computation: Semi-Unbalanced Optimal Transport-based Method on Gaussians
Nguyen, Ngoc-Hai, Le, Dung, Nguyen, Hoang-Phi, Pham, Tung, Ho, Nhat
We explore a robust version of the barycenter problem among $n$ centered Gaussian probability measures, termed the Semi-Unbalanced Optimal Transport (SUOT)-based Barycenter, wherein the marginal constraint on the barycenter is kept exact while the constraints on the other measures are relaxed via the Kullback-Leibler divergence. We develop optimization algorithms on the Bures-Wasserstein manifold, named the Exact Geodesic Gradient Descent and Hybrid Gradient Descent algorithms. While the Exact Geodesic Gradient Descent method computes the exact closed form of the first-order derivative of the barycenter objective along a geodesic on the Bures manifold, the Hybrid Gradient Descent method uses the optimizer components obtained from solving the SUOT problem to replace the outlier measures before applying Riemannian Gradient Descent. We establish theoretical convergence guarantees for both methods and show that the Exact Geodesic Gradient Descent algorithm attains a dimension-free convergence rate. Finally, we conduct experiments comparing the standard Wasserstein barycenter with ours and perform an ablation study.
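For context, the baseline against which the robust SUOT barycenter is compared admits a well-known closed form on covariance matrices. The NumPy/SciPy sketch below, assuming centered Gaussians, computes the squared Bures-Wasserstein distance and runs the standard fixed-point iteration for the balanced Wasserstein barycenter; it is only this baseline, not the Exact Geodesic or Hybrid Gradient Descent methods of the paper.

    import numpy as np
    from scipy.linalg import sqrtm

    def bures_wasserstein_sq(S1, S2):
        # Squared 2-Wasserstein distance between N(0, S1) and N(0, S2):
        # tr(S1) + tr(S2) - 2 tr((S1^{1/2} S2 S1^{1/2})^{1/2}).
        r1 = sqrtm(S1).real
        cross = sqrtm(r1 @ S2 @ r1).real
        return np.trace(S1) + np.trace(S2) - 2.0 * np.trace(cross)

    def gaussian_barycenter(covs, weights, iters=100):
        # Fixed-point iteration for the balanced Wasserstein barycenter of
        # centered Gaussians (the baseline the SUOT barycenter is compared to).
        S = np.mean(covs, axis=0)  # initial guess
        for _ in range(iters):
            r = sqrtm(S).real
            r_inv = np.linalg.inv(r)
            T = sum(w * sqrtm(r @ C @ r).real for w, C in zip(weights, covs))
            S = r_inv @ T @ T @ r_inv
        return S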
Diversity-Aware Agnostic Ensemble of Sharpness Minimizers
Bui, Anh, Vo, Vy, Pham, Tung, Phung, Dinh, Le, Trung
There has long been ample theoretical and empirical evidence supporting the success of ensemble learning. Deep ensembles in particular take advantage of training randomness and the expressivity of individual neural networks to gain prediction diversity, ultimately leading to better generalization, robustness, and uncertainty estimation. With respect to generalization, it has been found that pursuing wider local minima results in models that are more robust to shifts between training and testing sets. A natural research question arising from these two approaches is whether a boost in generalization can be achieved by integrating ensemble learning with loss sharpness minimization. Our work investigates this connection and proposes DASH, a learning algorithm that promotes diversity and flatness within deep ensembles. More concretely, DASH encourages base learners to move divergently towards low-loss regions of minimal sharpness. We provide a theoretical backbone for our method along with extensive empirical evidence demonstrating an improvement in ensemble generalizability.
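A minimal sketch of the flatness side of this idea is given below: each ensemble member takes its own SAM-style two-step update (ascend to a perturbed point, then descend with the gradient computed there), and predictions are averaged at test time. The diversity-promoting mechanism that makes DASH move members divergently is not reproduced and is only marked with a placeholder comment; all names and hyperparameters are illustrative assumptions.

    import torch

    def train_flat_ensemble(models, loss_fn, loader, lr=0.1, rho=0.05, epochs=1):
        # Each member takes a SAM-style update; DASH's diversity term is omitted.
        opts = [torch.optim.SGD(m.parameters(), lr=lr) for m in models]
        for _ in range(epochs):
            for x, y in loader:
                for m, opt in zip(models, opts):
                    # 1) Ascend to the worst-case perturbed weights.
                    loss = loss_fn(m(x), y)
                    grads = torch.autograd.grad(loss, list(m.parameters()))
                    scale = rho / (torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12)
                    with torch.no_grad():
                        for p, g in zip(m.parameters(), grads):
                            p.add_(scale * g)
                    # 2) Descend with the gradient evaluated at the perturbed point.
                    opt.zero_grad()
                    loss_fn(m(x), y).backward()
                    with torch.no_grad():
                        for p, g in zip(m.parameters(), grads):
                            p.sub_(scale * g)   # restore the original weights first
                    opt.step()
                    # (DASH would additionally add a diversity-promoting term here.)
        return models

    def ensemble_predict(models, x):
        # Average the members' predictive distributions.
        return torch.stack([torch.softmax(m(x), dim=-1) for m in models]).mean(dim=0)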
Robust Contrastive Learning With Theory Guarantee
Tran, Ngoc N., Tran, Lam, Phan, Hoang, Bui, Anh, Pham, Tung, Tran, Toan, Phung, Dinh, Le, Trung
Contrastive learning (CL) allows us to create meaningful features without any label information. In the first phase, CL approaches learn feature representations, which are then fed to a linear classifier trained on labeled data in the second phase. While existing theoretical works have studied the connection between the supervised loss in the second phase and the unsupervised loss in the first phase to explain why the unsupervised loss can support the supervised loss, there has been no theoretical examination of the connection between the unsupervised loss in the first phase and the robust supervised loss in the second phase, which could shed light on how to establish an effective unsupervised loss in the first phase. To fill this gap, our paper develops rigorous theories to identify which components of the unsupervised loss can aid the robust supervised loss. Finally, we conduct experiments to verify our findings. All code used in this work is available at https://anonymous.4open.science/r/rosa.
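As an example of what the phase-one unsupervised loss can look like, the snippet below implements the standard InfoNCE/NT-Xent contrastive loss in PyTorch. It is given purely for orientation, assuming a generic two-view contrastive setup; the unsupervised loss analyzed in the paper may contain additional components.

    import torch
    import torch.nn.functional as F

    def info_nce_loss(z1, z2, temperature=0.5):
        # Standard InfoNCE / NT-Xent loss between two augmented views.
        # z1, z2: (N, d) features of the same images under two augmentations.
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        z = torch.cat([z1, z2], dim=0)                 # (2N, d)
        sim = z @ z.t() / temperature                  # scaled cosine similarities
        n = z1.shape[0]
        mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
        sim = sim.masked_fill(mask, float('-inf'))     # drop self-pairs
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
        return F.cross_entropy(sim, targets)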
Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks
Nguyen, Quang H., Lao, Yingjie, Pham, Tung, Wong, Kok-Seng, Doan, Khoa D.
Recent works have shown that deep neural networks are vulnerable to adversarial examples: samples close to the original image that nevertheless cause the model to misclassify. Even with access only to the model's output, an attacker can employ black-box attacks to generate such adversarial examples. In this work, we propose a simple and lightweight defense against black-box attacks that adds random noise to hidden features at intermediate layers of the model at inference time. Our theoretical analysis confirms that this method effectively enhances the model's resilience against both score-based and decision-based black-box attacks. Importantly, our defense does not necessitate adversarial training and has minimal impact on accuracy, rendering it applicable to any pre-trained model. Our analysis also reveals the significance of selectively adding noise to different parts of the model based on the gradient of the adversarial objective function, which can vary during the attack. We demonstrate the robustness of our defense against multiple black-box attacks through extensive empirical experiments involving diverse models with various architectures.
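A minimal sketch of the defense's core operation, assuming a PyTorch model: forward hooks add Gaussian noise to the outputs of chosen intermediate layers at inference time. The choice of layers and the noise scale are illustrative assumptions rather than the paper's tuned settings.

    import torch

    def add_feature_noise(model, layer_names, sigma=0.05):
        # Register forward hooks that add Gaussian noise to the outputs of the
        # named intermediate layers; sigma is an illustrative, tunable scale.
        handles = []
        modules = dict(model.named_modules())
        for name in layer_names:
            def hook(module, inputs, output, s=sigma):
                return output + s * torch.randn_like(output)
            handles.append(modules[name].register_forward_hook(hook))
        return handles  # call h.remove() on each handle to disable the defense

For a torchvision ResNet, for instance, one might pass layer_names=["layer3", "layer4"]; which parts of the model benefit most from noise is what the gradient-based analysis in the paper examines.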
RSAM: Learning on manifolds with Riemannian Sharpness-aware Minimization
Truong, Tuan, Nguyen, Hoang-Phi, Pham, Tung, Tran, Minh-Tuan, Harandi, Mehrtash, Phung, Dinh, Le, Trung
Understanding the geometry of the loss landscape shows promise for enhancing a model's generalization ability. In this work, we draw upon prior works that apply geometric principles to optimization and present a novel approach to improve robustness and generalization for constrained optimization problems. Specifically, this paper generalizes the Sharpness-Aware Minimization (SAM) optimizer to Riemannian manifolds. In doing so, we first extend the concept of sharpness and introduce a novel notion of sharpness on manifolds. To support this notion, we present a theoretical analysis characterizing generalization capabilities with respect to manifold sharpness, which yields a tighter bound on the generalization gap than previously known. Motivated by this analysis, we introduce our algorithm, Riemannian Sharpness-Aware Minimization (RSAM). To demonstrate RSAM's ability to enhance generalization, we evaluate and contrast our algorithm on a broad set of problems, such as image classification and contrastive learning across different datasets, including CIFAR100, CIFAR10, and FGVCAircraft. Our code is publicly available at \url{https://t.ly/RiemannianSAM}.
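To make the manifold perturbation concrete, the toy sketch below restricts a SAM-style ascent step to the unit sphere: the Euclidean gradient is projected onto the tangent space, a step of radius rho is taken there, and the result is retracted back onto the sphere by normalization. This only illustrates the general recipe on one simple manifold and is not the RSAM update used in the paper.

    import torch

    def sphere_sam_perturbation(w, grad, rho=0.05):
        # SAM-style ascent restricted to the unit sphere (illustrative only).
        w = w / w.norm()
        tangent_grad = grad - (grad @ w) * w          # project onto the tangent space at w
        direction = tangent_grad / (tangent_grad.norm() + 1e-12)
        w_perturbed = w + rho * direction             # step in the tangent space
        return w_perturbed / w_perturbed.norm()       # retract back onto the sphere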
Entropic Gromov-Wasserstein between Gaussian Distributions
Le, Khang, Le, Dung, Nguyen, Huy, Do, Dat, Pham, Tung, Ho, Nhat
We study the entropic Gromov-Wasserstein distance and its unbalanced version between (unbalanced) Gaussian distributions with different dimensions. When the metric is the inner product, which we refer to as inner product Gromov-Wasserstein (IGW), we demonstrate that the optimal transportation plans of entropic IGW and its unbalanced variant are (unbalanced) Gaussian distributions. Via an application of von Neumann's trace inequality, we obtain closed-form expressions for the entropic IGW between these Gaussian distributions. Finally, we consider an entropic inner product Gromov-Wasserstein barycenter of multiple Gaussian distributions. We prove that the barycenter is a Gaussian distribution when the entropic regularization parameter is small, and we further derive closed-form expressions for its covariance matrix.
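A numerical counterpart of this setting can be sketched with the POT library (assuming ot.gromov.entropic_gromov_wasserstein is available with the calling convention below): sample from two centered Gaussians of different dimensions, build inner-product structure matrices as in IGW, and solve the entropic problem between the empirical measures. This only approximates the population quantity for which the paper derives closed forms.

    import numpy as np
    import ot  # POT: Python Optimal Transport

    rng = np.random.default_rng(0)
    n, d1, d2 = 200, 3, 5

    # Samples from two centered Gaussians of different dimensions.
    A = rng.standard_normal((d1, d1)); Sigma1 = A @ A.T + np.eye(d1)
    B = rng.standard_normal((d2, d2)); Sigma2 = B @ B.T + np.eye(d2)
    X = rng.multivariate_normal(np.zeros(d1), Sigma1, size=n)
    Y = rng.multivariate_normal(np.zeros(d2), Sigma2, size=n)

    # Inner-product "structure" matrices, matching the IGW setting.
    C1, C2 = X @ X.T, Y @ Y.T
    p, q = ot.unif(n), ot.unif(n)

    # Entropic Gromov-Wasserstein plan between the empirical measures
    # (the regularization strength is an illustrative choice).
    plan = ot.gromov.entropic_gromov_wasserstein(C1, C2, p, q, 'square_loss',
                                                 epsilon=5.0)
    print(plan.shape, plan.sum())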
An Efficient Mini-batch Method via Partial Transportation
Nguyen, Khai, Nguyen, Dang, Pham, Tung, Ho, Nhat
Mini-batch optimal transport (m-OT) has been widely used recently to deal with the memory issue of OT in large-scale applications. Despite its practicality, m-OT suffers from misspecified mappings, namely, mappings that are optimal at the mini-batch level but do not exist in the optimal transportation plan between the original measures. To address this issue, we propose a novel mini-batch method that uses partial optimal transport (POT) between mini-batch empirical measures, which we refer to as mini-batch partial optimal transport (m-POT). Leveraging insights from partial transportation, we explain the source of misspecified mappings in m-OT and motivate why limiting the amount of mass transported among mini-batches via POT can alleviate the incorrect mappings. Finally, we carry out extensive experiments on various applications to compare m-POT with m-OT and a recently proposed mini-batch method, mini-batch unbalanced optimal transport (m-UOT). We observe that m-POT is better than m-OT in deep domain adaptation applications while having comparable performance with m-UOT. On other applications, such as deep generative models, gradient flows, and color transfer, m-POT yields more favorable performance than both m-OT and m-UOT.
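The following sketch illustrates the m-POT recipe with the POT library: draw mini-batches from each measure, compute a partial transport plan that moves only a fraction of the mini-batch mass, and average the resulting costs. The batch size, the transported mass fraction, and the aggregation by simple averaging are illustrative assumptions and need not match the paper's exact procedure.

    import numpy as np
    import ot  # POT: Python Optimal Transport

    def minibatch_partial_ot_cost(X, Y, batch_size=64, n_batches=50,
                                  mass=0.75, rng=None):
        # m-POT sketch: average the partial transport cost over random
        # mini-batch pairs, moving only a fraction `mass` of the mini-batch
        # mass so that spurious (misspecified) matches can be dropped.
        rng = rng or np.random.default_rng(0)
        a = np.full(batch_size, 1.0 / batch_size)
        b = np.full(batch_size, 1.0 / batch_size)
        total = 0.0
        for _ in range(n_batches):
            xb = X[rng.choice(len(X), batch_size, replace=False)]
            yb = Y[rng.choice(len(Y), batch_size, replace=False)]
            M = ot.dist(xb, yb)                               # squared Euclidean costs
            plan = ot.partial.partial_wasserstein(a, b, M, m=mass)
            total += np.sum(plan * M)
        return total / n_batches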
On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity
Le, Khang, Nguyen, Huy, Pham, Tung, Ho, Nhat
Recent advances in the computation of optimal transport (OT) [34, 11, 24, 12, 1] have brought new applications of optimal transport in machine learning and data science to the fore. Examples of these applications include generative models [2, 33, 16, 10], unsupervised learning [17, 18], computer vision [32, 25], and other applications [29, 27, 6]. However, due to the marginal constraints on transportation plans, optimal transport is only defined between balanced measures, namely, measures with equal masses. When measures are unbalanced, i.e., they can have different masses, there are two popular approaches for defining divergences between them. The first approach is unbalanced optimal transport [9, 8]. The main idea of unbalanced optimal transport is to regularize the objective function of optimal transport by penalizing certain divergences between the marginals of the transportation plan and the given measures. Despite its favorable computational complexity [28] and practical applications [31, 14, 19, 3], the optimal transportation plan from unbalanced optimal transport is often non-trivial to interpret in practice. The second approach for defining a divergence between unbalanced measures is partial optimal transport (POT) [5, 13], which was originally used to analyze partial differential equations.
On Robust Optimal Transport: Computational Complexity, Low-rank Approximation, and Barycenter Computation
Le, Khang, Nguyen, Huy, Nguyen, Quang, Ho, Nhat, Pham, Tung, Bui, Hung
Recent advances in computation for the optimal transport (OT) problem [12, 3, 13, 7, 20, 23, 17, 18] have led to a surge of interest in using this tool in various domains of machine learning and statistics. The range of its applications is broad, including deep generative models [4, 14, 32], scalable Bayes [29, 30], mixture and hierarchical models [21], and other applications [28, 25, 10, 15, 33, 31, 8]. The goal of optimal transport is to find a minimal cost of moving masses between (supports of) probability distributions. It is known that the estimation of the transport cost is not robust when there are outliers. To deal with this issue, [34] proposed a trimmed version of optimal transport. In particular, they search for truncated probability distributions such that the optimal transport cost between them is minimized. However, their trimmed optimal transport is non-trivial to compute, which hinders its usage in practical applications. Another line of work proposed using unbalanced optimal transport (UOT) to address the sensitivity of optimal transport to outliers [5, 26]. More specifically, the idea is to assign as little mass as possible to outliers by relaxing the marginal constraints of OT through a penalty function such as the KL divergence.
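The KL-relaxation idea in the last sentence can be seen in a few lines with the POT library: compared with balanced OT, the unbalanced Sinkhorn plan is free to send very little mass from an outlier point. The regularization values and the cost normalization below are illustrative choices.

    import numpy as np
    import ot  # POT: Python Optimal Transport

    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 2))
    X[0] += 20.0                           # make the first source point an outlier
    Y = rng.standard_normal((50, 2)) + 1.0

    a, b = ot.unif(50), ot.unif(50)
    M = ot.dist(X, Y)
    M = M / M.max()                        # normalize costs for numerical stability

    # Balanced OT must move the outlier's full mass; KL-relaxed UOT need not.
    plan_ot = ot.emd(a, b, M)
    plan_uot = ot.unbalanced.sinkhorn_unbalanced(a, b, M, reg=0.05, reg_m=1.0)

    print("mass sent from the outlier (OT): ", plan_ot[0].sum())
    print("mass sent from the outlier (UOT):", plan_uot[0].sum())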