AITopics | Nguyen, Van-Anh

Collaborating Authors

Nguyen, Van-Anh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Why Domain Generalization Fail? A View of Necessity and Sufficiency

Vuong, Long-Tung, Vo, Vy, Dang, Hien, Nguyen, Van-Anh, Do, Thanh-Toan, Harandi, Mehrtash, Le, Trung, Phung, Dinh

arXiv.org Machine LearningFeb-15-2025

Despite a strong theoretical foundation, empirical experiments reveal that existing domain generalization (DG) algorithms often fail to consistently outperform the ERM baseline. We argue that this issue arises because most DG studies focus on establishing theoretical guarantees for generalization under unrealistic assumptions, such as the availability of sufficient, diverse (or even infinite) domains or access to target domain knowledge. As a result, the extent to which domain generalization is achievable in scenarios with limited domains remains largely unexplored. This paper seeks to address this gap by examining generalization through the lens of the conditions necessary for its existence and learnability. Specifically, we systematically establish a set of necessary and sufficient conditions for generalization. Our analysis highlights that existing DG methods primarily act as regularization mechanisms focused on satisfying sufficient conditions, while often neglecting necessary ones. However, sufficient conditions cannot be verified in settings with limited training domains. In such cases, regularization targeting sufficient conditions aims to maximize the likelihood of generalization, whereas regularization targeting necessary conditions ensures its existence. Using this analysis, we reveal the shortcomings of existing DG algorithms by showing that, while they promote sufficient conditions, they inadvertently violate necessary conditions. To validate our theoretical insights, we propose a practical method that promotes the sufficient condition while maintaining the necessary conditions through a novel subspace representation alignment strategy. This approach highlights the advantages of preserving the necessary conditions on well-established DG benchmarks.

artificial intelligence, generalization, machine learning, (14 more...)

arXiv.org Machine Learning

2502.10716

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Agnostic Sharpness-Aware Minimization

Nguyen, Van-Anh, Tran, Quyen, Truong, Tuan, Do, Thanh-Toan, Phung, Dinh, Le, Trung

arXiv.org Artificial IntelligenceJun-11-2024

Sharpness-aware minimization (SAM) has been instrumental in improving deep neural network training by minimizing both the training loss and the sharpness of the loss landscape, leading the model into flatter minima that are associated with better generalization properties. In another aspect, Model-Agnostic Meta-Learning (MAML) is a framework designed to improve the adaptability of models. MAML optimizes a set of meta-models that are specifically tailored for quick adaptation to multiple tasks with minimal fine-tuning steps and can generalize well with limited data. In this work, we explore the connection between SAM and MAML, particularly in terms of enhancing model generalization. We introduce Agnostic-SAM, a novel approach that combines the principles of both SAM and MAML. Agnostic-SAM adapts the core idea of SAM by optimizing the model towards wider local minima using training data, while concurrently maintaining low loss values on validation data. By doing so, it seeks flatter minima that are not only robust to small perturbations but also less vulnerable to data distributional shift problems. Our experimental results demonstrate that Agnostic-SAM significantly improves generalization over baselines across a range of datasets and under challenging conditions such as noisy labels and data limitation.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2406.07107

Country:

North America > Canada (0.68)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Flat Seeking Bayesian Neural Networks

Nguyen, Van-Anh, Vuong, Tung-Long, Phan, Hoang, Do, Thanh-Toan, Phung, Dinh, Le, Trung

arXiv.org Artificial IntelligenceNov-6-2023

Bayesian Neural Networks (BNNs) provide a probabilistic interpretation for deep learning models by imposing a prior distribution over model parameters and inferring a posterior distribution based on observed data. The model sampled from the posterior distribution can be used for providing ensemble predictions and quantifying prediction uncertainty. It is well-known that deep learning models with lower sharpness have better generalization ability. However, existing posterior inferences are not aware of sharpness/flatness in terms of formulation, possibly leading to high sharpness for the models sampled from them. In this paper, we develop theories, the Bayesian setting, and the variational inference approach for the sharpness-aware posterior. Specifically, the models sampled from our sharpness-aware posterior, and the optimal approximate posterior estimating this sharpness-aware posterior, have better flatness, hence possibly possessing higher generalization ability. We conduct experiments by leveraging the sharpness-aware posterior with state-of-the-art Bayesian Neural Networks, showing that the flat-seeking counterparts outperform their baselines in all metrics of interest.

artificial intelligence, bayesian neural network, deep learning, (1 more...)

arXiv.org Artificial Intelligence

2302.02713

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Optimal Transport Model Distributional Robustness

Nguyen, Van-Anh, Le, Trung, Bui, Anh Tuan, Do, Thanh-Toan, Phung, Dinh

arXiv.org Artificial IntelligenceNov-1-2023

Distributional robustness is a promising framework for training deep learning models that are less vulnerable to adversarial examples and data distribution shifts. Previous works have mainly focused on exploiting distributional robustness in the data space. In this work, we explore an optimal transport-based distributional robustness framework in model spaces. Specifically, we examine a model distribution within a Wasserstein ball centered on a given model distribution that maximizes the loss. We have developed theories that enable us to learn the optimal robust center model distribution. Interestingly, our developed theories allow us to flexibly incorporate the concept of sharpness awareness into training, whether it's a single model, ensemble models, or Bayesian Neural Networks, by considering specific forms of the center model distribution. These forms include a Dirac delta distribution over a single model, a uniform distribution over several models, and a general Bayesian Neural Network. Furthermore, we demonstrate that Sharpness-Aware Minimization (SAM) is a specific case of our framework when using a Dirac delta distribution over a single model, while our framework can be seen as a probabilistic extension of SAM. To validate the effectiveness of our framework in the aforementioned settings, we conducted extensive experiments, and the results reveal remarkable improvements compared to the baselines.

artificial intelligence, deep learning, optimal transport model distributional robustness, (1 more...)

arXiv.org Artificial Intelligence

2306.04178

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback