
Collaborating Authors

 Nguyen, Thanh


Learning Code Preference via Synthetic Evolution

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have recently demonstrated remarkable coding capabilities. However, assessing code generation based on well-formed properties and aligning it with developer preferences remains challenging. In this paper, we explore two key questions under the new challenge of code preference learning: (i) How do we train models to predict meaningful preferences for code? and (ii) How do human and LLM preferences align with verifiable code properties and developer code tastes? Furthermore, we discover the prohibitive costs and limitations of human-based code preference: despite spending 23.4 person-minutes on each task, 15.1-40.3% of tasks remain unsolved. Compared to model-based preference, human preference tends to be more accurate under the objective of code correctness, while being sub-optimal for non-functional objectives.

Large Language Models (LLMs) for code (Chen et al., 2021; GitHub, 2023; Amazon Web Services, 2023) have become instrumental in modern software development. Code LLMs assist developers in various scenarios, from suggesting code completions and generating functional code based on user instructions to proposing complex code changes to resolve bug reports and feature requests. Instruction-tuned LLMs (Luo et al., 2024; Wei et al., 2024) are increasingly adept at generating functional code from natural language instructions. However, evaluating the quality of LLM-generated code remains challenging, particularly regarding code correctness, efficiency, security, adherence to best practices, and alignment with developer preferences. Effectively and efficiently assessing LLM-generated code against these properties is crucial both for evaluation (Liu et al., 2023b) and for preference optimization of code LLMs (Weyssow et al., 2024). Nevertheless, the subject of learning code preferences has been largely under-explored, motivating us to study code preferences systematically and to train code preference models with new data and modeling methods. Following the established format in LLM-as-a-judge (Chiang et al., 2024), we define the code preference task as follows: given a user query, a pair of candidate code responses, and optionally a preference criterion, code preference is demonstrated by choosing one response over the other. Code execution: alternatively, code preference can be confidently determined from execution statuses (Liu et al., 2023a). However, applying code execution to arbitrary programs poses challenges due to (i) setup complexity, (ii) code incompleteness, and (iii) execution overhead.
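Concretely, the pairwise setup above lends itself to standard preference modeling. Below is a minimal sketch of a Bradley-Terry-style preference loss over a pair of scored code responses; the scorer, embedding dimension, and all names are illustrative assumptions, not the paper's actual model.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CodePreferenceScorer(nn.Module):
    # Hypothetical head mapping an encoded (query, code) pair to a scalar score.
    def __init__(self, dim: int = 768):
        super().__init__()
        self.head = nn.Linear(dim, 1)

    def forward(self, pair_embedding: torch.Tensor) -> torch.Tensor:
        return self.head(pair_embedding).squeeze(-1)

def preference_loss(score_chosen, score_rejected):
    # Bradley-Terry: maximize P(chosen > rejected) = sigmoid(s_chosen - s_rejected).
    return -F.logsigmoid(score_chosen - score_rejected).mean()

# Random embeddings stand in for encoded (query, code-response) pairs.
scorer = CodePreferenceScorer()
emb_chosen, emb_rejected = torch.randn(8, 768), torch.randn(8, 768)
loss = preference_loss(scorer(emb_chosen), scorer(emb_rejected))
loss.backward()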


Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization

arXiv.org Artificial Intelligence

Recent studies reveal that reinforcement learning (RL) agents that perform well in training often lack resilience against adversarial perturbations during deployment. This highlights the importance of building a robust agent before deploying it in the real world. Most prior works focus on developing robust training-based procedures to tackle this problem, including enhancing the robustness of the deep neural network component itself or adversarially training the agent on strong attacks. In this work, we instead study an input transformation-based defense for RL. Specifically, we propose using a variant of vector quantization (VQ) to transform input observations, which reduces the space of adversarial attacks during testing and leaves the transformed observations less affected by attacks. Our method is computationally efficient and integrates seamlessly with adversarial training, further enhancing the robustness of RL agents against adversarial attacks. Through extensive experiments in multiple environments, we demonstrate that using VQ as the input transformation effectively defends against adversarial attacks on the agent's observations.
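To make the transformation concrete, here is a minimal sketch of vector-quantizing observations before they reach the policy, assuming a learned codebook with nearest-neighbor assignment and a straight-through gradient; the codebook size, dimensions, and names are illustrative, not the paper's implementation.

import torch

class ObservationVQ(torch.nn.Module):
    def __init__(self, num_codes: int = 64, obs_dim: int = 16):
        super().__init__()
        self.codebook = torch.nn.Parameter(torch.randn(num_codes, obs_dim))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Snap each observation to its nearest codebook vector, collapsing
        # small (possibly adversarial) perturbations into the same code.
        dists = torch.cdist(obs, self.codebook)        # (batch, num_codes)
        quantized = self.codebook[dists.argmin(dim=-1)]
        # Straight-through estimator so gradients still reach upstream layers.
        return obs + (quantized - obs).detach()

vq = ObservationVQ()
clean = torch.randn(4, 16)
attacked = clean + 0.01 * torch.randn_like(clean)      # small perturbation
# Both inputs frequently land on the same codes, neutralizing the perturbation.
same = torch.equal(vq(clean).detach(), vq(attacked).detach())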


Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses

arXiv.org Artificial Intelligence

Offline reinforcement learning (RL) addresses the challenge of expensive and high-risk data exploration inherent in RL by pre-training policies on vast amounts of offline data, enabling direct deployment or fine-tuning in real-world environments. However, this training paradigm can compromise policy robustness, leading to degraded performance in practical conditions due to observation perturbations or intentional attacks. While adversarial attacks and defenses have been extensively studied in deep learning, their application in offline RL is limited. This paper proposes a framework to enhance the robustness of offline RL models by leveraging advanced adversarial attacks and defenses. The framework attacks the actor and critic components by perturbing observations during training, and uses adversarial defenses as regularization to enhance the learned policy. Four attacks and two defenses are introduced and evaluated on the D4RL benchmark. The results show the vulnerability of both the actor and the critic to attacks and the effectiveness of the defenses in improving policy robustness. This framework holds promise for enhancing the reliability of offline RL models in practical scenarios.
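As a concrete illustration of the attack-plus-defense recipe, the sketch below perturbs observations with an FGSM-style step against a toy critic and uses the clean-versus-attacked value gap as a regularizer; the networks, epsilon, and function names are assumptions for illustration, not the paper's four attacks or two defenses.

import torch

class Critic(torch.nn.Module):
    # Toy Q(s, a) network standing in for the offline RL critic.
    def __init__(self, obs_dim: int = 8, act_dim: int = 2):
        super().__init__()
        self.net = torch.nn.Linear(obs_dim + act_dim, 1)

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def critic_attack(critic, actor, obs, epsilon=0.05):
    # FGSM-style: move obs within an L-inf ball to lower the estimated return.
    obs_adv = obs.clone().detach().requires_grad_(True)
    critic(obs_adv, actor(obs_adv)).mean().backward()
    return (obs - epsilon * obs_adv.grad.sign()).detach()

def defense_regularizer(critic, actor, obs, epsilon=0.05):
    # Penalize the value gap between clean and attacked observations.
    obs_adv = critic_attack(critic, actor, obs, epsilon)
    return (critic(obs, actor(obs)) - critic(obs_adv, actor(obs_adv))).pow(2).mean()

actor, critic = torch.nn.Linear(8, 2), Critic()
reg = defense_regularizer(critic, actor, torch.randn(16, 8))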


DimCL: Dimensional Contrastive Learning For Improving Self-Supervised Learning

arXiv.org Artificial Intelligence

Self-supervised learning (SSL) has seen remarkable success, for which contrastive learning (CL) plays a key role. However, recently developed non-CL frameworks have achieved comparable or better performance, with high potential for further improvement, prompting researchers to enhance these frameworks. Assimilating CL into non-CL frameworks has been thought to be beneficial, but empirical evidence indicates no visible improvements. In view of that, this paper proposes Dimensional Contrastive Learning (DimCL), a strategy that performs CL along the dimension direction rather than along the batch direction used in conventional contrastive learning. DimCL aims to enhance feature diversity, and it can serve as a regularizer for prior SSL frameworks. DimCL is found to be effective, and its hardness-aware property is identified as a critical reason for its success. Extensive experimental results reveal that assimilating DimCL into SSL frameworks improves performance by a non-trivial margin on various datasets and backbone architectures.
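The core change is small: treat feature dimensions, rather than batch samples, as the contrasted units. A minimal sketch under that reading, with the temperature and sizes as illustrative assumptions:

import torch
import torch.nn.functional as F

def dim_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1):
    # z1, z2: (batch, dim) features from two augmented views of the same batch.
    d1 = F.normalize(z1.t(), dim=-1)    # (dim, batch): one row per dimension
    d2 = F.normalize(z2.t(), dim=-1)
    logits = d1 @ d2.t() / tau          # similarities between dimensions
    targets = torch.arange(d1.size(0))  # dimension i should match dimension i
    return F.cross_entropy(logits, targets)

# Added as a regularizer on top of an existing SSL objective.
loss = dim_contrastive_loss(torch.randn(256, 128), torch.randn(256, 128))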


Regret-Based Optimization for Robust Reinforcement Learning

arXiv.org Artificial Intelligence

Deep Reinforcement Learning (DRL) policies have been shown to be vulnerable to small adversarial noise in observations. Such adversarial noise can have disastrous consequences in safety-critical environments. For instance, a self-driving car receiving adversarially perturbed sensory observations about nearby signs (e.g., a stop sign physically altered to be perceived as a speed limit sign) or objects (e.g., cars altered to be recognized as trees) can be fatal. Existing approaches for making RL algorithms robust to an observation-perturbing adversary have focused on reactive approaches that iteratively improve against adversarial examples generated at each iteration. While such approaches have been shown to improve over regular RL methods, they are reactive and can fare significantly worse if certain categories of adversarial examples are not generated during training. To that end, we pursue a more proactive approach that relies on directly optimizing a well-studied robustness measure, regret, instead of expected value. We provide a principled approach that minimizes the maximum regret over a "neighborhood" of the received observation. Our regret criterion can be used to modify existing value- and policy-based deep RL methods. We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust deep RL.
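A minimal sketch of a sampled regret objective, assuming an L-inf neighborhood and a toy Q-network; the sampling scheme, radius, and names are illustrative rather than the paper's exact formulation:

import torch

def max_regret(q_net, obs, radius=0.1, n_samples=16):
    # Sample perturbed observations in an L-inf ball around the received obs.
    noise = (torch.rand(n_samples, *obs.shape) * 2 - 1) * radius
    q = q_net(obs.unsqueeze(0) + noise)           # (n_samples, n_actions)
    chosen = q_net(obs).argmax()                  # action taken for the received obs
    regret = q.max(dim=-1).values - q[:, chosen]  # per-neighbor regret of that action
    return regret.max()                           # worst case over the neighborhood

q_net = torch.nn.Linear(4, 3)                     # toy Q-network
loss = max_regret(q_net, torch.randn(4))          # minimized during training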


Beyond NaN: Resiliency of Optimization Layers in The Face of Infeasibility

arXiv.org Artificial Intelligence

Prior work has successfully incorporated optimization layers as the last layer in neural networks for various problems, thereby allowing joint learning and planning in one neural network forward pass. In this work, we identify a weakness in such a setup whereby certain inputs to the optimization layer lead to undefined output of the neural network. Such undefined decision outputs can lead to catastrophic outcomes in critical real-time applications. We show that an adversary can cause such failures by forcing rank deficiency on the matrix fed to the optimization layer, which results in the optimization failing to produce a solution. We provide a defense for these failure cases by controlling the condition number of the input matrix. We study the problem in the settings of synthetic data, Jigsaw Sudoku, and speed planning for autonomous driving, building on top of prior frameworks in end-to-end learning and optimization. We show that our proposed defense effectively prevents the framework from failing with undefined output. Finally, we surface a number of edge cases which lead to serious bugs in popular equation and optimization solvers and which can likewise be abused.
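The defense can be pictured as flooring singular values so the condition number of the input matrix stays bounded. A minimal sketch, with the threshold as an illustrative assumption:

import torch

def condition_clamp(A: torch.Tensor, max_cond: float = 1e4) -> torch.Tensor:
    # Floor the singular values so cond(A) = s_max / s_min <= max_cond.
    U, S, Vh = torch.linalg.svd(A)
    S_safe = S.clamp(min=S.max() / max_cond)
    return U @ torch.diag(S_safe) @ Vh

A = torch.randn(5, 5)
A[:, 0] = A[:, 1]                   # adversarially forced rank deficiency
A_safe = condition_clamp(A)         # well-conditioned input for the solver
cond = torch.linalg.cond(A_safe)    # now bounded by max_cond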


Robust MAML: Prioritization task buffer with adaptive learning process for model-agnostic meta-learning

arXiv.org Artificial Intelligence

Model-agnostic meta-learning (MAML) is a popular state-of-the-art meta-learning algorithm that provides good weight initialization of a model given a variety of learning tasks. A model initialized with these weights can be fine-tuned to an unseen task using only a small number of samples and a few adaptation steps. MAML is simple and versatile but requires costly learning-rate tuning and careful design of the task distribution, which affects its scalability and generalization. This paper proposes a more robust MAML, referred to as Robust MAML (RMAML), based on an adaptive learning scheme and a prioritization task buffer (PTB), for improving the scalability of the training process and alleviating the problem of distribution mismatch. RMAML uses gradient-based hyper-parameter optimization to automatically find the optimal learning rate and uses the PTB to gradually adjust the training task distribution toward the testing task distribution over the course of training. Experimental results on meta-reinforcement-learning environments demonstrate a substantial performance gain, as well as reduced sensitivity to hyper-parameter choices and robustness to distribution mismatch.
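A minimal sketch of a prioritization task buffer, where tasks are drawn in proportion to a priority such as post-adaptation loss; all names and the sampling rule are illustrative assumptions, not the paper's exact mechanism:

import random

class PrioritizedTaskBuffer:
    def __init__(self):
        self.tasks, self.priorities = [], []

    def add(self, task, priority: float = 1.0):
        self.tasks.append(task)
        self.priorities.append(priority)

    def sample(self, k: int):
        # Higher-priority (e.g., harder, more test-like) tasks are drawn more often.
        return random.choices(self.tasks, weights=self.priorities, k=k)

    def update(self, task, new_priority: float):
        self.priorities[self.tasks.index(task)] = new_priority

buffer = PrioritizedTaskBuffer()
for goal in (0.1, 0.5, 0.9):
    buffer.add(("reach_goal", goal))
batch = buffer.sample(k=2)   # feeds the meta-training loop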


Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model

arXiv.org Artificial Intelligence

Developing a reinforcement learning (RL) agent capable of performing complex control tasks directly from high-dimensional observations such as raw pixels remains a challenge, with ongoing efforts to improve sample efficiency and generalization. This paper proposes a learning framework, the Curiosity Contrastive Forward Dynamics Model (CCFDM), for achieving more sample-efficient RL directly from raw pixels. CCFDM incorporates a forward dynamics model (FDM) and performs contrastive learning to train its deep convolutional neural network-based image encoder (IE) to extract spatial and temporal information conducive to sample-efficient RL. In addition, during training, CCFDM provides intrinsic rewards based on the FDM prediction error, encouraging the curiosity of the RL agent and improving exploration. The diverse and less repetitive observations provided by both our exploration strategy and the data augmentation available in contrastive learning improve not only sample efficiency but also generalization. Existing model-free RL methods such as Soft Actor-Critic, when built on top of CCFDM, outperform prior state-of-the-art pixel-based RL methods on the DeepMind Control Suite benchmark.
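The curiosity signal reduces to a simple quantity: the FDM's prediction error in latent space. A minimal sketch, with the encoder, FDM, and reward scale as illustrative stand-ins for the paper's architecture:

import torch

encoder = torch.nn.Linear(32, 8)   # stands in for the convolutional image encoder (IE)
fdm = torch.nn.Linear(8 + 2, 8)    # predicts the next latent from (latent, action)

def intrinsic_reward(obs, action, next_obs, scale=0.1):
    z, z_next = encoder(obs), encoder(next_obs)
    z_pred = fdm(torch.cat([z, action], dim=-1))
    # Poorly predicted transitions are "novel" and earn a larger exploration bonus.
    return scale * (z_pred - z_next).pow(2).mean(dim=-1)

r_int = intrinsic_reward(torch.randn(4, 32), torch.randn(4, 2), torch.randn(4, 32))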


Bayesian Optimization with Unknown Search Space

arXiv.org Machine Learning

Applying Bayesian optimization in problems where the search space is unknown is challenging. To address this problem, we propose a systematic volume expansion strategy for Bayesian optimization. We devise a strategy guaranteeing that, through iterative expansions of the search space, our method can find a point whose function value is within epsilon of the objective function's maximum. Without requiring any parameters to be specified, our algorithm automatically triggers the minimal required expansion at each iteration. We derive analytic expressions for when to trigger the expansion and by how much to expand. We also provide theoretical analysis showing that our method achieves epsilon-accuracy after a finite number of iterations. We demonstrate our method on both benchmark test functions and machine learning hyper-parameter tuning tasks, and show that it outperforms baselines.
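A minimal sketch of the expansion trigger, assuming a simple heuristic (grow the box when the incumbent best point sits near a boundary); the paper derives the trigger and the expansion amount analytically, so the margin and factor below are illustrative only:

import numpy as np

def maybe_expand(bounds: np.ndarray, best_x: np.ndarray,
                 margin: float = 0.05, factor: float = 1.5) -> np.ndarray:
    # bounds: (d, 2) array of [low, high] per dimension; best_x: incumbent optimum.
    lo, hi = bounds[:, 0], bounds[:, 1]
    width = hi - lo
    near_edge = (best_x - lo < margin * width) | (hi - best_x < margin * width)
    growth = (factor - 1.0) * width / 2.0
    lo = np.where(near_edge, lo - growth, lo)
    hi = np.where(near_edge, hi + growth, hi)
    return np.stack([lo, hi], axis=1)

bounds = np.array([[0.0, 1.0], [0.0, 1.0]])
bounds = maybe_expand(bounds, best_x=np.array([0.99, 0.5]))  # expands dim 0 only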


BUZz: BUffer Zones for defending adversarial examples in image classification

arXiv.org Machine Learning

Phuong Ha Nguyen (1), Kaleel Mahmood (1), Lam M. Nguyen (2), Thanh Nguyen (3), Marten van Dijk (1,4). (1) Department of Electrical and Computer Engineering, University of Connecticut, USA; (2) IBM Research, Thomas J. Watson Research Center, Yorktown Heights, USA; (3) Iowa State University, USA; (4) CWI Amsterdam, The Netherlands.

We propose a novel defense against all existing gradient-based adversarial attacks on deep neural networks for image classification problems. Our defense is based on a combination of deep neural networks and simple image transformations. While straightforward to implement, this defense yields a unique security property which we term buffer zones. We argue that our defense based on buffer zones is secure against state-of-the-art black-box attacks. We are able to achieve this security even when the adversary has access to the entire ...
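One way to read the buffer-zone construction is as an ensemble of classifiers placed behind different fixed input transformations, with a rejection output when they disagree. A minimal sketch under that reading; the transformations, vote rule, and names are illustrative assumptions, not the paper's exact scheme:

import torch

def buffer_zone_predict(models, transforms, x):
    # Each model sees a differently transformed input; stack their votes.
    votes = torch.stack([m(t(x)).argmax(dim=-1) for m, t in zip(models, transforms)])
    agree = (votes == votes[0]).all(dim=0)
    # Inputs pushed into the disagreement "buffer zone" are rejected (-1).
    return torch.where(agree, votes[0], torch.full_like(votes[0], -1))

models = [torch.nn.Linear(16, 10) for _ in range(3)]
transforms = [lambda x, s=s: x + s for s in (0.0, 0.5, 1.0)]  # stand-in image shifts
pred = buffer_zone_predict(models, transforms, torch.randn(4, 16))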