AITopics | Nguyen, Khang

Collaborating Authors

Nguyen, Khang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

FlowMP: Learning Motion Fields for Robot Planning with Conditional Flow Matching

Nguyen, Khang, Le, An T., Pham, Tien, Huber, Manfred, Peters, Jan, Vu, Minh Nhat

arXiv.org Artificial IntelligenceMar-8-2025

Prior flow matching methods in robotics have primarily learned velocity fields to morph one distribution of trajectories into another. In this work, we extend flow matching to capture second-order trajectory dynamics, incorporating acceleration effects either explicitly in the model or implicitly through the learning objective. Unlike diffusion models, which rely on a noisy forward process and iterative denoising steps, flow matching trains a continuous transformation (flow) that directly maps a simple prior distribution to the target trajectory distribution without any denoising procedure. By modeling trajectories with second-order dynamics, our approach ensures that generated robot motions are smooth and physically executable, avoiding the jerky or dynamically infeasible trajectories that first-order models might produce. We empirically demonstrate that this second-order conditional flow matching yields superior performance on motion planning benchmarks, achieving smoother trajectories and higher success rates than baseline planners. These findings highlight the advantage of learning acceleration-aware motion fields, as our method outperforms existing motion planning methods in terms of trajectory quality and planning success.

artificial intelligence, machine learning, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2503.06135

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Volumetric Mapping with Panoptic Refinement via Kernel Density Estimation for Mobile Robots

Nguyen, Khang, Dang, Tuan, Huber, Manfred

arXiv.org Artificial IntelligenceDec-15-2024

Reconstructing three-dimensional (3D) scenes with semantic understanding is vital in many robotic applications. Robots need to identify which objects, along with their positions and shapes, to manipulate them precisely with given tasks. Mobile robots, especially, usually use lightweight networks to segment objects on RGB images and then localize them via depth maps; however, they often encounter out-of-distribution scenarios where masks over-cover the objects. In this paper, we address the problem of panoptic segmentation quality in 3D scene reconstruction by refining segmentation errors using non-parametric statistical methods. To enhance mask precision, we map the predicted masks into a depth frame to estimate their distribution via kernel densities. The outliers in depth perception are then rejected without the need for additional parameters in an adaptive manner to out-of-distribution scenarios, followed by 3D reconstruction using projective signed distance functions (SDFs). We validate our method on a synthetic dataset, which shows improvements in both quantitative and qualitative results for panoptic mapping. Through real-world testing, the results furthermore show our method's capability to be deployed on a real-robot system. Our source code is available at: https://github.com/mkhangg/refined panoptic mapping.

artificial intelligence, machine learning, refinement, (18 more...)

arXiv.org Artificial Intelligence

2412.11241

Country:

North America > United States > Texas (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.61)

Add feedback

V3D-SLAM: Robust RGB-D SLAM in Dynamic Environments with 3D Semantic Geometry Voting

Dang, Tuan, Nguyen, Khang, Huber, Mandfred

arXiv.org Artificial IntelligenceOct-15-2024

Simultaneous localization and mapping (SLAM) in highly dynamic environments is challenging due to the correlation complexity between moving objects and the camera pose. Many methods have been proposed to deal with this problem; however, the moving properties of dynamic objects with a moving camera remain unclear. Therefore, to improve SLAM's performance, minimizing disruptive events of moving objects with a physical understanding of 3D shapes and dynamics of objects is needed. In this paper, we propose a robust method, V3D-SLAM, to remove moving objects via two lightweight re-evaluation stages, including identifying potentially moving and static objects using a spatial-reasoned Hough voting mechanism and refining static objects by detecting dynamic noise caused by intra-object motions using Chamfer distances as similarity measurements. Our experiment on the TUM RGB-D benchmark on dynamic sequences with ground-truth camera trajectories showed that our methods outperform the most recent state-of-the-art SLAM methods. Our source code is available at https://github.com/tuantdang/v3d-slam.

artificial intelligence, dynamic environment, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.12068

Country: North America > United States > Texas (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Beyond the Visible: Jointly Attending to Spectral and Spatial Dimensions with HSI-Diffusion for the FINCH Spacecraft

Vyse, Ian, Dagli, Rishit, Chadha, Dav Vrat, Ma, John P., Chen, Hector, Ruparelia, Isha, Seran, Prithvi, Xie, Matthew, Aamer, Eesa, Armstrong, Aidan, Black, Naveen, Borstein, Ben, Caldwell, Kevin, Dahanaggamaarachchi, Orrin, Dai, Joe, Fatima, Abeer, Lu, Stephanie, Michet, Maxime, Paul, Anoushka, Po, Carrie Ann, Prakash, Shivesh, Prosser, Noa, Roy, Riddhiman, Shinjo, Mirai, Shofman, Iliya, Silayan, Coby, Sox-Harris, Reid, Zheng, Shuhan, Nguyen, Khang

arXiv.org Artificial IntelligenceJun-15-2024

Satellite remote sensing missions have gained popularity over the past fifteen years due to their ability to cover large swaths of land at regular intervals, making them ideal for monitoring environmental trends. The FINCH mission, a 3U+ CubeSat equipped with a hyperspectral camera, aims to monitor crop residue cover in agricultural fields. Although hyperspectral imaging captures both spectral and spatial information, it is prone to various types of noise, including random noise, stripe noise, and dead pixels. Effective denoising of these images is crucial for downstream scientific tasks. Traditional methods, including hand-crafted techniques encoding strong priors, learned 2D image denoising methods applied across different hyperspectral bands, or diffusion generative models applied independently on bands, often struggle with varying noise strengths across spectral bands, leading to significant spectral distortion. This paper presents a novel approach to hyperspectral image denoising using latent diffusion models that integrate spatial and spectral information. We particularly do so by building a 3D diffusion model and presenting a 3-stage training approach on real and synthetically crafted datasets. The proposed method preserves image structure while reducing noise. Evaluations on both popular hyperspectral denoising datasets and synthetically crafted datasets for the FINCH mission demonstrate the effectiveness of this approach.

artificial intelligence, hyperspectral image, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2406.10724

Country: North America > Canada > Ontario > Toronto (0.16)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

ExtPerFC: An Efficient 2D and 3D Perception Hardware-Software Framework for Mobile Cobot

Dang, Tuan, Nguyen, Khang, Huber, Manfred

arXiv.org Artificial IntelligenceJun-7-2023

As the reliability of the robot's perception correlates with the number of integrated sensing modalities to tackle uncertainty, a practical solution to manage these sensors from different computers, operate them simultaneously, and maintain their real-time performance on the existing robotic system with minimal effort is needed. In this work, we present an end-to-end software-hardware framework, namely ExtPerFC, that supports both conventional hardware and software components and integrates machine learning object detectors without requiring an additional dedicated graphic processor unit (GPU). We first design our framework to achieve real-time performance on the existing robotic system, guarantee configuration optimization, and concentrate on code reusability. We then mathematically model and utilize our transfer learning strategies for 2D object detection and fuse them into depth images for 3D depth estimation. Lastly, we systematically test the proposed framework on the Baxter robot with two 7-DOF arms, a four-wheel mobility base, and an Intel RealSense D435i RGB-D camera. The results show that the robot achieves real-time performance while executing other tasks (e.g., map building, localization, navigation, object detection, arm moving, and grasping) simultaneously with available hardware like Intel onboard CPUS/GPUs on distributed computers. Also, to comprehensively control, program, and monitor the robot system, we design and introduce an end-user application. The source code is available at https://github.com/tuantdang/perception_framework.

artificial intelligence, configuration, robot, (17 more...)

arXiv.org Artificial Intelligence

2306.04853

Country: Europe > Netherlands (0.14)

Genre: Research Report (0.70)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature

Nguyen, Khang, Nong, Hieu, Nguyen, Vinh, Ho, Nhat, Osher, Stanley, Nguyen, Tan

arXiv.org Artificial IntelligenceMay-31-2023

Graph Neural Networks (GNNs) had been demonstrated to be inherently susceptible to the problems of over-smoothing and over-squashing. These issues prohibit the ability of GNNs to model complex graph interactions by limiting their effectiveness in taking into account distant information. Our study reveals the key connection between the local graph geometry and the occurrence of both of these issues, thereby providing a unified framework for studying them at a local scale using the Ollivier-Ricci curvature. Specifically, we demonstrate that over-smoothing is linked to positive graph curvature while over-squashing is linked to negative graph curvature. Based on our theory, we propose the Batch Ollivier-Ricci Flow, a novel rewiring algorithm capable of simultaneously addressing both over-smoothing and over-squashing.

artificial intelligence, curvature, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.15779

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

UIT-OpenViIC: A Novel Benchmark for Evaluating Image Captioning in Vietnamese

Bui, Doanh C., Nguyen, Nghia Hieu, Nguyen, Khang

arXiv.org Artificial IntelligenceMay-9-2023

Image Captioning is one of the vision-language tasks that still interest the research community worldwide in the 2020s. MS-COCO Caption benchmark is commonly used to evaluate the performance of advanced captioning models, although it was published in 2015. Recent captioning models trained on the MS-COCO Caption dataset only have good performance in language patterns of English; they do not have such good performance in contexts captured in Vietnam or fluently caption images using Vietnamese. To contribute to the low-resources research community as in Vietnam, we introduce a novel image captioning dataset in Vietnamese, the Open-domain Vietnamese Image Captioning dataset (UIT-OpenViIC). The introduced dataset includes complex scenes captured in Vietnam and manually annotated by Vietnamese under strict rules and supervision. In this paper, we present in more detail the dataset creation process. From preliminary analysis, we show that our dataset is challenging to recent state-of-the-art (SOTA) Transformer-based baselines, which performed well on the MS COCO dataset. Then, the modest results prove that UIT-OpenViIC has room to grow, which can be one of the standard benchmarks in Vietnamese for the research community to evaluate their captioning models. Furthermore, we present a CAMO approach that effectively enhances the image representation ability by a multi-level encoder output fusion mechanism, which helps improve the quality of generated captions compared to previous captioning models.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2305.04166

Country:

Asia > Vietnam (0.75)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)

Add feedback

Face Swapping as A Simple Arithmetic Operation

Vu, Truong, Do, Kien, Nguyen, Khang, Than, Khoat

arXiv.org Artificial IntelligenceFeb-3-2023

We propose a novel high-fidelity face swapping method called "Arithmetic Face Swapping" (AFS) that explicitly disentangles the intermediate latent space W+ of a pretrained StyleGAN into the "identity" and "style" subspaces so that a latent code in W+ is the sum of an "identity" code and a "style" code in the corresponding subspaces. Via our disentanglement, face swapping (FS) can be regarded as a simple arithmetic operation in W+, i.e., the summation of a source "identity" code and a target "style" code. This makes AFS more intuitive and elegant than other FS methods. In addition, our method can generalize over the standard face swapping to support other interesting operations, e.g., combining the identity of one source with styles of multiple targets and vice versa. We implement our identity-style disentanglement by learning a neural network that maps a latent code to a "style" code. We provide a condition for this network which theoretically guarantees identity preservation of the source face even after a sequence of face swapping operations. Extensive experiments demonstrate the advantage of our method over state-of-the-art FS methods in producing high-quality swapped faces. Our source code was made public at https://github.com/truongvu2000nd/AFS

artificial intelligence, latent code, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2211.10812

Country: Asia > Vietnam (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.69)

Add feedback