AITopics | Pham, Lam

Collaborating Authors

Pham, Lam

Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models

Pham, Lam, Lam, Phat, Nguyen, Truong, Nguyen, Huyen, Schindler, Alexander

arXiv.org Artificial Intelligence, Jul-1-2024

In this paper, we propose a deep learning based system for the task of deepfake audio detection. In particular, the raw input audio is first transformed into various spectrograms using three transformation methods, Short-time Fourier Transform (STFT), Constant-Q Transform (CQT), and Wavelet Transform (WT), combined with different auditory-based filters: Mel, Gammatone, linear filters (LF), and discrete cosine transform (DCT). Given the spectrograms, we evaluate a wide range of classification models based on three deep learning approaches. The first approach is to train our proposed baseline models directly on the spectrograms: a CNN-based model (CNN-baseline), an RNN-based model (RNN-baseline), and a C-RNN model (C-RNN baseline). The second approach is transfer learning from computer vision models such as ResNet-18, MobileNet-V3, EfficientNet-B0, DenseNet-121, ShuffleNet-V2, Swin-T, ConvNeXt-Tiny, GoogLeNet, MNASNet, and RegNet. In the third approach, we leverage the state-of-the-art pre-trained audio models Whisper, Seamless, SpeechBrain, and Pyannote to extract audio embeddings from the input spectrograms. The audio embeddings are then fed to a multilayer perceptron (MLP) model that classifies each audio sample as fake or real. Finally, high-performance deep learning models from these approaches are fused to achieve the best performance. We evaluated our proposed models on the ASVspoof 2019 benchmark dataset. Our best ensemble model achieved an Equal Error Rate (EER) of 0.03, which is highly competitive with top-performing systems in the ASVspoof 2019 challenge. Experimental results also highlight the potential of selective spectrograms and deep learning approaches to enhance the task of audio deepfake detection.
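
The abstract reports its headline result as an Equal Error Rate. As a rough illustration of that metric only (not the authors' evaluation code), the sketch below computes an EER from a list of scores by sweeping thresholds. The function name `equal_error_rate`, the label convention (1 = real, 0 = fake), and the "higher score means more likely real" convention are assumptions made for this example.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumed conventions (not from the paper):
    labels: 1 = real (bona fide) audio, 0 = fake (spoofed) audio.
    scores: higher means "more likely real".
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    thresholds = np.unique(scores)
    # Also try a threshold above every score, i.e. "reject everything".
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)

    real = scores[labels == 1]
    fake = scores[labels == 0]

    # A sample is accepted as real when its score is >= the threshold.
    far = np.array([(fake >= t).mean() for t in thresholds])  # fakes accepted
    frr = np.array([(real < t).mean() for t in thresholds])   # reals rejected

    # The EER sits where both error rates are (closest to) equal.
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0)
```

With perfectly separated scores the function returns 0; the more the real and fake score distributions overlap, the higher the value.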

artificial intelligence, machine learning, spectrogram, (14 more...)

arXiv.org Artificial Intelligence

2407.01777

Country:

Asia (0.28)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach (A use case of riot or violent context detection)

Pham, Lam, Lam, Phat, Nguyen, Tin, Tang, Hieu, Schindler, Alexander

arXiv.org Artificial Intelligence, May-2-2024

In this paper, we present a toolchain for comprehensive audio/video analysis that leverages a deep learning based multimodal approach. To this end, the specific tasks of Speech to Text (S2T), Acoustic Scene Classification (ASC), Acoustic Event Detection (AED), Visual Object Detection (VOD), Image Captioning (IC), and Video Captioning (VC) are conducted and integrated into the toolchain. By combining individual tasks and analyzing both audio and visual data extracted from the input video, the toolchain offers various audio/video-based applications: two general applications, audio/video clustering and comprehensive audio/video summary, and a specific application of riot or violent context detection. Furthermore, the toolchain presents a flexible and adaptable architecture into which new models can be integrated for further audio/video-based applications.

artificial intelligence, detection, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2407.0311

Country:

Europe > Austria > Vienna (0.15)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LSTM-based Deep Neural Network With A Focus on Sentence Representation for Sequential Sentence Classification in Medical Scientific Abstracts

Lam, Phat, Pham, Lam, Nguyen, Tin, Tang, Hieu, Michael, Seidl, Schindler, Alexander

arXiv.org Artificial Intelligence, Jan-28-2024

The Sequential Sentence Classification (SSC) task in the domain of medical abstracts involves categorizing sentences into pre-defined headings based on their roles in conveying critical information in the abstract. In the SSC task, sentences are often sequentially related to each other. For this reason, sentence embedding is crucial for capturing both the semantic information between words in the sentence and the contextual relationship of sentences within the abstract, providing a comprehensive representation for better classification. In this paper, we present a hierarchical deep learning model for the SSC task. First, we propose an LSTM-based network with multiple feature branches to create well-presented sentence embeddings at the sentence level. To model the sequence of sentences, a convolutional-recurrent neural network (C-RNN) at the abstract level and a multi-layer perceptron network (MLP) at the segment level are developed, which further enhance the model performance. Additionally, an ablation study is conducted to evaluate the contribution of each individual component of the network to the model performance at different levels. Our proposed system is very competitive with the state-of-the-art systems and further improves the F1 scores of the baseline by 1.0%, 2.8%, and 2.6% on the benchmark datasets PubMed 200K RCT, PubMed 20K RCT, and NICTA-PIBOSO, respectively.

artificial intelligence, classification, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2401.15854

Country: Oceania > Australia (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Landslide Detection and Segmentation Using Remote Sensing Images and Deep Neural Network

Le, Cam, Pham, Lam, Lampert, Jasmin, Schlögl, Matthias, Schindler, Alexander

arXiv.org Artificial Intelligence, Dec-27-2023

Knowledge about historic landslide event occurrence is important for supporting disaster risk reduction strategies. Building upon findings from the 2022 Landslide4Sense Competition, we propose a deep neural network based system for landslide detection and segmentation from multisource remote sensing image input. We use a U-Net trained with Cross Entropy loss as the baseline model. We then improve the U-Net baseline model by leveraging a wide range of deep learning techniques. In particular, we conduct feature engineering by generating new band data from the original bands, which helps to enhance the quality of the remote sensing image input. Regarding the network architecture, we replace the traditional convolutional layers in the U-Net baseline with residual-convolutional layers. We also propose an attention layer which leverages the multi-head attention scheme. Additionally, we generate multiple output masks at three different resolutions, which creates an ensemble of three outputs in the inference process to enhance the performance. Finally, we propose a combined loss function which leverages Focal loss and IoU loss to train the network. Our experiments on the development set of the Landslide4Sense challenge achieve an F1 score and an mIoU score of 84.07 and 76.07, respectively. Our best model setup outperforms the challenge baseline and the proposed U-Net baseline, improving the F1 score/mIoU score by 6.8/7.4 and 10.5/8.8, respectively.
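
The abstract mentions a combined loss built from Focal loss and IoU loss but does not say how the two terms are weighted. The NumPy sketch below is one plausible form for a binary segmentation mask, shown only to make the idea concrete. The weighting factor `alpha`, the focusing parameter `gamma`, and all function names are assumptions for this example, not values taken from the paper.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Binary focal loss: cross entropy down-weighted for easy pixels."""
    p = np.clip(p, eps, 1.0 - eps)
    pt = np.where(y == 1, p, 1.0 - p)  # probability given to the true class
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt)))

def soft_iou_loss(p, y, eps=1e-7):
    """1 - soft IoU, using predicted probabilities instead of hard masks."""
    inter = np.sum(p * y)
    union = np.sum(p) + np.sum(y) - inter
    return float(1.0 - (inter + eps) / (union + eps))

def combined_loss(p, y, alpha=0.5):
    """Weighted sum of both terms; alpha = 0.5 is an assumed default."""
    return alpha * focal_loss(p, y) + (1.0 - alpha) * soft_iou_loss(p, y)

# Toy 2x2 mask: 1 = landslide pixel, 0 = background.
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
good_pred = np.array([[0.9, 0.1], [0.1, 0.9]])
bad_pred = 1.0 - good_pred
```

On this toy mask, `combined_loss(good_pred, mask)` is smaller than `combined_loss(bad_pred, mask)`: the focal term penalizes confidently wrong pixels, while the IoU term penalizes poor overlap of the predicted region as a whole.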

artificial intelligence, landslide detection, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2312.16717

Country:

Europe > Austria > Vienna (0.14)
Asia > Japan > Honshū (0.14)

Genre: Research Report (0.40)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robust-MBDL: A Robust Multi-branch Deep Learning Based Model for Remaining Useful Life Prediction and Operational Condition Identification of Rotating Machines

Tran, Khoa, Vu, Hai-Canh, Pham, Lam, Boudaoud, Nassim

arXiv.org Artificial Intelligence, Dec-14-2023

The prediction of remaining useful life (RUL) has garnered significant attention from both academic researchers and industry professionals. This is because accurately predicting RUL can significantly enhance the effectiveness of predictive maintenance, leading to increased machine reliability and fewer failures and associated repair costs. Existing RUL prediction models generally fall into two primary categories: model-based and data-driven approaches [8]. The model-based approach relies on a certain level of physical knowledge about machine degradation to predict RUL, such as the Paris law for bearing defect growth [18] and reliability laws [42, 3, 44]. However, integrating such physical knowledge into models can be challenging, especially for complex machinery where such insights might not always be readily available.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence

2309.06157

Country:

Asia > China (0.28)
Asia > Vietnam > Da Nang > Da Nang (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Robust-MBFD: A Robust Deep Learning System for Motor Bearing Faults Detection Using Multiple Deep Learning Training Strategies and A Novel Double Loss Function

Tran, Khoa, Pham, Lam, Vu, Hai-Canh

arXiv.org Artificial Intelligence, Oct-17-2023

This paper presents a comprehensive analysis of motor bearing fault detection (MBFD), which involves the task of identifying faults in a motor bearing based on its vibration. To this end, we first propose and evaluate various machine learning based systems for the MBFD task. Furthermore, we propose three deep learning based systems for the MBFD task, each of which explores one of the following training strategies: supervised learning, semi-supervised learning, and unsupervised learning. The proposed machine learning based and deep learning based systems are evaluated and compared to identify the best model for the MBFD task. We conducted extensive experiments on various benchmark datasets of motor bearing faults, including those from the American Society for Mechanical Failure Prevention Technology (MFPT), Case Western Reserve University Bearing Center (CWRU), and the Condition Monitoring of Bearing Damage in Electromechanical Drive Systems from Paderborn University (PU). The experimental results on different datasets highlight two main contributions of this study. First, we show that deep learning based systems are more effective than machine learning based systems for the MBFD task. Second, we achieve a robust and general deep learning based system with a novel loss function for the MBFD task on several benchmark datasets, demonstrating its potential for real-life MBFD applications.

artificial intelligence, machine learning, multiple deep learning training strategy, (4 more...)

arXiv.org Artificial Intelligence

2310.11477

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

An Inception-Residual-Based Architecture with Multi-Objective Loss for Detecting Respiratory Anomalies

Ngo, Dat, Pham, Lam, Phan, Huy, Tran, Minh, Jarchi, Delaram, Kolozali, Sefki

arXiv.org Artificial Intelligence, Jun-19-2023

This paper presents a deep learning system for detecting anomalies in respiratory sound recordings. Our system begins with audio feature extraction using Gammatone and Continuous Wavelet transformation. This step transforms the respiratory sound input into a two-dimensional spectrogram in which both spectral and temporal features are presented. Then, our proposed system integrates Inception-residual-based backbone models combined with multi-head attention and multi-objective loss to classify respiratory anomalies. Instead of simply concatenating the results from various spectrograms, we propose a linear combination, which can regulate equally the contribution of each individual spectrogram throughout the training process. To evaluate the performance, we conducted experiments on the benchmark dataset SPRSound (The Open-Source SJTU Paediatric Respiratory Sound) proposed by the IEEE BioCAS 2022 challenge. In terms of the Score, computed as the mean of the average score and the harmonic score, our proposed system gained significant improvements of 9.7%, 15.8%, 17.8%, and 16.1% in Task 1-1, Task 1-2, Task 2-1, and Task 2-2, respectively, compared to the challenge baseline system. Notably, we achieved the Top-1 performance in Task 2-1 and Task 2-2 with the highest Scores of 74.5% and 53.9%, respectively.
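
The abstract describes the Score as an average of an average score and a harmonic score without defining either term. The sketch below assumes the usual convention for this kind of respiratory sound benchmark, in which both are derived from sensitivity and specificity: the average score is their arithmetic mean and the harmonic score is their harmonic mean. That definition and the function name `challenge_score` are assumptions for illustration, not details confirmed by the abstract.

```python
def challenge_score(sensitivity, specificity):
    """Mean of the arithmetic and harmonic means of two rates in [0, 1].

    Assumed definitions (not stated in the abstract):
    average score  = (SE + SP) / 2
    harmonic score = 2 * SE * SP / (SE + SP)
    """
    average_score = (sensitivity + specificity) / 2.0
    if sensitivity + specificity == 0:
        harmonic_score = 0.0  # avoid division by zero when both rates are 0
    else:
        harmonic_score = (
            2.0 * sensitivity * specificity / (sensitivity + specificity)
        )
    return (average_score + harmonic_score) / 2.0
```

Under these assumptions the harmonic term pulls the Score down when sensitivity and specificity are unbalanced, so a system cannot score well by favoring one of them alone.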

artificial intelligence, machine learning, spectrogram, (19 more...)

arXiv.org Artificial Intelligence

2303.04104

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Low-complexity deep learning frameworks for acoustic scene classification using teacher-student scheme and multiple spectrograms

Pham, Lam, Ngo, Dat, Le, Cam, Jalali, Anahid, Schindler, Alexander

arXiv.org Artificial Intelligence, May-16-2023

In this technical report, a low-complexity deep learning system for acoustic scene classification (ASC) is presented. The proposed system comprises two main phases: (Phase I) training a teacher network; and (Phase II) training a student network using distilled knowledge from the teacher. In the first phase, the teacher, which is a large-footprint model, is trained. After training the teacher, the embeddings, which are the feature map of the second-to-last layer of the teacher, are extracted. In the second phase, the student network, which is a low-complexity model, is trained with the embeddings extracted from the teacher. Our experiments, conducted on the DCASE 2023 Task 1 Development dataset, fulfilled the low-complexity requirement and achieved the best classification accuracy of 57.4%, improving on the DCASE baseline by 14.5%.
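
The second phase trains the student on embeddings extracted from the teacher, but the report's abstract does not state which loss ties the two together. The toy sketch below assumes a mean squared error between student and teacher embeddings and uses a single linear layer as a stand-in for the student. Everything here (the names, the linear student, the synthetic data) is hypothetical and only illustrates the embedding-matching idea.

```python
import numpy as np

def embedding_distillation_loss(student_emb, teacher_emb):
    """Mean squared error between student and teacher embeddings (assumed)."""
    return float(np.mean((student_emb - teacher_emb) ** 2))

def train_linear_student(x, teacher_emb, lr=0.1, steps=200, seed=0):
    """Fit a linear map x @ w to the teacher embeddings by gradient descent."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=(x.shape[1], teacher_emb.shape[1]))
    for _ in range(steps):
        diff = x @ w - teacher_emb
        grad = 2.0 * x.T @ diff / diff.size  # gradient of the MSE w.r.t. w
        w -= lr * grad
    return w

# Synthetic stand-ins: 32 inputs with 4 features, 3-dimensional embeddings.
rng = np.random.default_rng(1)
x = rng.normal(size=(32, 4))
teacher_emb = x @ rng.normal(size=(4, 3))  # pretend teacher output

loss_before = embedding_distillation_loss(np.zeros_like(teacher_emb), teacher_emb)
w = train_linear_student(x, teacher_emb)
loss_after = embedding_distillation_loss(x @ w, teacher_emb)
```

After training, `loss_after` is far below `loss_before`, which is all the sketch is meant to show: the student never sees class labels here, only the teacher's embeddings as regression targets.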

artificial intelligence, classification, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2305.09463

Country: Europe > Austria (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Light-weight Deep Learning Model for Remote Sensing Image Classification

Pham, Lam, Le, Cam, Ngo, Dat, Nguyen, Anh, Lampert, Jasmin, Schindler, Alexander, McLoughlin, Ian

arXiv.org Artificial Intelligence, Feb-25-2023

In this paper, we present a high-performance and light-weight deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the aerial scene of a remote sensing image. To this end, we first evaluate various benchmark convolutional neural network (CNN) architectures: MobileNet V1/V2, ResNet 50/151V2, InceptionV3/InceptionResNetV2, EfficientNet B0/B7, DenseNet 121/201, ConvNeXt Tiny/Large. Then, the best performing models are selected to train a compact model in a teacher-student arrangement. The knowledge distillation from the teacher aims to achieve high performance with significantly reduced complexity. In extensive experiments on the NWPU-RESISC45 benchmark, our proposed teacher-student models outperform the state-of-the-art systems and have the potential to be applied on a wide range of edge devices.
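
The abstract does not specify which form of knowledge distillation is used. One widely used variant trains the student to match the teacher's temperature-softened class probabilities; the sketch below shows that loss in NumPy purely as an illustration. The temperature value and the function names are assumptions, and this is not presented as the loss used in the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened probabilities, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return float(np.mean(kl) * temperature ** 2)

# Two images, three scene classes (values are made up for the example).
teacher_logits = np.array([[5.0, 1.0, 0.0], [0.5, 4.0, 1.0]])
close_student = np.array([[4.5, 1.2, 0.1], [0.4, 3.8, 1.1]])
far_student = np.array([[0.0, 1.0, 5.0], [4.0, 0.5, 1.0]])
```

Here `distillation_loss(close_student, teacher_logits)` is smaller than `distillation_loss(far_student, teacher_logits)`. In practice this term is often added to an ordinary cross entropy loss on the ground-truth labels.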

artificial intelligence, classification, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2302.13028

Country: Europe > Austria > Vienna (0.15)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Robust and Low Complexity Deep Learning Model for Remote Sensing Image Classification

Le, Cam, Pham, Lam, NVN, Nghia, Nguyen, Truong, Trang, Le Hong

arXiv.org Artificial Intelligence, Dec-12-2022

In this paper, we present a robust and low-complexity deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the scene of a remote sensing image. In particular, we first evaluate different low-complexity benchmark deep neural networks: MobileNetV1, MobileNetV2, NASNetMobile, and EfficientNetB0, each of which has fewer than 5 million (M) trainable parameters. After identifying the best network architecture, we further improve the network performance by applying attention schemes to multiple feature maps extracted from the middle layers of the network. To deal with the increase in model footprint caused by the attention schemes, we apply quantization to keep memory occupation within the 20 MB limit. In extensive experiments on the benchmark dataset NWPU-RESISC45, we achieve a robust and low-complexity model, which is very competitive with the state-of-the-art systems and has potential for real-life applications on edge devices.
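
The abstract says quantization is applied to meet a 20 MB memory limit but does not name the scheme. As a minimal sketch of why quantization shrinks a model, the example below applies symmetric per-tensor 8-bit quantization to a float32 weight array. The choice of int8, the per-tensor scale, and the function names are assumptions for illustration.

```python
import numpy as np

def quantize_int8(weights):
    """Map float weights to int8 with a single symmetric scale factor."""
    scale = float(np.max(np.abs(weights))) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 values."""
    return q.astype(np.float32) * scale

# Stand-in weight tensor; a real model would have many such tensors.
weights = np.linspace(-1.0, 1.0, 1000).astype(np.float32)
q, scale = quantize_int8(weights)
max_error = float(np.max(np.abs(dequantize(q, scale) - weights)))
```

The int8 array uses a quarter of the bytes of the float32 original (`q.nbytes` versus `weights.nbytes`), and the reconstruction error per weight stays within about half a quantization step (`scale / 2`).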

artificial intelligence, classification, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.0282

Country: Asia > Vietnam > Da Nang > Da Nang (0.15)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.99)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
