Dibrugarh
NBF at SemEval-2025 Task 5: Light-Burst Attention Enhanced System for Multilingual Subject Recommendation
Islam, Baharul, Ahmad, Nasim, Barbhuiya, Ferdous Ahmed, Dey, Kuntal
We present our system submission for SemEval 2025 Task 5, which focuses on cross-lingual subject classification in the English and German academic domains. Our approach leverages bilingual data during training, employing negative sampling and a margin-based retrieval objective. We demonstrate that a dimension-as-token self-attention mechanism designed with significantly reduced internal dimensions can effectively encode sentence embeddings for subject retrieval. In quantitative evaluation, our system achieved an average recall of 32.24% in the general quantitative setting (all subjects) and 43.16% and 31.53% in the general qualitative evaluations, all with minimal GPU usage, highlighting its competitive performance. Our results demonstrate that our approach is effective in capturing relevant subject information under resource constraints, although there is still room for improvement.
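The negative sampling and margin-based retrieval objective mentioned above can be illustrated with a minimal sketch. It assumes cosine similarity as the scoring function, a hinge-style margin loss, and uniform sampling of negatives; the abstract confirms none of these details, and the names `margin_loss` and `sample_negatives` are hypothetical.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sample_negatives(subject_ids, gold_ids, k, rng):
    """Assumed strategy: draw k subject ids uniformly from the non-gold subjects."""
    pool = [s for s in subject_ids if s not in gold_ids]
    return rng.sample(pool, k)

def margin_loss(doc, positive, negatives, margin=0.2):
    """Mean hinge loss: every negative should score at least `margin` below the positive."""
    pos_score = cosine(doc, positive)
    losses = [max(0.0, margin - pos_score + cosine(doc, neg)) for neg in negatives]
    return sum(losses) / len(losses)
```

The loss is zero once every sampled negative subject scores at least `margin` below the gold subject, so training effort concentrates on the negatives that are still ranked too close to the correct one.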
Spatial-Temporal Bearing Fault Detection Using Graph Attention Networks and LSTM
Singh, Moirangthem Tiken, Prasad, Rabinder Kumar, Michael, Gurumayum Robert, Singh, N. Hemarjit, Kaphungkui, N. K.
Purpose: This paper aims to enhance bearing fault diagnosis in industrial machinery by introducing a novel method that combines Graph Attention Network (GAT) and Long Short-Term Memory (LSTM) networks. This approach captures both spatial and temporal dependencies within sensor data, improving the accuracy of bearing fault detection under various conditions. Methodology: The proposed method converts time series sensor data into graph representations. GAT captures spatial relationships between components, while LSTM models temporal patterns. The model is validated using the Case Western Reserve University (CWRU) Bearing Dataset, which includes data under different horsepower levels and both normal and faulty conditions. Its performance is compared with methods such as K-Nearest Neighbors (KNN), Local Outlier Factor (LOF), Isolation Forest (IForest), and a GNN-based method for bearing fault detection (GNNBFD). Findings: The model achieved outstanding results, with precision, recall, and F1-scores reaching 100% across various testing conditions. It not only identifies faults accurately but also generalizes effectively across different operational scenarios, outperforming traditional methods. Originality: This research presents a unique combination of GAT and LSTM for fault detection, overcoming the limitations of traditional time series methods by capturing complex spatial-temporal dependencies. Its superior performance demonstrates significant potential for predictive maintenance in industrial applications.
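The abstract does not say how the time series is converted into a graph. As one plausible construction, offered purely as an assumption, the sketch below cuts the signal into windows, summarises each window with three simple statistics, and links windows whose statistics lie within a chosen radius. The function names and the feature choice (mean, RMS, peak) are hypothetical.

```python
import math

def sliding_windows(signal, size, step):
    """Cut a 1-D signal into fixed-size windows taken every `step` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

def window_features(window):
    """Summarise one window as (mean, RMS, peak absolute value)."""
    mean = sum(window) / len(window)
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    peak = max(abs(x) for x in window)
    return (mean, rms, peak)

def build_graph(signal, size, step, radius):
    """Nodes are window feature vectors; an edge joins two windows within `radius`."""
    nodes = [window_features(w) for w in sliding_windows(signal, size, step)]
    edges = [
        (i, j)
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
        if math.dist(nodes[i], nodes[j]) <= radius
    ]
    return nodes, edges
```

A graph attention layer could then operate on `nodes` and `edges`, while a recurrent layer consumes the windows in temporal order.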
Heterogeneous Graph Auto-Encoder for CreditCard Fraud Detection
Singh, Moirangthem Tiken, Prasad, Rabinder Kumar, Michael, Gurumayum Robert, Kaphungkui, N K, Singh, N. Hemarjit
The digital revolution has significantly impacted financial transactions, leading to a notable increase in credit card usage. However, this convenience comes with a trade-off: a substantial rise in fraudulent activities. Traditional machine learning methods for fraud detection often struggle to capture the inherent interconnectedness within financial data. This paper proposes a novel approach for credit card fraud detection that leverages Graph Neural Networks (GNNs) with attention mechanisms applied to heterogeneous graph representations of financial data. Unlike homogeneous graphs, heterogeneous graphs capture intricate relationships between various entities in the financial ecosystem, such as cardholders, merchants, and transactions, providing a richer and more comprehensive data representation for fraud analysis. To address the inherent class imbalance in fraud data, where genuine transactions significantly outnumber fraudulent ones, the proposed approach integrates an autoencoder. This autoencoder, trained on genuine transactions, learns a latent representation and flags deviations during reconstruction as potential fraud. This research investigates two key questions: (1) How effectively can a GNN with an attention mechanism detect and prevent credit card fraud when applied to a heterogeneous graph? (2) How does the efficacy of the autoencoder with attention approach compare to traditional methods? The results are promising, demonstrating that the proposed model outperforms benchmark algorithms such as GraphSAGE and FI-GRL, achieving a superior AUC-PR of 0.89 and an F1-score of 0.81. This research significantly advances fraud detection systems and the overall security of financial transactions by leveraging GNNs with attention mechanisms and addressing class imbalance through an autoencoder.
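The idea of training an autoencoder on genuine transactions and flagging large reconstruction deviations can be sketched with a deliberately simple stand-in. The code below is not the paper's graph attention model: it assumes a linear autoencoder with a one-dimensional latent space (equivalent to the first principal direction), fitted by power iteration, and a fixed error threshold. All names are hypothetical.

```python
import math

def fit_linear_autoencoder(rows, iters=100):
    """Fit a 1-D linear autoencoder on genuine rows via power iteration."""
    dim = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    direction = [1.0] * dim
    for _ in range(iters):
        # Apply the unnormalised covariance matrix: X^T (X v).
        proj = [sum(c[j] * direction[j] for j in range(dim)) for c in centered]
        direction = [sum(p * c[j] for p, c in zip(proj, centered)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in direction))
        direction = [x / norm for x in direction]
    return mean, direction

def reconstruction_error(row, mean, direction):
    """Encode to one latent value, decode, and measure the Euclidean error."""
    latent = sum((x - m) * w for x, m, w in zip(row, mean, direction))
    recon = [m + latent * w for m, w in zip(mean, direction)]
    return math.dist(row, recon)

def flag_fraud(rows, mean, direction, threshold):
    """Mark rows whose reconstruction error exceeds the threshold."""
    return [reconstruction_error(r, mean, direction) > threshold for r in rows]
```

Because the model only ever sees genuine transactions, it needs no fraud labels during training, which is how this family of methods sidesteps the class imbalance.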
Explainable Artificial Intelligence for Drug Discovery and Development -- A Comprehensive Survey
Alizadehsani, Roohallah, Oyelere, Solomon Sunday, Hussain, Sadiq, Calixto, Rene Ripardo, de Albuquerque, Victor Hugo C., Roshanzamir, Mohamad, Rahouti, Mohamed, Jagatheesaperumal, Senthil Kumar
The field of drug discovery has experienced a remarkable transformation with the advent of artificial intelligence (AI) and machine learning (ML) technologies. However, as these AI and ML models are becoming more complex, there is a growing need for transparency and interpretability of the models. Explainable Artificial Intelligence (XAI) is a novel approach that addresses this issue and provides a more interpretable understanding of the predictions made by machine learning models. In recent years, there has been an increasing interest in the application of XAI techniques to drug discovery. This review article provides a comprehensive overview of the current state-of-the-art in XAI for drug discovery, including various XAI methods, their application in drug discovery, and the challenges and limitations of XAI techniques in drug discovery. The article also covers the application of XAI in drug discovery, including target identification, compound design, and toxicity prediction. Furthermore, the article suggests potential future research directions for the application of XAI in drug discovery. The aim of this review article is to provide a comprehensive understanding of the current state of XAI in drug discovery and its potential to transform the field.
A Hybrid Deep Spatio-Temporal Attention-Based Model for Parkinson's Disease Diagnosis Using Resting State EEG Signals
Delfan, Niloufar, Shahsavari, Mohammadreza, Hussain, Sadiq, Damaševičius, Robertas, Acharya, U. Rajendra
Parkinson's disease (PD), a severe and progressive neurological illness, affects millions of individuals worldwide. For effective treatment and management of PD, an accurate and early diagnosis is crucial. This study presents a deep learning-based model for the diagnosis of PD using resting state electroencephalogram (EEG) signals. The objective of the study is to develop an automated model that can extract complex hidden nonlinear features from EEG and demonstrate its generalizability on unseen data. The model is a hybrid consisting of a convolutional neural network (CNN), a bidirectional gated recurrent unit (Bi-GRU), and an attention mechanism. The proposed method is evaluated on three public datasets (the UC San Diego dataset, PRED-CT, and the University of Iowa (UI) dataset), with one dataset used for training and the other two for evaluation. The results show that the proposed model can accurately diagnose PD with high performance on both the training and hold-out datasets. The model also performs well even when part of the input information is missing. The results of this work have significant implications for patient treatment and for ongoing investigations into the early detection of Parkinson's disease. The suggested model holds promise as a non-invasive and reliable technique for PD early detection utilizing resting state EEG.
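The abstract does not specify the form of the attention mechanism. As an assumption for illustration only, the sketch below shows dot-product attention pooling over a sequence of recurrent hidden states: each time step is scored against a query vector, the scores are normalised with a softmax, and the states are averaged with those weights. The names `attention_pool` and `query` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(states, query):
    """Return the attention-weighted sum of the states and the weights used."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * s[j] for w, s in zip(weights, states)) for j in range(dim)]
    return pooled, weights
```

In a trained model the query would be a learned parameter, so the network itself decides which EEG time steps contribute most to the pooled representation.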
Automated Detection and Forecasting of COVID-19 using Deep Learning Techniques: A Review
Shoeibi, Afshin, Khodatars, Marjane, Jafari, Mahboobeh, Ghassemi, Navid, Sadeghi, Delaram, Moridian, Parisa, Khadem, Ali, Alizadehsani, Roohallah, Hussain, Sadiq, Zare, Assef, Sani, Zahra Alizadeh, Khozeimeh, Fahime, Nahavandi, Saeid, Acharya, U. Rajendra, Gorriz, Juan M.
Coronavirus, or COVID-19, is a hazardous disease that has endangered the health of many people around the world by directly affecting the lungs. The virus that causes COVID-19 is a medium-sized, coated virus with single-stranded RNA; it has one of the largest RNA genomes and is approximately 120 nm in size. The X-Ray and computed tomography (CT) imaging modalities are widely used to obtain a fast and accurate medical diagnosis. Identifying COVID-19 from these medical images is extremely challenging as it is time-consuming and prone to human errors. Hence, artificial intelligence (AI) methodologies can be used to obtain consistent high performance. Among the AI methods, deep learning (DL) networks have gained popularity recently compared to conventional machine learning (ML). Unlike ML, all stages of feature extraction, feature selection, and classification are accomplished automatically in DL models. This paper presents a complete survey of studies that apply DL techniques to COVID-19 diagnosis and lung segmentation, concentrating on works that used X-Ray and CT images. Additionally, a review of papers on the forecasting of coronavirus prevalence in different parts of the world with DL is presented. Lastly, the challenges faced in the detection of COVID-19 using DL techniques and directions for future research are discussed.
A Brief Review of Explainable Artificial Intelligence in Healthcare
Sadeghi, Zahra, Alizadehsani, Roohallah, Cifci, Mehmet Akif, Kausar, Samina, Rehman, Rizwan, Mahanta, Priyakshi, Bora, Pranjal Kumar, Almasri, Ammar, Alkhawaldeh, Rami S., Hussain, Sadiq, Alatas, Bilal, Shoeibi, Afshin, Moosaei, Hossein, Hladik, Milan, Nahavandi, Saeid, Pardalos, Panos M.
XAI refers to the techniques and methods for building AI applications that help end users interpret the output and predictions of AI models. The use of black-box AI applications in high-stakes decision-making situations, such as the medical domain, has increased the demand for transparency and explainability, since wrong predictions may have severe consequences. Model explainability and interpretability are vital for the successful deployment of AI models in healthcare practices. The underlying reasoning of AI applications needs to be transparent to clinicians in order to gain their trust. This paper presents a systematic review of XAI aspects and challenges in the healthcare domain. The primary goals of this study are to review various XAI methods, their challenges, and related machine learning models in healthcare. The methods are discussed under six categories: feature-oriented methods, global methods, concept models, surrogate models, local pixel-based methods, and human-centric methods. Most importantly, the paper explores the role of XAI in healthcare problems to clarify its necessity in safety-critical applications. The paper intends to establish a comprehensive understanding of XAI-related applications in the healthcare field by reviewing the related experimental results. To facilitate future research for filling research gaps, the importance of XAI models from different viewpoints and their limitations are investigated.
BERT-Deep CNN: State-of-the-Art for Sentiment Analysis of COVID-19 Tweets
Joloudari, Javad Hassannataj, Hussain, Sadiq, Nematollahi, Mohammad Ali, Bagheri, Rouhollah, Fazl, Fatemeh, Alizadehsani, Roohallah, Lashgari, Reza, Talukder, Ashis
The free flow of information has been accelerated by the rapid development of social media technology. There has been a significant social and psychological impact on the population due to the outbreak of Coronavirus disease (COVID-19). The COVID-19 pandemic is one of the current events being discussed on social media platforms. In order to safeguard societies from this pandemic, studying people's emotions on social media is crucial. As a result of their particular characteristics, sentiment analysis of texts like tweets remains challenging. Sentiment analysis is a powerful text analysis tool. It automatically detects and analyzes opinions and emotions from unstructured data. Texts from a wide range of sources are examined by a sentiment analysis tool, which extracts meaning from them, including emails, surveys, reviews, social media posts, and web articles. To evaluate sentiments, natural language processing (NLP) and machine learning techniques are used, which assign weights to entities, topics, themes, and categories in sentences or phrases. Machine learning tools learn how to detect sentiment without human intervention by examining examples of emotions in text. In a pandemic situation, analyzing social media texts to uncover sentimental trends can be very helpful in gaining a better understanding of society's needs and predicting future trends. We intend to study society's perception of the COVID-19 pandemic through social media using state-of-the-art BERT and Deep CNN models. The superiority of BERT models over other deep models in sentiment analysis is evident and can be concluded from the comparison of the various research studies mentioned in this article.
Effective Class-Imbalance learning based on SMOTE and Convolutional Neural Networks
Joloudari, Javad Hassannataj, Marefat, Abdolreza, Nematollahi, Mohammad Ali, Oyelere, Solomon Sunday, Hussain, Sadiq
Imbalanced Data (ID) is a problem that prevents Machine Learning (ML) models from achieving satisfactory results. ID occurs when the number of samples belonging to one class outnumbers that of the other by a wide margin, biasing the learning process of such models towards the majority class. In recent years, several solutions have been put forward to address this issue, which either synthetically generate new data for the minority class or reduce the number of majority-class samples to balance the data. In this paper, we investigate the effectiveness of methods based on Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs), combined with a variety of well-known imbalanced data solutions, namely oversampling and undersampling. To evaluate our methods, we used the KEEL, breast cancer, and Z-Alizadeh Sani datasets. To achieve reliable results, we conducted our experiments 100 times with randomly shuffled data distributions. The classification results demonstrate that the mixed Synthetic Minority Oversampling Technique (SMOTE)-Normalization-CNN outperforms the other methodologies, achieving 99.08% accuracy on the 24 imbalanced datasets. Therefore, the proposed mixed model can be applied to imbalanced binary classification problems on other real datasets.
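The oversampling step named above, SMOTE, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below is a minimal pure-Python illustration of that interpolation, not the implementation used in the paper; the function name `smote` and its parameters are hypothetical.

```python
import math
import random

def smote(minority, n_new, k=2, rng=None):
    """Generate n_new synthetic points between minority samples and their neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen base point.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic
```

Every synthetic point lies on a line segment between two real minority samples, so the minority class grows without exact duplicates being added.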
UncertaintyFuseNet: Robust Uncertainty-aware Hierarchical Feature Fusion Model with Ensemble Monte Carlo Dropout for COVID-19 Detection
Abdar, Moloud, Salari, Soorena, Qahremani, Sina, Lam, Hak-Keung, Karray, Fakhri, Hussain, Sadiq, Khosravi, Abbas, Acharya, U. Rajendra, Makarenkov, Vladimir, Nahavandi, Saeid
Abstract: The COVID-19 (Coronavirus disease 2019) pandemic has become a major global threat to human health and [...] Such automatic systems are usually based on traditional machine learning or deep learning methods. We argue that the uncertainty of the model's predictions [...] PCR has a low sensitivity. Index Terms: COVID-19, Deep learning, Early fusion, Feature fusion, Uncertainty quantification. H.K. Lam is with the Centre for Robotics Research, Department of [...] F. Karray is with the Centre for Pattern Analysis and Machine Intelligence, Department of Electrical and Computer Engineering, University of [...] S. Hussain is with the System Administrator, Dibrugarh University, [...] U. R. Acharya is with the Department of Electronics and Computer [...] V. Makarenkov is with the Department of Computer Science, [...] In recent years, deep learning models have had the widespread applicability not only in medical imaging field [...] also been extensively applied for COVID-19 detection. It is critical to discriminate COVID-19 from other forms of pneumonia [...] Its areas of research and application have been growing drastically. These models have allowed the information fusion to change from centralized single node information fusion to distributed information fusion. Farooq et al. [8] introduced an open-access dataset and the open-source code of their implementation using a CNN framework for distinguishing COVID-19 from analogous pneumonia cohorts from chest X-ray images. The authors designed their COVIDResNet model by utilizing a [...] Modern medicine nowadays depends on amalgamation of data and information from manifold sources that include structured imaging data, laboratory data, unstructured narrative data, and even observational or audio data in some cases [22].