 bidirectional lstm



Predicting the Unpredictable: Reproducible BiLSTM Forecasting of Incident Counts in the Global Terrorism Database (GTD)

arXiv.org Artificial Intelligence

We study short-horizon forecasting of weekly terrorism incident counts using the Global Terrorism Database (GTD, 1970–2016). We build a reproducible pipeline with fixed time-based splits and evaluate a Bidirectional LSTM (BiLSTM) against strong classical anchors (seasonal-naive, linear/ARIMA) and a deep LSTM-Attention baseline. On the held-out test set, the BiLSTM attains RMSE 6.38, outperforming LSTM-Attention (9.19; +30.6%) and a linear lag-regression baseline (+35.4% RMSE gain), with parallel improvements in MAE and MAPE. Ablations varying temporal memory, training-history length, spatial grain, lookback size, and feature groups show that models trained on long historical data generalize best; a moderate lookback (20–30 weeks) provides strong context; and bidirectional encoding is critical for capturing both build-up and aftermath patterns within the window. Feature-group analysis indicates that short-horizon structure (lagged counts and rolling statistics) contributes most, with geographic and casualty features adding incremental lift. We release code, configs, and compact result tables, and provide a data/ethics statement documenting GTD licensing and research-only use. Overall, the study offers a transparent, baseline-beating reference for GTD incident forecasting.
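
The evaluation arithmetic in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's released code: the function names, the toy series, and the season length of 4 are assumptions made here, and only the two RMSE values (6.38 and 9.19) come from the abstract.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def seasonal_naive(series, season):
    """Forecast each point as the value observed one season earlier.

    Returns (targets, forecasts) for the points that have a full
    season of history behind them.
    """
    return series[season:], series[:-season]


def relative_gain(baseline_rmse, model_rmse):
    """Percentage reduction in RMSE relative to a baseline."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Toy weekly counts (hypothetical data, season length chosen for illustration).
weekly_counts = [3, 5, 4, 6, 4, 5, 5, 7]
targets, forecasts = seasonal_naive(weekly_counts, season=4)
print(rmse(targets, forecasts))

# RMSE values reported in the abstract: BiLSTM 6.38, LSTM-Attention 9.19.
print(f"{relative_gain(9.19, 6.38):.1f}%")  # 30.6%
```

The last line reproduces the +30.6% figure quoted for the LSTM-Attention comparison, which suggests the gain is measured relative to the baseline's RMSE.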


Cross-Lingual Multi-Granularity Framework for Interpretable Parkinson's Disease Diagnosis from Speech

arXiv.org Artificial Intelligence

Parkinson's Disease (PD) affects over 10 million people worldwide, with speech impairments in up to 89% of patients. Current speech-based detection systems analyze entire utterances, potentially overlooking the diagnostic value of specific phonetic elements. We developed a granularity-aware approach for multilingual PD detection using an automated pipeline that extracts time-aligned phonemes, syllables, and words from recordings. Using Italian, Spanish, and English datasets, we implemented a bidirectional LSTM with multi-head attention to compare diagnostic performance across the different granularity levels. Phoneme-level analysis achieved superior performance, with an AUROC of 93.78% ± 2.34% and an accuracy of 92.17% ± 2.43%. This demonstrates enhanced diagnostic capability for cross-linguistic PD detection. Importantly, attention analysis revealed that the most informative speech features align with those used in established clinical protocols: sustained vowels (/a/, /e/, /o/, /i/) at phoneme level, diadochokinetic syllables (/ta/, /pa/, /la/, /ka/) at syllable level, and /pataka/ sequences at word level. Source code will be available at https://github.com/jetliqs/clearpd.


Continuous Saudi Sign Language Recognition: A Vision Transformer Approach

arXiv.org Artificial Intelligence

Sign language (SL) is an essential form of communication for hearing-impaired and deaf people, enabling engagement within the broader society. Despite its significance, limited public awareness of SL often leads to inequitable access to educational and professional opportunities, thereby contributing to social exclusion, particularly in Saudi Arabia, where over 84,000 individuals depend on Saudi Sign Language (SSL) as their primary form of communication. Although certain technological approaches have helped to improve communication for individuals with hearing impairments, there remains an urgent need for more precise and dependable translation techniques, especially for Arabic sign language variants like SSL. Most state-of-the-art solutions have focused on non-Arabic sign languages, resulting in a considerable absence of resources dedicated to Arabic sign language, specifically SSL. The complexity of the Arabic language and the prevalence of isolated sign language datasets that concentrate on individual words instead of continuous speech contribute to this issue. Our research represents an important step in developing SSL resources: to address this gap, we introduce the first continuous Saudi Sign Language dataset, called KAU-CSSL, which focuses on complete sentences to facilitate further research and enable sophisticated systems for SSL recognition and translation. Additionally, we propose a transformer-based model that uses a pretrained ResNet-18 for spatial feature extraction and a Transformer Encoder with Bidirectional LSTM for temporal dependencies, achieving 99.02% accuracy in signer-dependent mode and 77.71% accuracy in signer-independent mode. This development paves the way not only for improving communication tools for the SSL community but also for a substantial contribution to the wider field of sign language.



Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet

arXiv.org Artificial Intelligence

This paper presents an end-to-end deep learning model for Automatic Speech Recognition (ASR) that transcribes Nepali speech to text. The model was trained and tested on the OpenSLR (audio, text) dataset. Most audio clips in the dataset have silent gaps at both ends, which are clipped during preprocessing for a more uniform mapping between audio frames and their corresponding text. Mel Frequency Cepstral Coefficients (MFCCs) are used as the audio features fed into the model. The model that pairs a Bidirectional LSTM with ResNet and a one-dimensional CNN produces the best results for this dataset out of all the models (neural networks with variations of LSTM, GRU, CNN, and ResNet) that have been trained so far. This novel model uses the Connectionist Temporal Classification (CTC) loss function during training and CTC beam search decoding to predict characters as the most likely sequence of Nepali text. On the test dataset, the model achieves a character error rate (CER) of 17.06 percent. The source code is available at: https://github.com/manishdhakal/ASR-Nepali-using-CNN-BiLSTM-ResNet.
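
The character error rate quoted above is conventionally the edit distance between the reference and the predicted transcript, divided by the reference length. The sketch below illustrates that standard definition; it is not taken from the linked repository, the function names are hypothetical, and it treats each Unicode code point as one character, which is a simplifying assumption for Devanagari text with combining marks.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character error rate relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(cer("kitten", "sitting"))  # 0.5
```

Because the denominator is the reference length, CER can exceed 1.0 when the hypothesis contains many spurious characters.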


Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey

arXiv.org Artificial Intelligence

This is a tutorial paper on Recurrent Neural Network (RNN), Long Short-Term Memory Network (LSTM), and their variants. We start with a dynamical system and backpropagation through time for RNN. Then, we discuss the problems of gradient vanishing and explosion in long-term dependencies. We explain close-to-identity weight matrix, long delays, leaky units, and echo state networks for solving this problem.

Several solutions were proposed for this issue, some of which are close-to-identity weight matrix (Mikolov et al., 2015), long delays (Lin et al., 1995), leaky units (Jaeger et al., 2007; Sutskever & Hinton, 2010), and echo state networks (Jaeger & Haas, 2004; Jaeger, 2007). Sequence modeling requires both short-term and long-term dependencies. For example, consider the sentence "The police is chasing the thief".
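
The gradient vanishing and explosion problem mentioned in this excerpt can be seen in the simplest possible case. The sketch below assumes a scalar linear recurrence h_t = w * h_{t-1}, which is a deliberate simplification chosen here and not the tutorial's own formulation; the function name is hypothetical.

```python
def gradient_through_time(w, steps):
    """Return d h_T / d h_0 for the scalar recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        # Chain rule: every time step contributes a factor d h_t / d h_{t-1} = w.
        grad *= w
    return grad


for w in (0.9, 1.0, 1.1):
    print(f"w={w}: gradient after 50 steps = {gradient_through_time(w, 50):.4g}")
```

With |w| < 1 the gradient shrinks geometrically, with |w| > 1 it grows geometrically, and with w = 1 it is preserved. In this toy setting, that last case is the intuition behind keeping the recurrent weight matrix close to the identity.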


Detecting Fake Job Postings Using Bidirectional LSTM

arXiv.org Artificial Intelligence

Fake job postings have become prevalent in the online job market, posing significant challenges to job seekers and employers. Despite the growing need to address this problem, there is limited research that leverages deep learning techniques for the detection of fraudulent job advertisements. This study aims to fill the gap by employing a Bidirectional Long Short-Term Memory (Bi-LSTM) model to identify fake job advertisements. Our approach considers both numeric and text features, effectively capturing the underlying patterns and relationships within the data. The proposed model demonstrates superior performance, achieving a 0.91 ROC AUC score and a 98.71% accuracy rate, indicating its potential for practical applications in the online job market. The findings of this research contribute to the development of robust, automated tools that can help combat the proliferation of fake job postings and improve the overall integrity of the job search process. Moreover, we discuss challenges, future research directions, and ethical considerations related to our approach, aiming to inspire further exploration and development of practical solutions to combat online job fraud.


Sarcasm Detection in News Headlines using Deep Learning, Word2Vec and LIWC

#artificialintelligence

In this article, we discuss sarcasm detection in news headlines. We use deep learning methods to classify the dataset, together with Linguistic Inquiry and Word Count (LIWC) features. The dataset is public and can be downloaded from Kaggle. It was collected from two news websites: TheOnion, which aims at producing sarcastic versions of current events, and HuffPost, which publishes real news.


A complete guide to speech enhancement

#artificialintelligence

Speech enhancement refers to techniques that aim to reduce distortions and improve one or more perceptual qualities of speech. The enhanced speech is expected to be of superior quality, with minimal or no noise. It is also known as audio enhancement, denoising, or noise reduction. Speech enhancement has wide applications, including improving the quality of audio processing systems such as speech recognition. Several past experiments have shown that this preprocessing leads to improved speech recognition.