AITopics | cnn-lstm

Collaborating Authors

cnn-lstm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reconnaissance Automatique des Langues des Signes : Une Approche Hybridée CNN-LSTM Basée sur Mediapipe

Takouchouang, Fraisse Sacré, Vinh, Ho Tuong

arXiv.org Artificial IntelligenceOct-28-2025

Sign languages play a crucial role in the communication of deaf communities, but they are often marginalized, limiting access to essential services such as healthcare and education. This study proposes an automatic sign language recognition system based on a hybrid CNN-LSTM architecture, using Mediapipe for gesture keypoint extraction. Developed with Python, TensorFlow and Streamlit, the system provides real-time gesture translation. The results show an average accuracy of 92\%, with very good performance for distinct gestures such as ``Hello'' and ``Thank you''. However, some confusions remain for visually similar gestures, such as ``Call'' and ``Yes''. This work opens up interesting perspectives for applications in various fields such as healthcare, education and public services.

artificial intelligence, langue, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.22011

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Breathing and Semantic Pause Detection and Exertion-Level Classification in Post-Exercise Speech

Wang, Yuyu, Xia, Wuyue, Yao, Huaxiu, Nie, Jingping

arXiv.org Artificial IntelligenceSep-22-2025

Post-exercise speech contains rich physiological and linguistic cues, often marked by semantic pauses, breathing pauses, and combined breathing-semantic pauses. Detecting these events enables assessment of recovery rate, lung function, and exertion-related abnormalities. However, existing works on identifying and distinguishing different types of pauses in this context are limited. In this work, building on a recently released dataset with synchronized audio and respiration signals, we provide systematic annotations of pause types. Using these annotations, we systematically conduct exploratory breathing and semantic pause detection and exertion-level classification across deep learning models (GRU, 1D CNN-LSTM, AlexNet, VGG16), acoustic features (MFCC, MFB), and layer-stratified Wav2Vec2 representations. We evaluate three setups-single feature, feature fusion, and a two-stage detection-classification cascade-under both classification and regression formulations. Results show per-type detection accuracy up to 89$\%$ for semantic, 55$\%$ for breathing, 86$\%$ for combined pauses, and 73$\%$overall, while exertion-level classification achieves 90.5$\%$ accuracy, outperformin prior work.

accuracy, artificial intelligence, machine learning, (9 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3737901.3768369

2509.15473

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Imitation Learning for Autonomous Driving: Insights from Real-World Testing

Dursun, Hidayet Ersin, Güven, Yusuf, Kumbasar, Tufan

arXiv.org Artificial IntelligenceApr-29-2025

This work focuses on the design of a deep learning-based autonomous driving system deployed and tested on the real-world MIT Racecar to assess its effectiveness in driving scenarios. The Deep Neural Network (DNN) translates raw image inputs into real-time steering commands in an end-to-end learning fashion, following the imitation learning framework. The key design challenge is to ensure that DNN predictions are accurate and fast enough, at a high sampling frequency, and result in smooth vehicle operation under different operating conditions. In this study, we design and compare various DNNs, to identify the most effective approach for real-time autonomous driving. In designing the DNNs, we adopted an incremental design approach that involved enhancing the model capacity and dataset to address the challenges of real-world driving scenarios. We designed a PD system, CNN, CNN-LSTM, and CNN-NODE, and evaluated their performance on the real-world MIT Racecar. While the PD system handled basic lane following, it struggled with sharp turns and lighting variations. The CNN improved steering but lacked temporal awareness, which the CNN-LSTM addressed as it resulted in smooth driving performance. The CNN-NODE performed similarly to the CNN-LSTM in handling driving dynamics, yet with slightly better driving performance. The findings of this research highlight the importance of iterative design processes in developing robust DNNs for autonomous driving applications. The experimental video is available at https://www.youtube.com/watch?v=FNNYgU--iaY.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2504.18847

Country: Asia > Middle East > Republic of Türkiye (0.15)

Genre: Research Report > New Finding (0.87)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ICPR 2024 Competition on Rider Intention Prediction

Gangisetty, Shankar, Wasi, Abdul, Rai, Shyam Nandan, Jawahar, C. V., Raj, Sajay, Prajapati, Manish, Choudhary, Ayesha, Chandra, Aaryadev, Chandan, Dev, Chand, Shireen, Mukherjee, Suvaditya

arXiv.org Artificial IntelligenceMar-11-2025

The recent surge in the vehicle market has led to an alarming increase in road accidents. This underscores the critical importance of enhancing road safety measures, particularly for vulnerable road users like motorcyclists. Hence, we introduce the rider intention prediction (RIP) competition that aims to address challenges in rider safety by proactively predicting maneuvers before they occur, thereby strengthening rider safety. This capability enables the riders to react to the potential incorrect maneuvers flagged by advanced driver assistance systems (ADAS). We collect a new dataset, namely, rider action anticipation dataset (RAAD) for the competition consisting of two tasks: single-view RIP and multi-view RIP. The dataset incorporates a spectrum of traffic conditions and challenging navigational maneuvers on roads with varying lighting conditions. For the competition, we received seventy-five registrations and five team submissions for inference of which we compared the methods of the top three performing teams on both the RIP tasks: one state-space model (Mamba2) and two learning-based approaches (SVM and CNN-LSTM). The results indicate that the state-space model outperformed the other methods across the entire dataset, providing a balanced performance across maneuver classes. The SVM-based RIP method showed the second-best performance when using random sampling and SMOTE. However, the CNN-LSTM method underperformed, primarily due to class imbalance issues, particularly struggling with minority classes. This paper details the proposed RAAD dataset and provides a summary of the submissions for the RIP 2024 competition.

dataset, maneuver, prediction, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-80139-6_3

2503.08437

Country:

Asia > India > NCT > New Delhi (0.04)
Asia > India > Maharashtra > Mumbai (0.04)
North America > United States (0.04)
(2 more...)

Genre:

Research Report (0.64)
Contests & Prizes (0.62)
Overview (0.48)

Industry: Transportation > Ground > Road (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Add feedback

Regional climate projections using a deep-learning-based model-ranking and downscaling framework: Application to European climate zones

Loganathan, Parthiban, Zea, Elias, Vinuesa, Ricardo, Otero, Evelyn

arXiv.org Artificial IntelligenceFeb-28-2025

Accurate regional climate forecast calls for high-resolution downscaling of Global Climate Models (GCMs). This work presents a deep-learning-based multi-model evaluation and downscaling framework ranking 32 Coupled Model Intercomparison Project Phase 6 (CMIP6) models using a Deep Learning-TOPSIS (DL-TOPSIS) mechanism and so refines outputs using advanced deep-learning models. Using nine performance criteria, five K\"oppen-Geiger climate zones -- Tropical, Arid, Temperate, Continental, and Polar -- are investigated over four seasons. While TaiESM1 and CMCC-CM2-SR5 show notable biases, ranking results show that NorESM2-LM, GISS-E2-1-G, and HadGEM3-GC31-LL outperform other models. Four models contribute to downscaling the top-ranked GCMs to 0.1$^{\circ}$ resolution: Vision Transformer (ViT), Geospatial Spatiotemporal Transformer with Attention and Imbalance-Aware Network (GeoSTANet), CNN-LSTM, and CNN-Long Short-Term Memory (ConvLSTM). Effectively capturing temperature extremes (TXx, TNn), GeoSTANet achieves the highest accuracy (Root Mean Square Error (RMSE) = 1.57$^{\circ}$C, Kling-Gupta Efficiency (KGE) = 0.89, Nash-Sutcliffe Efficiency (NSE) = 0.85, Correlation ($r$) = 0.92), so reducing RMSE by 20% over ConvLSTM. CNN-LSTM and ConvLSTM do well in Continental and Temperate zones; ViT finds fine-scale temperature fluctuations difficult. These results confirm that multi-criteria ranking improves GCM selection for regional climate studies and transformer-based downscaling exceeds conventional deep-learning methods. This framework offers a scalable method to enhance high-resolution climate projections, benefiting impact assessments and adaptation plans.

architecture, climate zone, cmip6 model, (15 more...)

arXiv.org Artificial Intelligence

2502.20132

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States (0.05)
Europe > France (0.04)
(15 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Energy (0.93)
Media > Television (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhanced Anomaly Detection in IoMT Networks using Ensemble AI Models on the CICIoMT2024 Dataset

Chandekar, Prathamesh, Mehta, Mansi, Chandan, Swet

arXiv.org Artificial IntelligenceFeb-17-2025

The rapid proliferation of Internet of Medical Things (IoMT) devices in healthcare has introduced unique cybersecurity challenges, primarily due to the diverse communication protocols and critical nature of these devices This research aims to develop an advanced, real-time anomaly detection framework tailored for IoMT network traffic, leveraging AI/ML models and the CICIoMT2024 dataset By integrating multi-protocol (MQTT, WiFi), attack-specific (DoS, DDoS), time-series (active/idle states), and device-specific (Bluetooth) data, our study captures a comprehensive range of IoMT interactions As part of our data analysis, various machine learning techniques are employed which include an ensemble model using XGBoost for improved performance against specific attack types, sequential models comprised of LSTM and CNN-LSTM that leverage time dependencies, and unsupervised models such as Autoencoders and Isolation Forest that are good in general anomaly detection The results of the experiment prove with an ensemble model lowers false positive rates and reduced detections.

artificial intelligence, data mining, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2502.11854

Country: Asia > India (0.15)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government > Military (0.89)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Optimal Signal Decomposition-based Multi-Stage Learning for Battery Health Estimation

Pamshetti, Vijay Babu, Zhang, Wei, Tseng, King Jet, Ng, Bor Kiat, Yan, Qingyu

arXiv.org Artificial IntelligenceJan-23-2025

Battery health estimation is fundamental to ensure battery safety and reduce cost. However, achieving accurate estimation has been challenging due to the batteries' complex nonlinear aging patterns and capacity regeneration phenomena. In this paper, we propose OSL, an optimal signal decomposition-based multi-stage machine learning for battery health estimation. OSL treats battery signals optimally. It uses optimized variational mode decomposition to extract decomposed signals capturing different frequency bands of the original battery signals. It also incorporates a multi-stage learning process to analyze both spatial and temporal battery features effectively. An experimental study is conducted with a public battery aging dataset. OSL demonstrates exceptional performance with a mean error of just 0.26%. It significantly outperforms comparison algorithms, both those without and those with suboptimal signal decomposition and analysis. OSL considers practical battery challenges and can be integrated into real-world battery management systems, offering a good impact on battery monitoring and optimization.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.16377

Country:

North America > United States (0.15)
Asia > Singapore (0.06)

Genre: Research Report > New Finding (0.89)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Hybrid CNN-LSTM based Indoor Pedestrian Localization with CSI Fingerprint Maps

Emad-ud-din, Muhammad

arXiv.org Artificial IntelligenceDec-29-2024

The paper presents a novel Wi-Fi fingerprinting system that uses Channel State Information (CSI) data for fine-grained pedestrian localization. The proposed system exploits the frequency diversity and spatial diversity of the features extracted from CSI data to generate a 2D+channel image termed as a CSI Fingerprint Map. We then use this CSI Fingerprint Map representation of CSI data to generate a pedestrian trajectory hypothesis using a hybrid architecture that combines a Convolutional Neural Network and a Long Short-Term Memory Recurrent Neural Network model. The proposed architecture exploits the temporal and spatial relationship information among the CSI data observations gathered at neighboring locations. A particle filter is then employed to separate out the most likely hypothesis matching a human walk model. The experimental performance of our method is compared to existing deep learning localization methods such ConFi, DeepFi and to a self-developed temporal-feature based LSTM based location classifier. The experimental results show marked improvement with an average RMSE of 0.36 m in a moderately dynamic and 0.17 m in a static environment. Our method is essentially a proof of concept that with (1) sparse availability of observations, (2) limited infrastructure requirements, (3) moderate level of short-term and long-term noise in the training and testing environment, reliable fine-grained Wi-Fi based pedestrian localization is a potential option.

artificial intelligence, localization, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.13601

Country:

Europe (0.93)
North America > United States > Texas (0.28)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Machine Learning-based sEMG Signal Classification for Hand Gesture Recognition

Aarotale, Parshuram N., Rattani, Ajita

arXiv.org Artificial IntelligenceNov-23-2024

EMG-based hand gesture recognition uses electromyographic~(EMG) signals to interpret and classify hand movements by analyzing electrical activity generated by muscle contractions. It has wide applications in prosthesis control, rehabilitation training, and human-computer interaction. Using electrodes placed on the skin, the EMG sensor captures muscle signals, which are processed and filtered to reduce noise. Numerous feature extraction and machine learning algorithms have been proposed to extract and classify muscle signals to distinguish between various hand gestures. This paper aims to benchmark the performance of EMG-based hand gesture recognition using novel feature extraction methods, namely, fused time-domain descriptors, temporal-spatial descriptors, and wavelet transform-based features, combined with the state-of-the-art machine and deep learning models. Experimental investigations on the Grabmyo dataset demonstrate that the 1D Dilated CNN performed the best with an accuracy of $97\%$ using fused time-domain descriptors such as power spectral moments, sparsity, irregularity factor and waveform length ratio. Similarly, on the FORS-EMG dataset, random forest performed the best with an accuracy of $94.95\%$ using temporal-spatial descriptors (which include time domain features along with additional features such as coefficient of variation (COV), and Teager-Kaiser energy operator (TKEO)).

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.15655

Country:

North America > United States > Texas (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Gesture Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Comparative Study of Convolutional and Recurrent Neural Networks for Storm Surge Prediction in Tampa Bay

Ghahfarokhi, Mandana Farhang, Sonbolestan, Seyed Hossein, Zamanizadeh, Mahta

arXiv.org Artificial IntelligenceAug-11-2024

In this paper, we compare the performance of three common deep learning architectures, CNN-LSTM, LSTM, and 3D-CNN, in the context of surrogate storm surge modeling. The study site for this paper is the Tampa Bay area in Florida. Using high-resolution atmospheric data from the reanalysis models and historical water level data from NOAA tide stations, we trained and tested these models to evaluate their performance. Our findings indicate that the CNN-LSTM model outperforms the other architectures, achieving a test loss of 0.010 and an R-squared (R2) score of 0.84. The LSTM model, although it achieved the lowest training loss of 0.007 and the highest training R2 of 0.88, exhibited poorer generalization with a test loss of 0.014 and an R2 of 0.77. The 3D-CNN model showed reasonable performance with a test loss of 0.011 and an R2 of 0.82 but displayed instability under extreme conditions. A case study on Hurricane Ian, which caused a significant negative surge of -1.5 meters in Tampa Bay indicates the CNN-LSTM model's robustness and accuracy in extreme scenarios.

architecture, prediction, surge, (16 more...)

arXiv.org Artificial Intelligence

2408.05797

Country:

North America > United States > Virginia > Norfolk City County > Norfolk (0.05)
North America > United States > Florida > Pinellas County > St. Petersburg (0.05)
North America > Mexico (0.05)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback