cnn-lstm
Reconnaissance Automatique des Langues des Signes : Une Approche Hybridée CNN-LSTM Basée sur Mediapipe
Takouchouang, Fraisse Sacré, Vinh, Ho Tuong
Sign languages play a crucial role in the communication of deaf communities, but they are often marginalized, limiting access to essential services such as healthcare and education. This study proposes an automatic sign language recognition system based on a hybrid CNN-LSTM architecture, using Mediapipe for gesture keypoint extraction. Developed with Python, TensorFlow and Streamlit, the system provides real-time gesture translation. The results show an average accuracy of 92\%, with very good performance for distinct gestures such as ``Hello'' and ``Thank you''. However, some confusions remain for visually similar gestures, such as ``Call'' and ``Yes''. This work opens up interesting perspectives for applications in various fields such as healthcare, education and public services.
Breathing and Semantic Pause Detection and Exertion-Level Classification in Post-Exercise Speech
Wang, Yuyu, Xia, Wuyue, Yao, Huaxiu, Nie, Jingping
Post-exercise speech contains rich physiological and linguistic cues, often marked by semantic pauses, breathing pauses, and combined breathing-semantic pauses. Detecting these events enables assessment of recovery rate, lung function, and exertion-related abnormalities. However, existing works on identifying and distinguishing different types of pauses in this context are limited. In this work, building on a recently released dataset with synchronized audio and respiration signals, we provide systematic annotations of pause types. Using these annotations, we systematically conduct exploratory breathing and semantic pause detection and exertion-level classification across deep learning models (GRU, 1D CNN-LSTM, AlexNet, VGG16), acoustic features (MFCC, MFB), and layer-stratified Wav2Vec2 representations. We evaluate three setups-single feature, feature fusion, and a two-stage detection-classification cascade-under both classification and regression formulations. Results show per-type detection accuracy up to 89$\%$ for semantic, 55$\%$ for breathing, 86$\%$ for combined pauses, and 73$\%$overall, while exertion-level classification achieves 90.5$\%$ accuracy, outperformin prior work.
- Asia > China > Hong Kong (0.05)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Imitation Learning for Autonomous Driving: Insights from Real-World Testing
Dursun, Hidayet Ersin, Güven, Yusuf, Kumbasar, Tufan
This work focuses on the design of a deep learning-based autonomous driving system deployed and tested on the real-world MIT Racecar to assess its effectiveness in driving scenarios. The Deep Neural Network (DNN) translates raw image inputs into real-time steering commands in an end-to-end learning fashion, following the imitation learning framework. The key design challenge is to ensure that DNN predictions are accurate and fast enough, at a high sampling frequency, and result in smooth vehicle operation under different operating conditions. In this study, we design and compare various DNNs, to identify the most effective approach for real-time autonomous driving. In designing the DNNs, we adopted an incremental design approach that involved enhancing the model capacity and dataset to address the challenges of real-world driving scenarios. We designed a PD system, CNN, CNN-LSTM, and CNN-NODE, and evaluated their performance on the real-world MIT Racecar. While the PD system handled basic lane following, it struggled with sharp turns and lighting variations. The CNN improved steering but lacked temporal awareness, which the CNN-LSTM addressed as it resulted in smooth driving performance. The CNN-NODE performed similarly to the CNN-LSTM in handling driving dynamics, yet with slightly better driving performance. The findings of this research highlight the importance of iterative design processes in developing robust DNNs for autonomous driving applications. The experimental video is available at https://www.youtube.com/watch?v=FNNYgU--iaY.
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Robotics & Automation (0.91)
- Leisure & Entertainment > Sports > Motorsports (0.71)
ICPR 2024 Competition on Rider Intention Prediction
Gangisetty, Shankar, Wasi, Abdul, Rai, Shyam Nandan, Jawahar, C. V., Raj, Sajay, Prajapati, Manish, Choudhary, Ayesha, Chandra, Aaryadev, Chandan, Dev, Chand, Shireen, Mukherjee, Suvaditya
The recent surge in the vehicle market has led to an alarming increase in road accidents. This underscores the critical importance of enhancing road safety measures, particularly for vulnerable road users like motorcyclists. Hence, we introduce the rider intention prediction (RIP) competition that aims to address challenges in rider safety by proactively predicting maneuvers before they occur, thereby strengthening rider safety. This capability enables the riders to react to the potential incorrect maneuvers flagged by advanced driver assistance systems (ADAS). We collect a new dataset, namely, rider action anticipation dataset (RAAD) for the competition consisting of two tasks: single-view RIP and multi-view RIP. The dataset incorporates a spectrum of traffic conditions and challenging navigational maneuvers on roads with varying lighting conditions. For the competition, we received seventy-five registrations and five team submissions for inference of which we compared the methods of the top three performing teams on both the RIP tasks: one state-space model (Mamba2) and two learning-based approaches (SVM and CNN-LSTM). The results indicate that the state-space model outperformed the other methods across the entire dataset, providing a balanced performance across maneuver classes. The SVM-based RIP method showed the second-best performance when using random sampling and SMOTE. However, the CNN-LSTM method underperformed, primarily due to class imbalance issues, particularly struggling with minority classes. This paper details the proposed RAAD dataset and provides a summary of the submissions for the RIP 2024 competition.
- Research Report (0.64)
- Contests & Prizes (0.62)
- Overview (0.48)
Regional climate projections using a deep-learning-based model-ranking and downscaling framework: Application to European climate zones
Loganathan, Parthiban, Zea, Elias, Vinuesa, Ricardo, Otero, Evelyn
Accurate regional climate forecast calls for high-resolution downscaling of Global Climate Models (GCMs). This work presents a deep-learning-based multi-model evaluation and downscaling framework ranking 32 Coupled Model Intercomparison Project Phase 6 (CMIP6) models using a Deep Learning-TOPSIS (DL-TOPSIS) mechanism and so refines outputs using advanced deep-learning models. Using nine performance criteria, five K\"oppen-Geiger climate zones -- Tropical, Arid, Temperate, Continental, and Polar -- are investigated over four seasons. While TaiESM1 and CMCC-CM2-SR5 show notable biases, ranking results show that NorESM2-LM, GISS-E2-1-G, and HadGEM3-GC31-LL outperform other models. Four models contribute to downscaling the top-ranked GCMs to 0.1$^{\circ}$ resolution: Vision Transformer (ViT), Geospatial Spatiotemporal Transformer with Attention and Imbalance-Aware Network (GeoSTANet), CNN-LSTM, and CNN-Long Short-Term Memory (ConvLSTM). Effectively capturing temperature extremes (TXx, TNn), GeoSTANet achieves the highest accuracy (Root Mean Square Error (RMSE) = 1.57$^{\circ}$C, Kling-Gupta Efficiency (KGE) = 0.89, Nash-Sutcliffe Efficiency (NSE) = 0.85, Correlation ($r$) = 0.92), so reducing RMSE by 20% over ConvLSTM. CNN-LSTM and ConvLSTM do well in Continental and Temperate zones; ViT finds fine-scale temperature fluctuations difficult. These results confirm that multi-criteria ranking improves GCM selection for regional climate studies and transformer-based downscaling exceeds conventional deep-learning methods. This framework offers a scalable method to enhance high-resolution climate projections, benefiting impact assessments and adaptation plans.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States (0.05)
- Europe > France (0.04)
- (16 more...)
- Energy (0.93)
- Media > Television (0.34)
Enhanced Anomaly Detection in IoMT Networks using Ensemble AI Models on the CICIoMT2024 Dataset
Chandekar, Prathamesh, Mehta, Mansi, Chandan, Swet
The rapid proliferation of Internet of Medical Things (IoMT) devices in healthcare has introduced unique cybersecurity challenges, primarily due to the diverse communication protocols and critical nature of these devices This research aims to develop an advanced, real-time anomaly detection framework tailored for IoMT network traffic, leveraging AI/ML models and the CICIoMT2024 dataset By integrating multi-protocol (MQTT, WiFi), attack-specific (DoS, DDoS), time-series (active/idle states), and device-specific (Bluetooth) data, our study captures a comprehensive range of IoMT interactions As part of our data analysis, various machine learning techniques are employed which include an ensemble model using XGBoost for improved performance against specific attack types, sequential models comprised of LSTM and CNN-LSTM that leverage time dependencies, and unsupervised models such as Autoencoders and Isolation Forest that are good in general anomaly detection The results of the experiment prove with an ensemble model lowers false positive rates and reduced detections.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government > Military (0.89)
Optimal Signal Decomposition-based Multi-Stage Learning for Battery Health Estimation
Pamshetti, Vijay Babu, Zhang, Wei, Tseng, King Jet, Ng, Bor Kiat, Yan, Qingyu
Battery health estimation is fundamental to ensure battery safety and reduce cost. However, achieving accurate estimation has been challenging due to the batteries' complex nonlinear aging patterns and capacity regeneration phenomena. In this paper, we propose OSL, an optimal signal decomposition-based multi-stage machine learning for battery health estimation. OSL treats battery signals optimally. It uses optimized variational mode decomposition to extract decomposed signals capturing different frequency bands of the original battery signals. It also incorporates a multi-stage learning process to analyze both spatial and temporal battery features effectively. An experimental study is conducted with a public battery aging dataset. OSL demonstrates exceptional performance with a mean error of just 0.26%. It significantly outperforms comparison algorithms, both those without and those with suboptimal signal decomposition and analysis. OSL considers practical battery challenges and can be integrated into real-world battery management systems, offering a good impact on battery monitoring and optimization.
- North America > United States (0.15)
- Asia > Singapore (0.06)
- Energy > Energy Storage (1.00)
- Electrical Industrial Apparatus (1.00)
Hybrid CNN-LSTM based Indoor Pedestrian Localization with CSI Fingerprint Maps
The paper presents a novel Wi-Fi fingerprinting system that uses Channel State Information (CSI) data for fine-grained pedestrian localization. The proposed system exploits the frequency diversity and spatial diversity of the features extracted from CSI data to generate a 2D+channel image termed as a CSI Fingerprint Map. We then use this CSI Fingerprint Map representation of CSI data to generate a pedestrian trajectory hypothesis using a hybrid architecture that combines a Convolutional Neural Network and a Long Short-Term Memory Recurrent Neural Network model. The proposed architecture exploits the temporal and spatial relationship information among the CSI data observations gathered at neighboring locations. A particle filter is then employed to separate out the most likely hypothesis matching a human walk model. The experimental performance of our method is compared to existing deep learning localization methods such ConFi, DeepFi and to a self-developed temporal-feature based LSTM based location classifier. The experimental results show marked improvement with an average RMSE of 0.36 m in a moderately dynamic and 0.17 m in a static environment. Our method is essentially a proof of concept that with (1) sparse availability of observations, (2) limited infrastructure requirements, (3) moderate level of short-term and long-term noise in the training and testing environment, reliable fine-grained Wi-Fi based pedestrian localization is a potential option.
- North America > United States > Texas > Brazos County > College Station (0.14)
- North America > United States > Missouri > Jackson County > Kansas City (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (5 more...)
Machine Learning-based sEMG Signal Classification for Hand Gesture Recognition
Aarotale, Parshuram N., Rattani, Ajita
EMG-based hand gesture recognition uses electromyographic~(EMG) signals to interpret and classify hand movements by analyzing electrical activity generated by muscle contractions. It has wide applications in prosthesis control, rehabilitation training, and human-computer interaction. Using electrodes placed on the skin, the EMG sensor captures muscle signals, which are processed and filtered to reduce noise. Numerous feature extraction and machine learning algorithms have been proposed to extract and classify muscle signals to distinguish between various hand gestures. This paper aims to benchmark the performance of EMG-based hand gesture recognition using novel feature extraction methods, namely, fused time-domain descriptors, temporal-spatial descriptors, and wavelet transform-based features, combined with the state-of-the-art machine and deep learning models. Experimental investigations on the Grabmyo dataset demonstrate that the 1D Dilated CNN performed the best with an accuracy of $97\%$ using fused time-domain descriptors such as power spectral moments, sparsity, irregularity factor and waveform length ratio. Similarly, on the FORS-EMG dataset, random forest performed the best with an accuracy of $94.95\%$ using temporal-spatial descriptors (which include time domain features along with additional features such as coefficient of variation (COV), and Teager-Kaiser energy operator (TKEO)).
- North America > United States > Texas (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Health & Medicine > Therapeutic Area > Neurology (0.68)
- Health & Medicine > Health Care Technology (0.68)
A Comparative Study of Convolutional and Recurrent Neural Networks for Storm Surge Prediction in Tampa Bay
Ghahfarokhi, Mandana Farhang, Sonbolestan, Seyed Hossein, Zamanizadeh, Mahta
In this paper, we compare the performance of three common deep learning architectures, CNN-LSTM, LSTM, and 3D-CNN, in the context of surrogate storm surge modeling. The study site for this paper is the Tampa Bay area in Florida. Using high-resolution atmospheric data from the reanalysis models and historical water level data from NOAA tide stations, we trained and tested these models to evaluate their performance. Our findings indicate that the CNN-LSTM model outperforms the other architectures, achieving a test loss of 0.010 and an R-squared (R2) score of 0.84. The LSTM model, although it achieved the lowest training loss of 0.007 and the highest training R2 of 0.88, exhibited poorer generalization with a test loss of 0.014 and an R2 of 0.77. The 3D-CNN model showed reasonable performance with a test loss of 0.011 and an R2 of 0.82 but displayed instability under extreme conditions. A case study on Hurricane Ian, which caused a significant negative surge of -1.5 meters in Tampa Bay indicates the CNN-LSTM model's robustness and accuracy in extreme scenarios.
- North America > United States > Virginia > Norfolk City County > Norfolk (0.05)
- North America > United States > Florida > Pinellas County > St. Petersburg (0.05)
- North America > Mexico (0.05)
- (2 more...)