Goto

Collaborating Authors

 bi-lstm


Improving Wi-Fi Network Performance Prediction with Deep Learning Models

arXiv.org Artificial Intelligence

Abstract--The increasing need for robustness, reliability, and determinism in wireless networks for industrial and mission-critical applications is the driver for the growth of new innovative methods. The study presented in this work makes use of machine learning techniques to predict channel quality in a Wi-Fi network in terms of the frame delivery ratio. Predictions can be used proactively to adjust communication parameters at runtime and optimize network operations for industrial applications. Methods including convolutional neural networks and long short-term memory were analyzed on datasets acquired from a real Wi-Fi setup across multiple channels. The models were compared in terms of prediction accuracy and computational complexity. Results show that the frame delivery ratio can be reliably predicted, and convolutional neural networks, although slightly less effective than other models, are more efficient in terms of CPU usage and memory consumption. This enhances the model's usability on embedded and industrial systems. Robustness and dependability are the main challenges in next-generation communication systems, especially in wireless networks for industrial applications like Wi-Fi [1], but also in the context of smart cities and buildings, transportation, and agriculture.



L-GTA: Latent Generative Modeling for Time Series Augmentation

arXiv.org Artificial Intelligence

Data augmentation is gaining importance across various aspects of time series analysis, from forecasting to classification and anomaly detection tasks. We introduce the Latent Generative Transformer Augmentation (L-GTA) model, a generative approach using a transformer-based variational recurrent autoencoder. This model uses controlled transformations within the latent space of the model to generate new time series that preserve the intrinsic properties of the original dataset. L-GTA enables the application of diverse transformations, ranging from simple jittering to magnitude warping, and combining these basic transformations to generate more complex synthetic time series datasets. Our evaluation of several real-world datasets demonstrates the ability of L-GTA to produce more reliable, consistent, and controllable augmented data. This translates into significant improvements in predictive accuracy and similarity measures compared to direct transformation methods.


IoT Malware Network Traffic Detection using Deep Learning and GraphSAGE Models

arXiv.org Artificial Intelligence

This paper intends to detect IoT malicious attacks through deep learning models and demonstrates a comprehensive evaluation of the deep learning and graph-based models regarding malicious network traffic detection. The models particularly are based on GraphSAGE, Bidirectional encoder representations from transformers (BERT), Temporal Convolutional Network (TCN) as well as Multi-Head Attention, together with Bidirectional Long Short-Term Memory (BI-LSTM) Multi-Head Attention and BI-LSTM and LSTM models. The chosen models demonstrated great performance to model temporal patterns and detect feature significance. The observed performance are mainly due to the fact that IoT system traffic patterns are both sequential and diverse, leaving a rich set of temporal patterns for the models to learn. Experimental results showed that BERT maintained the best performance. It achieved 99.94% accuracy rate alongside high precision and recall, F1-score and AUC-ROC score of 99.99% which demonstrates its capabilities through temporal dependency capture. The Multi-Head Attention offered promising results by providing good detection capabilities with interpretable results. On the other side, the Multi-Head Attention model required significant processing time like BI-LSTM variants. The GraphSAGE model achieved good accuracy while requiring the shortest training time but yielded the lowest accuracy, precision, and F1 score compared to the other models


Network evasion detection with Bi-LSTM model

arXiv.org Artificial Intelligence

Network evasion is a way to disguise data traffic by confusing network intrusion detection systems. Network evasion detection is designed to distinguish whether a network traffic from the link layer poses a threat to the network or not. At present, the traditional network evasion detection method does not extract the characteristics of network traffic and the detection accuracy is relatively low. In this paper, a novel network evasion detection framework has been proposed to detect eight atomic evasion behaviors which are based on deep recurrent neural network. Firstly, inter-packet and intra-packet features are extracted from network traces. Then a bidirectional long short-term memory (Bi-LSTM) neural network is trained to encode both the past and the future traits of the network traces. Finally, on the top of the Bi-LSTM network, a Softmax layer is used to classify the trace into the correct evasion class. The experimental results show that the average detection accuracy of the framework reaches 96.1%.


Multi-Lingual Cyber Threat Detection in Tweets/X Using ML, DL, and LLM: A Comparative Analysis

arXiv.org Artificial Intelligence

Cyber threat detection has become an important area of focus in today's digital age due to the growing spread of fake information and harmful content on social media platforms such as Twitter (now 'X'). These cyber threats, often disguised within tweets, pose significant risks to individuals, communities, and even nations, emphasizing the need for effective detection systems. While previous research has explored tweet-based threats, much of the work is limited to specific languages, domains, or locations, or relies on single-model approaches, reducing their applicability to diverse real-world scenarios. To address these gaps, our study focuses on multi-lingual tweet cyber threat detection using a variety of advanced models. The research was conducted in three stages: (1) We collected and labeled tweet datasets in four languages English, Chinese, Russian, and Arabic employing both manual and polarity-based labeling methods to ensure high-quality annotations. (2) Each dataset was analyzed individually using machine learning (ML) and deep learning (DL) models to assess their performance on distinct languages. (3) Finally, we combined all four datasets into a single multi-lingual dataset and applied DL and large language model (LLM) architectures to evaluate their efficacy in identifying cyber threats across various languages. Our results show that among machine learning models, Random Forest (RF) attained the highest performance; however, the Bi-LSTM architecture consistently surpassed other DL and LLM architectures across all datasets. These findings underline the effectiveness of Bi-LSTM in multilingual cyber threat detection. The code for this paper can be found at this link: https://github.com/Mmurrad/Tweet-Data-Classification.git.


Intra-day Solar and Power Forecast for Optimization of Intraday Market Participation

arXiv.org Artificial Intelligence

The prediction of solar irradiance enhances reliability in photovoltaic (PV) solar plant generation and grid integration. In Colombia, PV plants face penalties if energy production deviates beyond governmental thresholds from intraday market offers. This research employs Long Short-Term Memory (LSTM) and Bidirectional-LSTM (Bi-LSTM) models, utilizing meteorological data from a PV plant in El Paso, Cesar, Colombia, to predict solar irradiance with a 6-hour horizon and 10-minute resolution. While Bi-LSTM showed superior performance, the LSTM model achieved comparable results with significantly reduced training time (6 hours versus 18 hours), making it computationally advantageous. The LSTM predictions were averaged to create an hourly resolution model, evaluated using Mean Absolute Error, Root-Mean-Square Error, Normalized Root-Mean-Square Error, and Mean Absolute Percentage Error metrics. Comparison with the Global Forecast System (GFS) revealed similar performance, with both models effectively capturing daily solar irradiance patterns. The forecast model integrates with an Object-Oriented power production model, enabling accurate energy offers in the intraday market while minimizing penalty costs.


CryptoMamba: Leveraging State Space Models for Accurate Bitcoin Price Prediction

arXiv.org Artificial Intelligence

Predicting Bitcoin price remains a challenging problem due to the high volatility and complex non-linear dynamics of cryptocurrency markets. Traditional time-series models, such as ARIMA and GARCH, and recurrent neural networks, like LSTMs, have been widely applied to this task but struggle to capture the regime shifts and long-range dependencies inherent in the data. In this work, we propose CryptoMamba, a novel Mamba-based State Space Model (SSM) architecture designed to effectively capture long-range dependencies in financial time-series data. Our experiments show that CryptoMamba not only provides more accurate predictions but also offers enhanced generalizability across different market conditions, surpassing the limitations of previous models. Coupled with trading algorithms for real-world scenarios, CryptoMamba demonstrates its practical utility by translating accurate forecasts into financial outcomes. Our findings signal a huge advantage for SSMs in stock and cryptocurrency price forecasting tasks.


Utilizing RNN for Real-time Cryptocurrency Price Prediction and Trading Strategy Optimization

arXiv.org Machine Learning

This study explores the use of Recurrent Neural Networks (RNN) for real-time cryptocurrency price prediction and optimized trading strategies. Given the high volatility of the cryptocurrency market, traditional forecasting models often fall short. By leveraging RNNs' capability to capture long-term patterns in time-series data, this research aims to improve accuracy in price prediction and develop effective trading strategies. The project follows a structured approach involving data collection, preprocessing, and model refinement, followed by rigorous backtesting for profitability and risk assessment. This work contributes to both the academic and practical fields by providing a robust predictive model and optimized trading strategies that address the challenges of cryptocurrency trading.


Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study

arXiv.org Artificial Intelligence

Electronic Health Records are large repositories of valuable clinical data, with a significant portion stored in unstructured text format. This textual data includes clinical events (e.g., disorders, symptoms, findings, medications and procedures) in context that if extracted accurately at scale can unlock valuable downstream applications such as disease prediction. Using an existing Named Entity Recognition and Linking methodology, MedCAT, these identified concepts need to be further classified (contextualised) for their relevance to the patient, and their temporal and negated status for example, to be useful downstream. This study performs a comparative analysis of various natural language models for medical text classification. Extensive experimentation reveals the effectiveness of transformer-based language models, particularly BERT. When combined with class imbalance mitigation techniques, BERT outperforms Bi-LSTM models by up to 28% and the baseline BERT model by up to 16% for recall of the minority classes. The method has been implemented as part of CogStack/MedCAT framework and made available to the community for further research.