AITopics | lstm layer

lstm layer

Deep Dictionary-Free Method for Identifying Linear Model of Nonlinear System with Input Delay

Valábek, Patrik, Wadinger, Marek, Kvasnica, Michal, Klaučo, Martin

arXiv.org Artificial Intelligence, Nov-7-2025

Nonlinear dynamical systems with input delays pose significant challenges for prediction, estimation, and control due to their inherent complexity and the impact of delays on system behavior. Traditional linear control techniques often fail in these contexts, necessitating innovative approaches. This paper introduces a novel approach to approximate the Koopman operator using an LSTM-enhanced Deep Koopman model, enabling linear representations of nonlinear systems with time delays. By incorporating Long Short-Term Memory (LSTM) layers, the proposed framework captures historical dependencies and efficiently encodes time-delayed system dynamics into a latent space. Unlike traditional extended Dynamic Mode Decomposition (eDMD) approaches that rely on predefined dictionaries, the LSTM-enhanced Deep Koopman model is dictionary-free, which removes the need for the underlying dynamics to be known and incorporated into the dictionary. Quantitative comparisons with eDMD on a simulated system demonstrate highly significant gains in prediction accuracy when the true nonlinear dynamics are unknown, and results comparable to eDMD when the dynamics of the system are known.

artificial intelligence, edmd, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.04451

Country:

Europe > Slovakia > Bratislava > Bratislava (0.05)
Europe > Czechia > Prague (0.04)

Genre:

Research Report > Promising Solution (0.68)
Overview > Innovation (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adaptive Online Learning with LSTM Networks for Energy Price Prediction

Salihoglu, Salih, Ahmed, Ibrahim, Asadi, Afshin

arXiv.org Artificial Intelligence, Oct-21-2025

Accurate prediction of electricity prices is crucial for stakeholders in the energy market, particularly for grid operators, energy producers, and consumers. This study focuses on developing a predictive model leveraging Long Short-Term Memory (LSTM) networks to forecast day-ahead electricity prices in the California energy market. The model incorporates a variety of features, including historical price data, weather conditions, and the energy generation mix. A novel custom loss function that integrates Mean Absolute Error (MAE), Jensen-Shannon Divergence (JSD), and a smoothness penalty is introduced to enhance the prediction accuracy and interpretability. Additionally, an online learning approach is implemented to allow the model to adapt to new data incrementally, ensuring continuous relevance and accuracy. The results demonstrate that the custom loss function can improve the model's performance, aligning predicted prices more closely with actual values, particularly during peak intervals. Also, the online learning model outperforms other models by effectively incorporating real-time data, resulting in lower prediction error and variability. The inclusion of the energy generation mix further enhances the model's predictive capabilities, highlighting the importance of comprehensive feature integration. This research provides a robust framework for electricity price forecasting, offering valuable insights and tools for better decision-making in dynamic electricity markets.
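
As a rough illustration of the loss described in the abstract above, the following sketch combines MAE, Jensen-Shannon divergence, and a smoothness penalty into a single scalar. The abstract does not give the exact formulation, so everything specific here is an assumption: the weights `w_mae`, `w_jsd`, and `w_smooth`, the min-shift normalization used to treat a price curve as a distribution, and the choice of mean squared first differences as the smoothness term.

```python
import math


def _normalize(values, eps=1e-12):
    """Shift values to be positive and scale them to sum to 1.

    Assumption: this is one simple way to view a price curve as a
    probability distribution; the paper may do it differently.
    """
    low = min(values)
    shifted = [v - low + eps for v in values]
    total = sum(shifted)
    return [s / total for s in shifted]


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def smoothness_penalty(y_pred):
    """Mean squared first difference of the predictions (assumed form)."""
    if len(y_pred) < 2:
        return 0.0
    diffs = [b - a for a, b in zip(y_pred, y_pred[1:])]
    return sum(d * d for d in diffs) / len(diffs)


def composite_loss(y_true, y_pred, w_mae=1.0, w_jsd=1.0, w_smooth=0.1):
    """Weighted sum of the three terms; the weights are hypothetical."""
    jsd = js_divergence(_normalize(y_true), _normalize(y_pred))
    return (w_mae * mae(y_true, y_pred)
            + w_jsd * jsd
            + w_smooth * smoothness_penalty(y_pred))
```

In a training setup the same three terms would be written with differentiable tensor operations; the plain-Python version only shows how the pieces fit together.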

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2510.16898

Country:

North America > United States > California (0.34)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Oceania > Australia (0.04)
(8 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Power Industry (1.00)
Education > Educational Setting > Online (0.85)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

High Cycle S-N curve prediction for Al 7075-T6 alloy using Recurrent Neural Networks (RNNs)

Patel, Aryan

arXiv.org Artificial Intelligence, Oct-7-2025

Aluminum is a widely used alloy, which is susceptible to fatigue failure. Characterizing fatigue performance for materials is extremely time-consuming and costly, especially for high cycle data. To help mitigate this, a transfer learning based framework has been developed using Long Short-Term Memory networks (LSTMs), in which a source LSTM model is trained on pure axial fatigue data for Aluminum 7075-T6 alloy and then transferred to predict high cycle torsional S-N curves. The framework was able to accurately predict Al torsional S-N curves for a much higher cycle range. It is believed that this framework will help to drastically reduce the cost of gathering fatigue characteristics for different materials and help prioritize tests with better cost and time constraints.

artificial intelligence, machine learning, neural network, (15 more...)

arXiv.org Artificial Intelligence

2510.03355

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.50)

Industry: Materials > Metals & Mining > Aluminum (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Comprehensive Attribute Encoding and Dynamic LSTM HyperModels for Outcome Oriented Predictive Business Process Monitoring

Wang, Fang, Ceravolo, Paolo, Damiani, Ernesto

arXiv.org Artificial Intelligence, Aug-6-2025

Predictive Business Process Monitoring (PBPM) aims to forecast future outcomes of ongoing business processes. However, existing methods often lack flexibility to handle real-world challenges such as simultaneous events, class imbalance, and multi-level attributes. While prior work has explored static encoding schemes and fixed LSTM architectures, these struggle to support adaptive representations and to generalize across heterogeneous datasets. To address these limitations, we propose a suite of dynamic LSTM HyperModels that integrate two-level hierarchical encoding for event and sequence attributes, character-based decomposition of event labels, and novel pseudo-embedding techniques for durations and attribute correlations. We further introduce specialized LSTM variants for simultaneous event modeling, leveraging multidimensional embeddings and time-difference flag augmentation. Experimental validation on four public and real-world datasets demonstrates up to 100% accuracy on balanced datasets and F1 scores exceeding 86% on imbalanced ones. Our approach advances PBPM by offering modular and interpretable models better suited for deployment in complex settings. Beyond PBPM, it contributes to the broader AI community by improving temporal outcome prediction, supporting data heterogeneity, and promoting explainable process intelligence frameworks. Impact Statement: Business processes underpin daily operations across healthcare, finance, public services, and logistics. Predicting the outcome of ongoing processes, such as whether a loan will be approved or a shipment delayed, can save time, reduce costs, and improve service. Our work introduces adaptive, interpretable models that overcome these hurdles, making accurate predictions in more realistic settings.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2506.03696

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > Austria > Vienna (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(9 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Prediction of Lane Change Intentions of Human Drivers using an LSTM, a CNN and a Transformer

De Cristofaro, Francesco, Hofbaur, Felix, Yang, Aixi, Eichberger, Arno

arXiv.org Artificial Intelligence, Jul-14-2025

Lane changes of preceding vehicles have a great impact on the motion planning of automated vehicles, especially in complex traffic situations. Predicting them would benefit the public in terms of safety and efficiency. While many research efforts have been made in this direction, few have concentrated on predicting maneuvers within a set time interval rather than at a set prediction time. In addition, there is a lack of comparisons between different architectures that try to determine the best performing one and to assess how to correctly choose the input for such models. In this paper, the structures of an LSTM, a CNN and a Transformer network are described and implemented to predict the intention of human drivers to perform a lane change. We show how the data was prepared starting from a publicly available dataset (highD), which features were used, and how the networks were designed, and finally we compare the results of the three networks with different configurations of input data. We found that transformer networks performed better than the other networks and were less affected by overfitting. The accuracy of the method ranged from 82.79% to 96.73% for different input configurations, and the method showed good overall performance when precision and recall are also considered.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.08365

Country:

North America > United States (0.15)
Europe > Austria > Styria > Graz (0.05)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > China > Zhejiang Province (0.04)

Genre: Research Report (0.50)

Industry:

Automobiles & Trucks (0.93)
Transportation (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Enhancing Learning Path Recommendation via Multi-task Learning

Nasrin, Afsana, Qian, Lijun, Obiomon, Pamela, Dong, Xishuang

arXiv.org Artificial Intelligence, Jul-9-2025

Personalized learning is a student-centered educational approach that adapts content, pace, and assessment to meet each learner's unique needs. As the key technique for implementing personalized learning, learning path recommendation sequentially recommends personalized learning items such as lectures and exercises. Advances in deep learning, particularly deep reinforcement learning, have made modeling such recommendations more practical and effective. This paper proposes a multi-task LSTM model that enhances learning path recommendation by leveraging shared information across tasks. The approach reframes learning path recommendation as a sequence-to-sequence (Seq2Seq) prediction problem, generating personalized learning paths from a learner's historical interactions. The model uses a shared LSTM layer to capture common features for both learning path recommendation and deep knowledge tracing, along with task-specific LSTM layers for each objective. To avoid redundant recommendations, a non-repeat loss penalizes repeated items within the recommended learning path. Experiments on the ASSIST09 dataset show that the proposed model significantly outperforms baseline methods for learning path recommendation.
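
To make the idea of a non-repeat loss more concrete, here is a minimal sketch of one possible penalty on repeated items. It is not the formulation from the paper, which the abstract does not spell out. The assumption here is that the model emits, at each step of the path, a probability distribution over items; the hypothetical function `non_repeat_penalty` then sums each item's probability mass across steps and penalizes any mass above one expected occurrence.

```python
def non_repeat_penalty(step_probs):
    """Penalty for recommending the same item more than once.

    step_probs: list of per-step probability distributions, one list of
    item probabilities per position in the recommended path.
    Assumed form: sum over items of max(0, expected_count - 1).
    """
    n_items = len(step_probs[0])
    penalty = 0.0
    for item in range(n_items):
        # Expected number of times this item appears in the path.
        expected_count = sum(step[item] for step in step_probs)
        penalty += max(0.0, expected_count - 1.0)
    return penalty
```

For example, a path that picks three distinct items with certainty incurs no penalty, while a path that picks the same item twice with certainty incurs a penalty of 1.0. Such a term would typically be added, with some weight, to the main sequence prediction loss.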

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.05295

Country:

North America > United States > Texas > Waller County > Prairie View (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FINN-GL: Generalized Mixed-Precision Extensions for FPGA-Accelerated LSTMs

Khandelwal, Shashwat, Petri-Koenig, Jakoba, Preußer, Thomas B., Blott, Michaela, Shanker, Shreejith

arXiv.org Artificial Intelligence, Jun-27-2025

Recurrent neural networks (RNNs), particularly LSTMs, are effective for time-series tasks like sentiment analysis and short-term stock prediction. However, their computational complexity poses challenges for real-time deployment in resource-constrained environments. While FPGAs offer a promising platform for energy-efficient AI acceleration, existing tools mainly target feed-forward networks, and LSTM acceleration typically requires a fully custom implementation. In this paper, we address this gap by leveraging the open-source and extensible FINN framework to enable the generalized deployment of LSTMs on FPGAs. Specifically, we leverage the Scan operator from the Open Neural Network Exchange (ONNX) specification to model the recurrent nature of LSTM computations, enabling support for mixed quantisation within them and functional verification of LSTM-based models. Furthermore, we introduce custom transformations within the FINN compiler to map the quantised ONNX computation graph to hardware blocks from the HLS kernel library of the FINN compiler and Vitis HLS. We validate the proposed tool-flow by training a quantised ConvLSTM model for a mid-price stock prediction task using the widely used dataset and generating a corresponding hardware IP of the model using our flow, targeting the XCZU7EV device. We show that the quantised ConvLSTM accelerator generated through our flow achieves a balance between performance (latency) and resource consumption, while matching (or bettering) the inference accuracy of state-of-the-art models with reduced precision. We believe that the generalisable nature of the proposed flow will pave the way for resource-efficient RNN accelerator designs on FPGAs. Time-series predictions are increasingly gaining recognition for their ability to allow stakeholders to dynamically optimise resource allocation in real-time, in applications such as energy forecasting and stock market trend prediction, among others. The generalisability of machine learning (ML) models, coupled with the increasing availability of high-quality training data, has led to significant improvements in prediction accuracy compared to traditional hand-tuned algorithms [1].

artificial intelligence, machine learning, operator, (19 more...)

arXiv.org Artificial Intelligence

2506.2081

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Predicting E-commerce Purchase Behavior using a DQN-Inspired Deep Learning Model for enhanced adaptability

Jain, Aditi Madhusudan

arXiv.org Artificial Intelligence, Jun-24-2025

This paper presents a novel approach to predicting buying intent and product demand in e-commerce settings, leveraging a Deep Q-Network (DQN) inspired architecture. In the rapidly evolving landscape of online retail, accurate prediction of user behavior is crucial for optimizing inventory management, personalizing user experiences, and maximizing sales. We evaluate our model on a large-scale e-commerce dataset comprising over 885,000 user sessions, each characterized by 1,114 features. Our approach demonstrates robust performance in handling the inherent class imbalance typical in e-commerce data, where purchase events are significantly less frequent than non-purchase events. Through comprehensive experimentation with various classification thresholds, we show that our model achieves a balance between precision and recall, with an overall accuracy of 88% and an AUC-ROC score of 0.88. Comparative analysis reveals that our DQN-inspired model offers advantages over traditional machine learning and standard deep learning approaches, particularly in its ability to capture complex temporal patterns in user behavior. This research contributes to the field of e-commerce analytics by introducing a novel predictive modeling technique that combines the strengths of deep learning and reinforcement learning paradigms. Our findings have significant implications for improving demand forecasting, personalizing user experiences, and optimizing marketing strategies in online retail environments. The e-commerce industry has experienced unprecedented growth in recent years, with global sales projected to reach $6.3 trillion by 2024 [1].
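
The threshold experimentation mentioned in the abstract above can be illustrated with a small sketch that sweeps a decision threshold over predicted purchase scores and reports precision and recall at each setting. The function names, the toy labels, and the toy scores are all hypothetical; nothing here reproduces the paper's model or data.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when scores >= threshold count as purchases."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted = score >= threshold
        if predicted and label == 1:
            tp += 1
        elif predicted and label == 0:
            fp += 1
        elif not predicted and label == 1:
            fn += 1
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def sweep_thresholds(y_true, scores, thresholds):
    """Return (threshold, precision, recall) for each candidate threshold."""
    return [(t, *precision_recall(y_true, scores, t)) for t in thresholds]


# Toy, imbalanced example: 2 purchases among 5 sessions (made-up numbers).
labels = [0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.6, 0.7, 0.9]
results = sweep_thresholds(labels, scores, [0.5, 0.8])
```

On this toy data, raising the threshold from 0.5 to 0.8 trades recall for precision, which is the kind of balance a threshold sweep is meant to expose on imbalanced purchase data.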

artificial intelligence, machine learning, threshold, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.17762/ijisae.v13i1s.7419

2506.17543

Genre: Research Report > New Finding (1.00)

Industry:

Retail (1.00)
Information Technology > Services > e-Commerce Services (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Imitation Learning for Autonomous Driving: Insights from Real-World Testing

Dursun, Hidayet Ersin, Güven, Yusuf, Kumbasar, Tufan

arXiv.org Artificial Intelligence, Apr-29-2025

This work focuses on the design of a deep learning-based autonomous driving system deployed and tested on the real-world MIT Racecar to assess its effectiveness in driving scenarios. The Deep Neural Network (DNN) translates raw image inputs into real-time steering commands in an end-to-end learning fashion, following the imitation learning framework. The key design challenge is to ensure that DNN predictions are accurate and fast enough, at a high sampling frequency, and result in smooth vehicle operation under different operating conditions. In this study, we design and compare various DNNs to identify the most effective approach for real-time autonomous driving. In designing the DNNs, we adopted an incremental design approach that involved enhancing the model capacity and dataset to address the challenges of real-world driving scenarios. We designed a PD system, CNN, CNN-LSTM, and CNN-NODE, and evaluated their performance on the real-world MIT Racecar. While the PD system handled basic lane following, it struggled with sharp turns and lighting variations. The CNN improved steering but lacked temporal awareness, which the CNN-LSTM addressed, resulting in smooth driving performance. The CNN-NODE performed similarly to the CNN-LSTM in handling driving dynamics, yet with slightly better driving performance. The findings of this research highlight the importance of iterative design processes in developing robust DNNs for autonomous driving applications. The experimental video is available at https://www.youtube.com/watch?v=FNNYgU--iaY.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.18847

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)

Genre: Research Report > New Finding (0.87)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)
Information Technology > Robotics & Automation (0.91)
Leisure & Entertainment > Sports > Motorsports (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Infinity-norm-based Input-to-State-Stable Long Short-Term Memory networks: a thermal systems perspective

De Carli, Stefano, Previtali, Davide, Pitturelli, Leandro, Ferramosca, Antonio, Previdi, Fabio

arXiv.org Machine Learning, Mar-14-2025

Recurrent Neural Networks (RNNs) have shown remarkable performance in system identification, particularly in nonlinear dynamical systems such as thermal processes. However, stability remains a critical challenge in practical applications: although the underlying process may be intrinsically stable, there may be no guarantee that the resulting RNN model captures this behavior. This paper addresses the stability issue by deriving a sufficient condition for Input-to-State Stability based on the infinity-norm (ISS$_{\infty}$) for Long Short-Term Memory (LSTM) networks. The obtained condition depends on fewer network parameters compared to prior works. An ISS$_{\infty}$-promoted training strategy is developed, incorporating a penalty term in the loss function that encourages stability and an ad hoc early stopping approach. The quality of LSTM models trained via the proposed approach is validated on a thermal system case study, where the ISS$_{\infty}$-promoted LSTM outperforms both a physics-based model and an ISS$_{\infty}$-promoted Gated Recurrent Unit (GRU) network while also surpassing non-ISS$_{\infty}$-promoted LSTM and GRU RNNs.

artificial intelligence, iss, machine learning, (17 more...)

arXiv.org Machine Learning

2503.11553

Genre: Research Report (0.64)

Industry:

Energy > Oil & Gas (0.47)
Energy > Renewable (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
