AITopics | Vlachas, Pantelis R.

Collaborating Authors

Vlachas, Pantelis R.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Deconstructing Recurrence, Attention, and Gating: Investigating the transferability of Transformers and Gated Recurrent Neural Networks in forecasting of dynamical systems

Heidenreich, Hunter S., Vlachas, Pantelis R., Koumoutsakos, Petros

arXiv.org Artificial IntelligenceOct-3-2024

Machine learning architectures, including transformers and recurrent neural networks (RNNs) have revolutionized forecasting in applications ranging from text processing to extreme weather. Notably, advanced network architectures, tuned for applications such as natural language processing, are transferable to other tasks such as spatiotemporal forecasting tasks. However, there is a scarcity of ablation studies to illustrate the key components that enable this forecasting accuracy. The absence of such studies, although explainable due to the associated computational cost, intensifies the belief that these models ought to be considered as black boxes. In this work, we decompose the key architectural components of the most powerful neural architectures, namely gating and recurrence in RNNs, and attention mechanisms in transformers. Then, we synthesize and build novel hybrid architectures from the standard blocks, performing ablation studies to identify which mechanisms are effective for each task. The importance of considering these components as hyper-parameters that can augment the standard architectures is exhibited on various forecasting datasets, from the spatiotemporal chaotic dynamics of the multiscale Lorenz 96 system, the Kuramoto-Sivashinsky equation, as well as standard real world time-series benchmarks. A key finding is that neural gating and attention improves the performance of all standard RNNs in most tasks, while the addition of a notion of recurrence in transformers is detrimental. Furthermore, our study reveals that a novel, sparsely used, architecture which integrates Recurrent Highway Networks with neural gating and attention mechanisms, emerges as the best performing architecture in high-dimensional spatiotemporal forecasting of dynamical systems.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.02654

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Massachusetts (0.14)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

RefreshNet: Learning Multiscale Dynamics through Hierarchical Refreshing

Farooq, Junaid, Rafiq, Danish, Vlachas, Pantelis R., Bazaz, Mohammad Abid

arXiv.org Artificial IntelligenceJan-24-2024

Forecasting complex system dynamics, particularly for long-term predictions, is persistently hindered by error accumulation and computational burdens. This study presents RefreshNet, a multiscale framework developed to overcome these challenges, delivering an unprecedented balance between computational efficiency and predictive accuracy. RefreshNet incorporates convolutional autoencoders to identify a reduced order latent space capturing essential features of the dynamics, and strategically employs multiple recurrent neural network (RNN) blocks operating at varying temporal resolutions within the latent space, thus allowing the capture of latent dynamics at multiple temporal scales. The unique "refreshing" mechanism in RefreshNet allows coarser blocks to reset inputs of finer blocks, effectively controlling and alleviating error accumulation. This design demonstrates superiority over existing techniques regarding computational efficiency and predictive accuracy, especially in long-term forecasting. The framework is validated using three benchmark applications: the FitzHugh-Nagumo system, the Reaction-Diffusion equation, and Kuramoto-Sivashinsky dynamics. RefreshNet significantly outperforms state-of-the-art methods in long-term forecasting accuracy and speed, marking a significant advancement in modeling complex systems and opening new avenues in understanding and predicting their behavior.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2401.13282

Country:

Asia > India (0.28)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adaptive learning of effective dynamics: Adaptive real-time, online modeling for complex systems

Kičić, Ivica, Vlachas, Pantelis R., Arampatzis, Georgios, Chatzimanolakis, Michail, Guibas, Leonidas, Koumoutsakos, Petros

arXiv.org Artificial IntelligenceApr-4-2023

Predictive simulations are essential for applications ranging from weather forecasting to material design. The veracity of these simulations hinges on their capacity to capture the effective system dynamics. Massively parallel simulations predict the systems dynamics by resolving all spatiotemporal scales, often at a cost that prevents experimentation. On the other hand, reduced order models are fast but often limited by the linearization of the system dynamics and the adopted heuristic closures. We propose a novel systematic framework that bridges large scale simulations and reduced order models to extract and forecast adaptively the effective dynamics (AdaLED) of multiscale systems. AdaLED employs an autoencoder to identify reduced-order representations of the system dynamics and an ensemble of probabilistic recurrent neural networks (RNNs) as the latent time-stepper. The framework alternates between the computational solver and the surrogate, accelerating learned dynamics while leaving yet-to-be-learned dynamics regimes to the original solver. AdaLED continuously adapts the surrogate to the new dynamics through online training. The transitions between the surrogate and the computational solver are determined by monitoring the prediction accuracy and uncertainty of the surrogate. The effectiveness of AdaLED is demonstrated on three different systems - a Van der Pol oscillator, a 2D reaction-diffusion equation, and a 2D Navier-Stokes flow past a cylinder for varying Reynolds numbers (400 up to 1200), showcasing its ability to learn effective dynamics online, detect unseen dynamics regimes, and provide net speed-ups. To the best of our knowledge, AdaLED is the first framework that couples a surrogate model with a computational solver to achieve online adaptive learning of effective dynamics. It constitutes a potent tool for applications requiring many expensive simulations.

artificial intelligence, machine learning, simulation, (17 more...)

arXiv.org Artificial Intelligence

2304.01732

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Energy > Oil & Gas > Upstream (0.66)
Education > Educational Setting > Online (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning from Predictions: Fusing Training and Autoregressive Inference for Long-Term Spatiotemporal Forecasts

Vlachas, Pantelis R., Koumoutsakos, Petros

arXiv.org Artificial IntelligenceFeb-21-2023

Recurrent Neural Networks (RNNs) have become an integral part of modeling and forecasting frameworks in areas like natural language processing and high-dimensional dynamical systems such as turbulent fluid flows. To improve the accuracy of predictions, RNNs are trained using the Backpropagation Through Time (BPTT) method to minimize prediction loss. During testing, RNNs are often used in autoregressive scenarios where the output of the network is fed back into the input. However, this can lead to the exposure bias effect, as the network was trained to receive ground-truth data instead of its own predictions. This mismatch between training and testing is compounded when the state distributions are different, and the train and test losses are measured. To address this, previous studies have proposed solutions for language processing networks with probabilistic predictions. Building on these advances, we propose the Scheduled Autoregressive BPTT (BPTT-SA) algorithm for predicting complex systems. Our results show that BPTT-SA effectively reduces iterative error propagation in Convolutional RNNs and Convolutional Autoencoder RNNs, and demonstrate its capabilities in long-term prediction of high-dimensional fluid flows.

machine learning, natural language, prediction, (16 more...)

arXiv.org Artificial Intelligence

2302.11101

Country:

North America > United States (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Improved Memories Learning

Varoli, Francesco, Novati, Guido, Vlachas, Pantelis R., Koumoutsakos, Petros

arXiv.org Machine LearningAug-24-2020

We propose Improved Memories Learning (IMeL), a novel algorithm that turns reinforcement learning (RL) into a supervised learning (SL) problem and delimits the role of neural networks (NN) to interpolation. IMeL consists of two components. The first is a reservoir of experiences. Each experience is updated based on a non-parametric procedural improvement of the policy, computed as a bounded one-sample Monte Carlo estimate. The second is a NN regressor, which receives as input improved experiences from the reservoir (context points) and computes the policy by interpolation. The NN learns to measure the similarity between states in order to compute long-term forecasts by averaging experiences, rather than by encoding the problem structure in the NN parameters. We present preliminary results and propose IMeL as a baseline method for assessing the merits of more complex models and inductive biases.

algorithm, artificial intelligence, reinforcement learning, (16 more...)

arXiv.org Machine Learning

2008.10433

Country: Europe > Switzerland > Zürich > Zürich (0.15)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback