CRNN




SPIN-ODE: Stiff Physics-Informed Neural ODE for Chemical Reaction Rate Estimation

Peng, Wenqing, Liu, Zhi-Song, Boy, Michael

arXiv.org Artificial Intelligence

Estimating rate coefficients from complex chemical reactions is essential for advancing detailed chemistry. However, the stiffness inherent in real-world atmospheric chemistry systems poses severe challenges, leading to training instability and poor convergence, which hinder effective rate coefficient estimation using learning-based approaches. To address this, we propose a Stiff Physics-Informed Neural ODE framework (SPIN-ODE) for chemical reaction modelling. Our method introduces a three-stage optimisation process: first, a black-box neural ODE is trained to fit concentration trajectories; second, a Chemical Reaction Neural Network (CRNN) is pre-trained to learn the mapping between concentrations and their time derivatives; and third, the rate coefficients are fine-tuned by integrating with the pre-trained CRNN. Extensive experiments on both synthetic and newly proposed real-world datasets validate the effectiveness and robustness of our approach. As the first work addressing stiff neural ODEs for chemical rate coefficient discovery, our study opens promising directions for integrating neural networks with detailed chemistry.
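For orientation, the third stage of the pipeline described above can be pictured with a minimal sketch: a CRNN-style mass-action right-hand side whose rate coefficients are the only free parameters, integrated with the torchdiffeq library. The toy A -> B -> C mechanism, the dimensions, and the stage comments are illustrative assumptions, not the authors' code.

```python
# Hedged sketch of the SPIN-ODE idea on a toy mechanism (assumed, not the paper's code).
import torch
import torch.nn as nn
from torchdiffeq import odeint

class MassActionCRNN(nn.Module):
    """CRNN-style rate law for a toy A -> B -> C mechanism; only log k is learned."""
    def __init__(self):
        super().__init__()
        self.log_k = nn.Parameter(torch.zeros(2))                      # rate coefficients to estimate
        self.register_buffer("orders", torch.tensor([[1., 0., 0.],     # reaction 1 consumes A
                                                      [0., 1., 0.]]))  # reaction 2 consumes B
        self.register_buffer("stoich", torch.tensor([[-1.,  0.],
                                                      [ 1., -1.],
                                                      [ 0.,  1.]]))    # net stoichiometry per reaction

    def forward(self, t, c):
        log_c = torch.log(c.clamp_min(1e-12))
        rates = torch.exp(self.log_k + self.orders @ log_c)  # mass-action rates r_j = k_j * prod_i c_i^nu_ij
        return self.stoich @ rates                            # dc/dt

# Stage 1 (assumed): a black-box neural ODE is first fit to the measured trajectories.
# Stage 2 (assumed): the CRNN is pre-trained on (c, dc/dt) pairs read off that fit.
# Stage 3 (assumed): log_k is fine-tuned by integrating the CRNN and matching the data.
crnn = MassActionCRNN()
c0 = torch.tensor([1.0, 0.0, 0.0])          # initial concentrations of A, B, C
t = torch.linspace(0.0, 5.0, 50)
trajectory = odeint(crnn, c0, t)             # differentiable w.r.t. log_k
```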


Scalable Equilibrium Propagation via Intermediate Error Signals for Deep Convolutional CRNNs

Lin, Jiaqi, Bal, Malyaban, Sengupta, Abhronil

arXiv.org Artificial Intelligence

Equilibrium Propagation (EP) is a biologically inspired local learning rule first proposed for convergent recurrent neural networks (CRNNs), in which synaptic updates depend only on neuron states from two distinct phases. EP estimates gradients that closely align with those computed by Backpropagation Through Time (BPTT) while significantly reducing computational demands, positioning it as a potential candidate for on-chip training in neuromorphic architectures. However, prior studies on EP have been constrained to shallow architectures, as deeper networks suffer from the vanishing gradient problem, leading to convergence difficulties in both energy minimization and gradient computation. To address the vanishing gradient problem in deep EP networks, we propose a novel EP framework that incorporates intermediate error signals to enhance information flow and convergence of neuron dynamics. This is the first work to integrate knowledge distillation and local error signals into EP, enabling the training of significantly deeper architectures. Our proposed approach achieves state-of-the-art performance on the CIFAR-10 and CIFAR-100 datasets, showcasing its scalability on deep VGG architectures. These results represent a significant advancement in the scalability of EP, paving the way for its application in real-world systems.
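The two-phase update EP relies on can be illustrated with a small, hedged sketch: contrast neuron correlations between a free phase (beta = 0) and a weakly nudged phase (beta > 0). This is the classic EP rule for a symmetric recurrent layer, not the deep-CRNN framework with intermediate error signals proposed in the paper; states and dimensions are placeholders.

```python
# Toy Equilibrium Propagation weight update (assumed standard rule, symmetric weights).
import numpy as np

def rho(s):
    """Hard-sigmoid nonlinearity commonly used in EP formulations."""
    return np.clip(s, 0.0, 1.0)

def ep_update(W, s_free, s_nudged, beta, lr):
    """Local EP rule: dW_ij ~ (1/beta) * (rho(s_i^b) rho(s_j^b) - rho(s_i^0) rho(s_j^0))."""
    corr_free = np.outer(rho(s_free), rho(s_free))
    corr_nudged = np.outer(rho(s_nudged), rho(s_nudged))
    return W + (lr / beta) * (corr_nudged - corr_free)

# Usage with toy equilibrium states of a 4-neuron recurrent layer
# (the relaxation to equilibrium in each phase is assumed to have been run already):
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 4))
s_free = rng.uniform(size=4)       # state after the free-phase relaxation
s_nudged = rng.uniform(size=4)     # state after the nudged-phase relaxation
W = ep_update(W, s_free, s_nudged, beta=0.5, lr=0.01)
```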


The use of Multi-domain Electroencephalogram Representations in the building of Models based on Convolutional and Recurrent Neural Networks for Epilepsy Detection

Anghinoni, Luiz Antonio Nicolau, Denardin, Gustavo Weber, Gertrudes, Jadson Castro, Casanova, Dalcimar, Oliva, Jefferson Tales

arXiv.org Artificial Intelligence

This important role has led researchers to develop various methods for gathering information about brain activity, resulting in significant advancements in medical signal and image acquisition systems [2]. Among these advancements are functional neuroimaging techniques, such as functional magnetic resonance imaging, magnetoencephalography (MEG), positron emission tomography (PET), and electroencephalography [2]. Among these techniques, electroencephalography stands out due to three key advantages: it is a non-invasive method that allows data generation from any individual, has excellent temporal resolution, effectively capturing events occurring within milliseconds, and is relatively cost-effective compared to other examinations [3]. Electroencephalography monitors the brain's electrical activity through electrodes placed on the scalp, and the resulting data, known as the electroencephalogram (EEG), consists of a time series of electrical potentials that reflect neurological activity [4]. The EEG signal is widely used in the field of neuroscience and has the potential to advance brain-computer interfaces [5], facilitate emotion detection [6], enable classification of sleep stages [7], and help clinicians and researchers identify brain diseases, including but not limited to Alzheimer's disease [8], dyslexia [9], schizophrenia [10], Creutzfeldt-Jakob disease [11] and cognitive impairment [12]. Epilepsy, for example, is a neurological disorder characterized by abnormal brain activity that can lead to seizures, unusual behaviors, or even loss of consciousness.


Mamba-based Deep Learning Approaches for Sleep Staging on a Wireless Multimodal Wearable System without Electroencephalography

Zhang, Andrew H., He-Mo, Alex, Yin, Richard Fei, Li, Chunlin, Tang, Yuzhi, Gurve, Dharmendra, Ghahjaverestan, Nasim Montazeri, Goubran, Maged, Wang, Bo, Lim, Andrew S. P.

arXiv.org Artificial Intelligence

Study Objectives: We investigate using Mamba-based deep learning approaches for sleep staging on signals from ANNE One (Sibel Health, Evanston, IL), a minimally intrusive dual-sensor wireless wearable system measuring chest electrocardiography (ECG), triaxial accelerometry, and temperature, as well as finger photoplethysmography (PPG) and temperature. Methods: We obtained wearable sensor recordings from 360 adults undergoing concurrent clinical polysomnography (PSG) at a tertiary care sleep lab. PSG recordings were scored according to AASM criteria. PSG and wearable sensor data were automatically aligned using their ECG channels, with manual confirmation by visual inspection. We trained Mamba-based models with both convolutional-recurrent neural network (CRNN) and recurrent neural network (RNN) architectures on these recordings. Ensembling of model variants with similar architectures was performed. Results: Our best approach, after ensembling, attains a 3-class (wake, NREM, REM) balanced accuracy of 83.50%, F1 score of 84.16%, Cohen's kappa of 72.68%, and an MCC score of 72.84%; a 4-class (wake, N1/N2, N3, REM) balanced accuracy of 74.64%, F1 score of 74.56%, Cohen's kappa of 61.63%, and an MCC score of 62.04%; and a 5-class (wake, N1, N2, N3, REM) balanced accuracy of 64.30%, F1 score of 66.97%, Cohen's kappa of 53.23%, and an MCC score of 54.38%. Conclusions: Deep learning models can infer major sleep stages from a wearable system without electroencephalography (EEG) and can be successfully applied to data from adults attending a tertiary care sleep clinic.
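The reported metrics are all standard and could be computed, for example, with scikit-learn as below; the epoch labels and predictions are placeholders, and macro-averaged F1 is an assumption (the paper does not state the averaging here).

```python
# Sketch of the reported 3-class sleep-staging metrics (toy labels, assumed macro F1).
from sklearn.metrics import (balanced_accuracy_score, f1_score,
                             cohen_kappa_score, matthews_corrcoef)

y_true = [0, 0, 1, 1, 2, 2, 1, 0]   # 0 = wake, 1 = NREM, 2 = REM (toy epochs)
y_pred = [0, 1, 1, 1, 2, 2, 0, 0]

print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))
print("macro F1         :", f1_score(y_true, y_pred, average="macro"))
print("Cohen's kappa    :", cohen_kappa_score(y_true, y_pred))
print("MCC              :", matthews_corrcoef(y_true, y_pred))
```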


From Computation to Consumption: Exploring the Compute-Energy Link for Training and Testing Neural Networks for SED Systems

Douwes, Constance, Serizel, Romain

arXiv.org Artificial Intelligence

The massive use of machine learning models, particularly neural networks, has raised serious concerns about their environmental impact. Indeed, over the last few years we have seen an explosion in the computing costs associated with training and deploying these systems. It is, therefore, crucial to understand their energy requirements in order to better integrate them into the evaluation of models, which has so far focused mainly on performance. In this paper, we study several neural network architectures that are key components of sound event detection systems, using an audio tagging task as an example. We measure the energy consumption for training and testing small to large architectures and establish complex relationships between the energy consumption, the number of floating-point operations, the number of parameters, and the GPU/memory utilization.
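One simple way to relate compute to consumption, sketched below under assumptions (an NVIDIA GPU, the pynvml package, and a toy CRNN-like stack), is to sample GPU power while the model runs and pair it with the parameter count; this is an illustration, not the authors' measurement protocol.

```python
# Hedged sketch: sample GPU power via NVML and count model parameters.
import time
import pynvml
import torch.nn as nn

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

# Toy CRNN-like components, only used here for a parameter count.
conv = nn.Conv1d(1, 16, kernel_size=9)
gru = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
n_params = sum(p.numel() for m in (conv, gru) for p in m.parameters())

samples, period, t0 = [], 0.1, time.time()
while time.time() - t0 < 5.0:                                  # sample for 5 s of (placeholder) work
    samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)  # milliwatts -> watts
    time.sleep(period)

energy_joules = sum(samples) * period                          # approx.: power samples x sampling period
print(f"{n_params} parameters, approx. {energy_joules:.1f} J over the window")
pynvml.nvmlShutdown()
```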


Extreme time extrapolation capabilities and thermodynamic consistency of physics-inspired Neural Networks for the 3D microstructure evolution of materials

Lanzoni, Daniele, Fantasia, Andrea, Bergamaschini, Roberto, Pierre-Louis, Olivier, Montalenti, Francesco

arXiv.org Artificial Intelligence

A Convolutional Recurrent Neural Network (CRNN) is trained to reproduce the evolution of the spinodal decomposition process in three dimensions as described by the Cahn-Hilliard equation. A specialized, physics-inspired architecture is shown to provide close agreement between the predicted evolutions and the ground-truth ones obtained via conventional integration schemes. The method can closely reproduce the evolution of microstructures not represented in the training set at a fraction of the computational cost. Extremely long-time extrapolation capabilities are achieved, up to reaching the theoretically expected equilibrium state of the system, despite the training set containing only relatively short initial phases of the evolution. Quantitative agreement with the decay rate of the free energy is also demonstrated up to late coarsening stages, providing an example of a data-driven, physically consistent and high-accuracy Machine Learning method for the long timescale simulation of materials.
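For context, the dynamics the CRNN learns to emulate, dc/dt = M * lap(c^3 - c - kappa * lap(c)), is commonly integrated with a semi-implicit spectral scheme such as the one sketched below; the grid size, parameters, and scheme are standard textbook choices and an assumption, not the authors' integrator.

```python
# One semi-implicit spectral step of the 3D Cahn-Hilliard equation (assumed standard scheme).
import numpy as np

N, dx, dt, M, kappa = 32, 1.0, 0.1, 1.0, 1.0
c = 0.01 * np.random.default_rng(0).standard_normal((N, N, N))  # small-amplitude quench

k = 2 * np.pi * np.fft.fftfreq(N, d=dx)
kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
k2 = kx**2 + ky**2 + kz**2                                      # |k|^2, the spectral Laplacian

def step(c):
    mu_hat = np.fft.fftn(c**3 - c)                              # nonlinear part of the chemical potential
    c_hat = np.fft.fftn(c)
    # treat the stiff fourth-order term implicitly for stability
    c_hat = (c_hat - dt * M * k2 * mu_hat) / (1.0 + dt * M * kappa * k2**2)
    return np.real(np.fft.ifftn(c_hat))

for _ in range(10):                                             # a few coarsening steps
    c = step(c)
```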


Computing a human-like reaction time metric from stable recurrent vision models

Goetschalckx, Lore, Govindarajan, Lakshmi Narasimhan, Ashok, Alekh Karkada, Ahuja, Aarit, Sheinberg, David L., Serre, Thomas

arXiv.org Artificial Intelligence

The meteoric rise in the adoption of deep neural networks as computational models of vision has inspired efforts to "align" these models with humans. One dimension of interest for alignment includes behavioral choices, but moving beyond characterizing choice patterns to capturing temporal aspects of visual decision-making has been challenging. Here, we sketch a general-purpose methodology to construct computational accounts of reaction times from a stimulus-computable, task-optimized model. Specifically, we introduce a novel metric leveraging insights from subjective logic theory summarizing evidence accumulation in recurrent vision models. We demonstrate that our metric aligns with patterns of human reaction times for stimulus manipulations across four disparate visual decision-making tasks spanning perceptual grouping, mental simulation, and scene categorization. This work paves the way for exploring the temporal alignment of model and human visual strategies in the context of various other cognitive tasks toward generating testable hypotheses for neuroscience. Links to the code and data can be found on the project page: https://serre-lab.github.io/rnn_rts_site.
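The general flavour of an evidence-accumulation readout can be sketched as follows: map per-timestep logits of a recurrent model to Dirichlet evidence and track the subjective-logic uncertainty over timesteps, from which a reaction-time-like quantity can be read off. The evidence mapping, the area-under-uncertainty proxy, and the toy numbers are assumptions for illustration; they are not the paper's exact metric.

```python
# Illustrative subjective-logic uncertainty trace over recurrent timesteps (assumed setup).
import numpy as np

def uncertainty_trace(logits_per_step, n_classes):
    """Subjective-logic uncertainty u = K / (K + total evidence) at each timestep."""
    us = []
    for logits in logits_per_step:
        evidence = np.maximum(logits, 0.0)               # non-negative evidence (one common choice)
        us.append(n_classes / (n_classes + evidence.sum()))
    return np.array(us)

steps = [np.array([0.1, 0.2]), np.array([0.5, 1.4]), np.array([0.4, 3.0])]  # toy 2-class logits
u = uncertainty_trace(steps, n_classes=2)
rt_proxy = u.sum()                                       # larger area under uncertainty -> slower "decision"
print(u, rt_proxy)
```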


You Only Hear Once: A YOLO-like Algorithm for Audio Segmentation and Sound Event Detection

Venkatesh, Satvik, Moffat, David, Miranda, Eduardo Reck

arXiv.org Artificial Intelligence

Audio segmentation and sound event detection are crucial topics in machine listening that aim to detect acoustic classes and their respective boundaries. They are useful for audio-content analysis, speech recognition, audio indexing, and music information retrieval. In recent years, most research articles have adopted segmentation-by-classification. This technique divides audio into small frames and individually performs classification on these frames. In this paper, we present a novel approach called You Only Hear Once (YOHO), which is inspired by the YOLO algorithm popularly adopted in Computer Vision. We convert the detection of acoustic boundaries into a regression problem instead of frame-based classification. This is done by having separate output neurons to detect the presence of an audio class and predict its start and end points. The relative improvement in F-measure for YOHO, compared to the state-of-the-art Convolutional Recurrent Neural Network, ranged from 1% to 6% across multiple datasets for audio segmentation and sound event detection. As the output of YOHO is more end-to-end and has fewer neurons to predict, inference is at least 6 times faster than segmentation-by-classification. In addition, as this approach predicts acoustic boundaries directly, post-processing and smoothing are about 7 times faster.
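The regression-style output described above can be approximated with the sketch below: for each output bin and class, one neuron signals presence and two regress the normalized start/end offsets, with the boundary regression applied only where the class is present. The head shape and loss weighting are assumptions, not necessarily the authors' exact formulation.

```python
# Hedged sketch of a YOHO-style output head and loss (toy shapes, assumed formulation).
import torch
import torch.nn.functional as F

def yoho_loss(pred, target):
    """pred/target: (batch, bins, classes, 3) = [presence, start, end]."""
    presence_t = target[..., 0]
    presence_loss = F.binary_cross_entropy_with_logits(pred[..., 0], presence_t)
    # Regress boundaries only for bins where the class is actually present.
    reg_loss = (presence_t.unsqueeze(-1) * (pred[..., 1:] - target[..., 1:]) ** 2).mean()
    return presence_loss + reg_loss

pred = torch.randn(2, 9, 3, 3)                      # toy: 2 clips, 9 output bins, 3 classes
target = torch.zeros_like(pred)
target[0, 4, 1] = torch.tensor([1.0, 0.2, 0.8])     # class 1 active in bin 4, spanning 20%-80%
print(yoho_loss(pred, target))
```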