AITopics | rnn architecture

Collaborating Authors

rnn architecture

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Universality and individuality in neural dynamics across large populations of recurrent networks

Niru Maheswaranathan, Alex Williams, Matthew Golub, Surya Ganguli, David Sussillo

Neural Information Processing SystemsFeb-12-2026, 08:23:43 GMT

Or alternatively, are the learned dynamics highly sensitiveto different model choices?

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(3 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Universality and individuality in neural dynamics across large populations of recurrent networks

Niru Maheswaranathan, Alex Williams, Matthew Golub, Surya Ganguli, David Sussillo

Neural Information Processing SystemsOct-2-2025, 20:16:24 GMT

Or alternatively, are the learned dynamics highly sensitive to different model choices?

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
(7 more...)

Genre: Research Report (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Making deep neural networks work for medical audio: representation, compression and domain adaptation

Onu, Charles C

arXiv.org Artificial IntelligenceJun-18-2025

This thesis addresses the technical challenges of applying machine learning to understand and interpret medical audio signals. The sounds of our lungs, heart, and voice convey vital information about our health. Yet, in contemporary medicine, these sounds are primarily analyzed through auditory interpretation by experts using devices like stethoscopes. Automated analysis offers the potential to standardize the processing of medical sounds, enable screening in low-resource settings where physicians are scarce, and detect subtle patterns that may elude human perception, thereby facilitating early diagnosis and treatment. Focusing on the analysis of infant cry sounds to predict medical conditions, this thesis contributes on four key fronts. First, in low-data settings, we demonstrate that large databases of adult speech can be harnessed through neural transfer learning to develop more accurate and robust models for infant cry analysis. Second, in cost-effective modeling, we introduce an end-to-end model compression approach for recurrent networks using tensor decomposition. Our method requires no post-hoc processing, achieves compression rates of several hundred-fold, and delivers accurate, portable models suitable for resource-constrained devices. Third, we propose novel domain adaptation techniques tailored for audio models and adapt existing methods from computer vision. These approaches address dataset bias and enhance generalization across domains while maintaining strong performance on the original data. Finally, to advance research in this domain, we release a unique, open-source dataset of infant cry sounds, developed in collaboration with clinicians worldwide. This work lays the foundation for recognizing the infant cry as a vital sign and highlights the transformative potential of AI-driven audio monitoring in shaping the future of accessible and affordable healthcare.

artificial intelligence, machine learning, survey article, (17 more...)

arXiv.org Artificial Intelligence

2506.1397

Country:

North America > Canada > Quebec > Montreal (0.50)
Europe > Switzerland > Zürich > Zürich (0.13)
Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)
(14 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Lattice Protein Folding with Variational Annealing

Khandoker, Shoummo Ahsan, Inack, Estelle M., Hibat-Allah, Mohamed

arXiv.org Artificial IntelligenceFeb-27-2025

Understanding the principles of protein folding is a cornerstone of computational biology, with implications for drug design, bioengineering, and the understanding of fundamental biological processes. Lattice protein folding models offer a simplified yet powerful framework for studying the complexities of protein folding, enabling the exploration of energetically optimal folds under constrained conditions. However, finding these optimal folds is a computationally challenging combinatorial optimization problem. In this work, we introduce a novel upper-bound training scheme that employs masking to identify the lowest-energy folds in two-dimensional Hydrophobic-Polar (HP) lattice protein folding. By leveraging Dilated Recurrent Neural Networks (RNNs) integrated with an annealing process driven by temperature-like fluctuations, our method accurately predicts optimal folds for benchmark systems of up to 60 beads. Our approach also effectively masks invalid folds from being sampled without compromising the autoregressive sampling properties of RNNs. This scheme is generalizable to three spatial dimensions and can be extended to lattice protein models with larger alphabets. Our findings emphasize the potential of advanced machine learning techniques in tackling complex protein folding problems and a broader class of constrained combinatorial optimization challenges.

probability, protein, sequence, (15 more...)

arXiv.org Artificial Intelligence

2502.20632

Country:

North America > Canada > Ontario (0.05)
North America > United States > Indiana > Monroe County > Bloomington (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)

Add feedback

Memory Capacity of Nonlinear Recurrent Networks: Is it Informative?

Ballarin, Giovanni, Grigoryeva, Lyudmila, Ortega, Juan-Pablo

arXiv.org Machine LearningFeb-7-2025

Memory capacity of nonlinear recurrent networks: Is it informative? Abstract The total memory capacity (MC) of linear recurrent neural networks (RNNs) has been proven to be equal to the rank of the corresponding Kalman controllability matrix, and it is almost surely maximal for connectivity and input weight matrices drawn from regular distributions. This fact questions the usefulness of this metric in distinguishing the performance of linear RNNs in the processing of stochastic signals. This note shows that the MC of random nonlinear RNNs yields arbitrary values within established upper and lower bounds depending just on the input process scale. This confirms that the existing definition of MC in linear and nonlinear cases has no practical value.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

2502.04832

Country:

Asia > Singapore (0.04)
Europe > United Kingdom (0.04)
Europe > Switzerland > St. Gallen > St. Gallen (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models

Haller, Patrick, Golde, Jonas, Akbik, Alan

arXiv.org Artificial IntelligenceDec-20-2024

This paper explores the potential of recurrent neural networks (RNNs) and other subquadratic architectures as competitive alternatives to transformer-based models in low-resource language modeling scenarios. We utilize HGRN2 (Qin et al., 2024), a recently proposed RNN-based architecture, and comparatively evaluate its effectiveness against transformer-based baselines and other subquadratic architectures (LSTM, xLSTM, Mamba). Our experimental results show that BABYHGRN, our HGRN2 language model, outperforms transformer-based models in both the 10M and 100M word tracks of the challenge, as measured by their performance on the BLiMP, EWoK, GLUE and BEAR benchmarks. Further, we show the positive impact of knowledge distillation. Our findings challenge the prevailing focus on transformer architectures and indicate the viability of RNN-based models, particularly in resource-constrained environments.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.15978

Country:

Asia > Singapore (0.05)
North America > Mexico (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Road User Classification from High-Frequency GNSS Data Using Distributed Edge Intelligence

Köpper, Lennart, Wieland, Thomas

arXiv.org Artificial IntelligenceNov-28-2024

Real-world traffic involves diverse road users, ranging from pedestrians to heavy trucks, necessitating effective road user classification for various applications within Intelligent Transport Systems (ITS). Traditional approaches often rely on intrusive and/or expensive external hardware sensors. These systems typically have limited spatial coverage. In response to these limitations, this work aims to investigate an unintrusive and cost-effective alternative for road user classification by using high-frequency (1-2 Hz) positional sequences. A cutting-edge solution could involve leveraging positioning data from 5G networks. However, this feature is currently only proposed in the 3GPP standard and has not yet been implemented for outdoor applications by 5G equipment vendors. Therefore, our approach relies on positional data, that is recorded under real-world conditions using Global Navigation Satellite Systems (GNSS) and processed on distributed edge devices. As a start-ing point, four types of road users are distinguished: pedestri-ans, cyclists, motorcycles, and passenger cars. While earlier approaches used classical statistical methods, we propose Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) as the preferred classification method, as they repre-sent state-of-the-art in processing sequential data. An RNN architecture for road user classification, based on selected motion characteristics derived from raw positional sequences is presented and the influence of sequence length on classifica-tion quality is examined. The results of the work show that RNNs are capable of efficiently classifying road users on dis-tributed devices, and can particularly differentiate between types of motorized vehicles, based on two- to four-minute se-quences.

classification, sequence, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2412.00132

Country:

Europe > Germany (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Maryland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Reviews: Deepcode: Feedback Codes via Deep Learning

Neural Information Processing SystemsOct-7-2024, 07:51:28 GMT

The formal noisy channel setting is similar to a standard autoencoder framework, with a few key differences. For one, we usually encode and transmit one bit of a message at time due to channel limits, and second we get feedback, usually in the form of a noisy version of each encoded bit. Due to the sequential nature of the problem, plus the availability of feedback, the authors apply an RNN architecture. The input to the decoder at each step is the next bit to encode plus an estimate of the noise from previous steps (derived from the difference between the encoded message and the received feedback). Experiments suggest that this approach significantly outperforms existing approaches.

architecture, noisy channel, rnn architecture, (12 more...)

Neural Information Processing Systems

Genre: Research Report (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

Multi-Objective Evolutionary Neural Architecture Search for Recurrent Neural Networks

Booysen, Reinhard, Bosman, Anna Sergeevna

arXiv.org Artificial IntelligenceMar-17-2024

Artificial neural network (NN) architecture design is a nontrivial and time-consuming task that often requires a high level of human expertise. Neural architecture search (NAS) serves to automate the design of NN architectures and has proven to be successful in automatically finding NN architectures that outperform those manually designed by human experts. NN architecture performance can be quantified based on multiple objectives, which include model accuracy and some NN architecture complexity objectives, among others. The majority of modern NAS methods that consider multiple objectives for NN architecture performance evaluation are concerned with automated feed forward NN architecture design, which leaves multi-objective automated recurrent neural network (RNN) architecture design unexplored. RNNs are important for modeling sequential datasets, and prominent within the natural language processing domain. It is often the case in real world implementations of machine learning and NNs that a reasonable trade-off is accepted for marginally reduced model accuracy in favour of lower computational resources demanded by the model. This paper proposes a multi-objective evolutionary algorithm-based RNN architecture search method. The proposed method relies on approximate network morphisms for RNN architecture complexity optimisation during evolution. The results show that the proposed method is capable of finding novel RNN architectures with comparable performance to state-of-the-art manually designed RNN architectures, but with reduced computational demand.

architecture, moe rna algorithm, rnn architecture, (12 more...)

arXiv.org Artificial Intelligence

2403.11173

Country:

Africa > South Africa > Gauteng > Pretoria (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(5 more...)

Genre:

Research Report (0.70)
Overview (0.68)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

From Tensor Network Quantum States to Tensorial Recurrent Neural Networks

Wu, Dian, Rossi, Riccardo, Vicentini, Filippo, Carleo, Giuseppe

arXiv.org Machine LearningMar-8-2023

Considering the relation between neural networks (NN) and TN, the first works focused on the restricted Boltzmann machines (RBM), which are one of the simplest Tensor networks (TN) have been extensively used to classes of NN. It is impossible to efficiently map an represent the states of quantum many-body physical systems RBM onto a TN, as they correspond to string-bond states [1-3]. Matrix product states (MPS) are possibly with an arbitrary nonlocal geometry [28]. This result was the simplest family of TN, and are suitable to capture later refined to show that an RBM may correspond to an the ground state of 1D gapped Hamiltonians [4, 5]. They MPS with an exponentially large bond dimension, and can be contracted in polynomial time to compute physical only short-range RBM can be mapped onto efficiently quantities exactly, and optimized by density matrix computable entangled plaquette states [31]. Similar results renormalization group (DMRG) [6] when used as variational have been obtained that deep Boltzmann machines ansätze. More powerful TN architectures that with proper constraints can be mapped onto TN that cannot be efficiently contracted in general have been are efficiently computable through transfer matrix methods proposed later, notably projected entangled pair states [32].

artificial intelligence, machine learning, mp-rnn, (18 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevResearch.5.L032001

2206.12363

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

Add feedback