rnn architecture
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (3 more...)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- (7 more...)
Making deep neural networks work for medical audio: representation, compression and domain adaptation
This thesis addresses the technical challenges of applying machine learning to understand and interpret medical audio signals. The sounds of our lungs, heart, and voice convey vital information about our health. Yet, in contemporary medicine, these sounds are primarily analyzed through auditory interpretation by experts using devices like stethoscopes. Automated analysis offers the potential to standardize the processing of medical sounds, enable screening in low-resource settings where physicians are scarce, and detect subtle patterns that may elude human perception, thereby facilitating early diagnosis and treatment. Focusing on the analysis of infant cry sounds to predict medical conditions, this thesis contributes on four key fronts. First, in low-data settings, we demonstrate that large databases of adult speech can be harnessed through neural transfer learning to develop more accurate and robust models for infant cry analysis. Second, in cost-effective modeling, we introduce an end-to-end model compression approach for recurrent networks using tensor decomposition. Our method requires no post-hoc processing, achieves compression rates of several hundred-fold, and delivers accurate, portable models suitable for resource-constrained devices. Third, we propose novel domain adaptation techniques tailored for audio models and adapt existing methods from computer vision. These approaches address dataset bias and enhance generalization across domains while maintaining strong performance on the original data. Finally, to advance research in this domain, we release a unique, open-source dataset of infant cry sounds, developed in collaboration with clinicians worldwide. This work lays the foundation for recognizing the infant cry as a vital sign and highlights the transformative potential of AI-driven audio monitoring in shaping the future of accessible and affordable healthcare.
- North America > Canada > Quebec > Montreal (0.50)
- Europe > Switzerland > Zürich > Zürich (0.13)
- Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)
- (14 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.92)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Lattice Protein Folding with Variational Annealing
Khandoker, Shoummo Ahsan, Inack, Estelle M., Hibat-Allah, Mohamed
Understanding the principles of protein folding is a cornerstone of computational biology, with implications for drug design, bioengineering, and the understanding of fundamental biological processes. Lattice protein folding models offer a simplified yet powerful framework for studying the complexities of protein folding, enabling the exploration of energetically optimal folds under constrained conditions. However, finding these optimal folds is a computationally challenging combinatorial optimization problem. In this work, we introduce a novel upper-bound training scheme that employs masking to identify the lowest-energy folds in two-dimensional Hydrophobic-Polar (HP) lattice protein folding. By leveraging Dilated Recurrent Neural Networks (RNNs) integrated with an annealing process driven by temperature-like fluctuations, our method accurately predicts optimal folds for benchmark systems of up to 60 beads. Our approach also effectively masks invalid folds from being sampled without compromising the autoregressive sampling properties of RNNs. This scheme is generalizable to three spatial dimensions and can be extended to lattice protein models with larger alphabets. Our findings emphasize the potential of advanced machine learning techniques in tackling complex protein folding problems and a broader class of constrained combinatorial optimization challenges.
- North America > Canada > Ontario (0.05)
- North America > United States > Indiana > Monroe County > Bloomington (0.04)
Memory Capacity of Nonlinear Recurrent Networks: Is it Informative?
Ballarin, Giovanni, Grigoryeva, Lyudmila, Ortega, Juan-Pablo
Memory capacity of nonlinear recurrent networks: Is it informative? Abstract The total memory capacity (MC) of linear recurrent neural networks (RNNs) has been proven to be equal to the rank of the corresponding Kalman controllability matrix, and it is almost surely maximal for connectivity and input weight matrices drawn from regular distributions. This fact questions the usefulness of this metric in distinguishing the performance of linear RNNs in the processing of stochastic signals. This note shows that the MC of random nonlinear RNNs yields arbitrary values within established upper and lower bounds depending just on the input process scale. This confirms that the existing definition of MC in linear and nonlinear cases has no practical value.
- Asia > Singapore (0.04)
- Europe > United Kingdom (0.04)
- Europe > Switzerland > St. Gallen > St. Gallen (0.04)
BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models
Haller, Patrick, Golde, Jonas, Akbik, Alan
This paper explores the potential of recurrent neural networks (RNNs) and other subquadratic architectures as competitive alternatives to transformer-based models in low-resource language modeling scenarios. We utilize HGRN2 (Qin et al., 2024), a recently proposed RNN-based architecture, and comparatively evaluate its effectiveness against transformer-based baselines and other subquadratic architectures (LSTM, xLSTM, Mamba). Our experimental results show that BABYHGRN, our HGRN2 language model, outperforms transformer-based models in both the 10M and 100M word tracks of the challenge, as measured by their performance on the BLiMP, EWoK, GLUE and BEAR benchmarks. Further, we show the positive impact of knowledge distillation. Our findings challenge the prevailing focus on transformer architectures and indicate the viability of RNN-based models, particularly in resource-constrained environments.
- Asia > Singapore (0.05)
- North America > Mexico (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (6 more...)
Road User Classification from High-Frequency GNSS Data Using Distributed Edge Intelligence
Köpper, Lennart, Wieland, Thomas
Real-world traffic involves diverse road users, ranging from pedestrians to heavy trucks, necessitating effective road user classification for various applications within Intelligent Transport Systems (ITS). Traditional approaches often rely on intrusive and/or expensive external hardware sensors. These systems typically have limited spatial coverage. In response to these limitations, this work aims to investigate an unintrusive and cost-effective alternative for road user classification by using high-frequency (1-2 Hz) positional sequences. A cutting-edge solution could involve leveraging positioning data from 5G networks. However, this feature is currently only proposed in the 3GPP standard and has not yet been implemented for outdoor applications by 5G equipment vendors. Therefore, our approach relies on positional data, that is recorded under real-world conditions using Global Navigation Satellite Systems (GNSS) and processed on distributed edge devices. As a start-ing point, four types of road users are distinguished: pedestri-ans, cyclists, motorcycles, and passenger cars. While earlier approaches used classical statistical methods, we propose Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) as the preferred classification method, as they repre-sent state-of-the-art in processing sequential data. An RNN architecture for road user classification, based on selected motion characteristics derived from raw positional sequences is presented and the influence of sequence length on classifica-tion quality is examined. The results of the work show that RNNs are capable of efficiently classifying road users on dis-tributed devices, and can particularly differentiate between types of motorized vehicles, based on two- to four-minute se-quences.
- Europe > Germany (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Maryland (0.04)
- (2 more...)
Reviews: Deepcode: Feedback Codes via Deep Learning
The formal noisy channel setting is similar to a standard autoencoder framework, with a few key differences. For one, we usually encode and transmit one bit of a message at time due to channel limits, and second we get feedback, usually in the form of a noisy version of each encoded bit. Due to the sequential nature of the problem, plus the availability of feedback, the authors apply an RNN architecture. The input to the decoder at each step is the next bit to encode plus an estimate of the noise from previous steps (derived from the difference between the encoded message and the received feedback). Experiments suggest that this approach significantly outperforms existing approaches.
Multi-Objective Evolutionary Neural Architecture Search for Recurrent Neural Networks
Booysen, Reinhard, Bosman, Anna Sergeevna
Artificial neural network (NN) architecture design is a nontrivial and time-consuming task that often requires a high level of human expertise. Neural architecture search (NAS) serves to automate the design of NN architectures and has proven to be successful in automatically finding NN architectures that outperform those manually designed by human experts. NN architecture performance can be quantified based on multiple objectives, which include model accuracy and some NN architecture complexity objectives, among others. The majority of modern NAS methods that consider multiple objectives for NN architecture performance evaluation are concerned with automated feed forward NN architecture design, which leaves multi-objective automated recurrent neural network (RNN) architecture design unexplored. RNNs are important for modeling sequential datasets, and prominent within the natural language processing domain. It is often the case in real world implementations of machine learning and NNs that a reasonable trade-off is accepted for marginally reduced model accuracy in favour of lower computational resources demanded by the model. This paper proposes a multi-objective evolutionary algorithm-based RNN architecture search method. The proposed method relies on approximate network morphisms for RNN architecture complexity optimisation during evolution. The results show that the proposed method is capable of finding novel RNN architectures with comparable performance to state-of-the-art manually designed RNN architectures, but with reduced computational demand.
- Africa > South Africa > Gauteng > Pretoria (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (5 more...)
- Research Report (0.70)
- Overview (0.68)
From Tensor Network Quantum States to Tensorial Recurrent Neural Networks
Wu, Dian, Rossi, Riccardo, Vicentini, Filippo, Carleo, Giuseppe
Considering the relation between neural networks (NN) and TN, the first works focused on the restricted Boltzmann machines (RBM), which are one of the simplest Tensor networks (TN) have been extensively used to classes of NN. It is impossible to efficiently map an represent the states of quantum many-body physical systems RBM onto a TN, as they correspond to string-bond states [1-3]. Matrix product states (MPS) are possibly with an arbitrary nonlocal geometry [28]. This result was the simplest family of TN, and are suitable to capture later refined to show that an RBM may correspond to an the ground state of 1D gapped Hamiltonians [4, 5]. They MPS with an exponentially large bond dimension, and can be contracted in polynomial time to compute physical only short-range RBM can be mapped onto efficiently quantities exactly, and optimized by density matrix computable entangled plaquette states [31]. Similar results renormalization group (DMRG) [6] when used as variational have been obtained that deep Boltzmann machines ansätze. More powerful TN architectures that with proper constraints can be mapped onto TN that cannot be efficiently contracted in general have been are efficiently computable through transfer matrix methods proposed later, notably projected entangled pair states [32].