dann
Learning better with Dale's Law: ASpectral Perspective
Most recurrent neural networks (RNNs) do not include a fundamental constraint of real neural circuits: Dale's Law, which implies that neurons must be excitatory (E) or inhibitory (I). Dale's Law is generally absent from RNNs because simply partitioning a standard network's units into E and I populations impairs learning. However, here we extend a recent feedforward bio-inspired EI network architecture, named Dale's ANNs, to recurrent networks, and demonstrate that good performance is possible while respecting Dale's Law. This begs the question: What makes some forms of EI network learn poorly and others learn well? And, why does the simple approach of incorporating Dale's Law impair learning? Historically the answer was thought to be the sign constraints on EI network parameters, and this was a motivation behind Dale's ANNs. However, here we show the spectral properties of the recurrent weight matrix at initialisation are more impactful on network performance than sign constraints. We find that simple EI partitioning results in a singular value distribution that is multimodal and dispersed, whereas standard RNNs have an unimodal, more clustered singular value distribution, as do recurrent Dale's ANNs. We also show that the spectral properties and performance of partitioned EI networks are worse for small networks with fewer I units, and we present normalised SVD entropy as a measure of spectrum pathology that correlates with performance.
Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition
Jinzhuo Wang, Wenmin Wang, xiongtao Chen, Ronggang Wang, Wen Gao
Contexts are crucial for action recognition in video. Current methods often mine contexts after extracting hierarchical local features and focus on their high-order encodings. This paper instead explores contexts as early as possible and leverages their evolutions for action recognition. In particular, we introduce a novel architecture called deep alternative neural network (DANN) stacking alternative layers. Each alternative layer consists of a volumetric convolutional layer followed by a recurrent layer.
Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition
Contexts are crucial for action recognition in video. Current methods often mine contexts after extracting hierarchical local features and focus on their high-order encodings. This paper instead explores contexts as early as possible and leverages their evolutions for action recognition. In particular, we introduce a novel architecture called deep alternative neural network (DANN) stacking alternative layers. Each alternative layer consists of a volumetric convolutional layer followed by a recurrent layer.
FedDAPL: Toward Client-Private Generalization in Federated Learning
Loaliyan, Soroosh Safari, Ambite, Jose-Luis, Thompson, Paul M., Jahanshad, Neda, Steeg, Greg Ver
Federated Learning (FL) trains models locally at each research center or clinic and aggregates only model updates, making it a natural fit for medical imaging, where strict privacy laws forbid raw data sharing. A major obstacle is scanner-induced domain shift: non-biological variations in hardware or acquisition protocols can cause models to fail on external sites. Most harmonization methods correct this shift by directly comparing data across sites, conflicting with FL's privacy constraints. Domain Generalization (DG) offers a privacy-friendly alternative - learning site-invariant representations without sharing raw data - but standard DG pipelines still assume centralized access to multi-site data, again violating FL's guarantees. This paper meets these difficulties with a straightforward integration of a Domain-Adversarial Neural Network (DANN) within the FL process. After demonstrating that a naive federated DANN fails to converge, we propose a proximal regularization method that stabilizes adversarial training among clients. Experiments on T1-weighted 3-D brain MRIs from the OpenBHB dataset, performing brain-age prediction on participants aged 6-64 y (mean 22+/-6 y; 45 percent male) in training and 6-79 y (mean 19+/-13 y; 55 percent male) in validation, show that training on 15 sites and testing on 19 unseen sites yields superior cross-site generalization over FedAvg and ERM while preserving data privacy.
A domain adaptation neural network for digital twin-supported fault diagnosis
Chen, Zhenling, Fu, Haiwei, Zeng, Zhiguo
--Digital twins offer a promising solution to the lack of sufficient labeled data in deep learning-based fault diagnosis by generating simulated data for model training. However, discrepancies between simulation and real-world systems can lead to a significant drop in performance when models are applied in real scenarios. T o address this issue, we propose a fault diagnosis framework based on Domain-Adversarial Neural Networks (DANN), which enables knowledge transfer from simulated (source domain) to real-world (target domain) data. We evaluate the proposed framework using a publicly available robotics fault diagnosis dataset, which includes 3,600 sequences generated by a digital twin model and 90 real sequences collected from physical systems. The DANN method is compared with commonly used lightweight deep learning models such as CNN, TCN, Transformer, and LSTM. Experimental results show that incorporating domain adaptation significantly improves the diagnostic performance. For example, applying DANN to a baseline CNN model improves its accuracy from 70.00% to 80.22% on real-world test data, demonstrating the effectiveness of domain adaptation in bridging the sim-to-real gap.