AITopics | stochastic depth

We describe Swapout, a new stochastic training method, that outperforms ResNets of identical network structure yielding impressive results on CIFAR-10 and CIFAR100.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)

Add feedback

Monte Carlo Stochastic Depth for Uncertainty Estimation in Deep Learning

Müller, Adam T., Rögelein, Tobias, Stache, Nicolaj C.

arXiv.org Machine LearningApr-15-2026

The deployment of deep neural networks in safety-critical systems necessitates reliable and efficient uncertainty quantification (UQ). A practical and widespread strategy for UQ is repurposing stochastic regularizers as scalable approximate Bayesian inference methods, such as Monte Carlo Dropout (MCD) and MC-DropBlock (MCDB). However, this paradigm remains under-explored for Stochastic Depth (SD), a regularizer integral to the residual-based backbones of most modern architectures. While prior work demonstrated its empirical promise for segmentation, a formal theoretical connection to Bayesian variational inference and a benchmark on complex, multi-task problems like object detection are missing. In this paper, we first provide theoretical insights connecting Monte Carlo Stochastic Depth (MCSD) to principled approximate variational inference. We then present the first comprehensive empirical benchmark of MCSD against MCD and MCDB on state-of-the-art detectors (YOLO, RT-DETR) using the COCO and COCO-O datasets. Our results position MCSD as a robust and computationally efficient method that achieves highly competitive predictive accuracy (mAP), notably yielding slight improvements in calibration (ECE) and uncertainty ranking (AUARC) compared to MCD. We thus establish MCSD as a theoretically-grounded and empirically-validated tool for efficient Bayesian approximation in modern deep learning.

artificial intelligence, machine learning, mcsd, (17 more...)

arXiv.org Machine Learning

2604.12719

Country: Europe > Germany (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

Add feedback

37bc2f75bf1bcfe8450a1a41c200364c-Paper.pdf

Neural Information Processing SystemsMar-23-2026, 06:20:22 GMT

artificial intelligence, machine learning, residual network, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

a1140a3d0df1c81e24ae954d935e8926-Paper.pdf

Neural Information Processing SystemsFeb-9-2026, 15:05:07 GMT

baseline, international conference, preln, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(8 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Regularization in ResNet with Stochastic Depth

Neural Information Processing SystemsDec-24-2025, 09:32:24 GMT

Regularization plays a major role in modern deep learning. From classic techniques such as L1, L2 penalties to other noise-based methods such as Dropout, regularization often yields better generalization properties by avoiding overfitting. Recently, Stochastic Depth (SD) has emerged as an alternative regularization technique for residual neural networks (ResNets) and has proven to boost the performance of ResNet on many tasks [Huang et al., 2016]. Despite the recent success of SD, little is known about this technique from a theoretical perspective. This paper provides a hybrid analysis combining perturbation analysis and signal propagation to shed light on different regularization effects of SD. Our analysis allows us to derive principled guidelines for choosing the survival rates used for training with SD.

regularization, resnet, stochastic depth, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)

Add feedback

Mathematical Foundations of Neural Tangents and Infinite-Width Networks

Mysore, Rachana, Girish, Preksha, Jayaram, Kavitha, Kumar, Shrey, Girish, Preksha, Bagal, Shravan Sanjeev, Jayaram, Kavitha, Shastry, Shreya Aravind

arXiv.org Artificial IntelligenceDec-10-2025

We investigate the mathematical foundations of neural networks in the infinite-width regime through the Neural Tangent Kernel (NTK). We propose the NTK-Eigenvalue-Controlled Residual Network (NTK-ECRN), an architecture integrating Fourier feature embeddings, residual connections with layerwise scaling, and stochastic depth to enable rigorous analysis of kernel evolution during training. Our theoretical contributions include deriving bounds on NTK dynamics, characterizing eigenvalue evolution, and linking spectral properties to generalization and optimization stability. Empirical results on synthetic and benchmark datasets validate the predicted kernel behavior and demonstrate improved training stability and generalization. This work provides a comprehensive framework bridging infinite-width theory and practical deep-learning architectures.

artificial intelligence, evolution, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2512.08264

Country: Asia > India (0.16)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

a1140a3d0df1c81e24ae954d935e8926-Paper.pdf

Neural Information Processing SystemsAug-15-2025, 12:38:33 GMT

baseline, international conference, preln, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(8 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Leveraging Stochastic Depth Training for Adaptive Inference

Korol, Guilherme, Beck, Antonio Carlos Schneider, Castrillon, Jeronimo

arXiv.org Artificial IntelligenceMay-26-2025

Dynamic DNN optimization techniques such as layer-skipping offer increased adaptability and efficiency gains but can lead to i) a larger memory footprint as in decision gates, ii) increased training complexity (e.g., with non-differentiable operations), and iii) less control over performance-quality trade-offs due to its inherent input-dependent execution. To approach these issues, we propose a simpler yet effective alternative for adaptive inference with a zero-overhead, single-model, and time-predictable inference. Central to our approach is the observation that models trained with Stochastic Depth -- a method for faster training of residual networks -- become more resilient to arbitrary layer-skipping at inference time. We propose a method to first select near Pareto-optimal skipping configurations from a stochastically-trained model to adapt the inference at runtime later. Compared to original ResNets, our method shows improvements of up to 2X in power efficiency at accuracy drops as low as 0.71%.

artificial intelligence, configuration, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.17626

Country:

Europe > Germany (0.28)
South America > Brazil (0.28)

Genre: Research Report (0.65)

Industry: Information Technology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Regularization in ResNet with Stochastic Depth

Neural Information Processing SystemsOct-11-2024, 13:04:15 GMT

Regularization plays a major role in modern deep learning. From classic techniques such as L1, L2 penalties to other noise-based methods such as Dropout, regularization often yields better generalization properties by avoiding overfitting. Recently, Stochastic Depth (SD) has emerged as an alternative regularization technique for residual neural networks (ResNets) and has proven to boost the performance of ResNet on many tasks [Huang et al., 2016]. Despite the recent success of SD, little is known about this technique from a theoretical perspective. This paper provides a hybrid analysis combining perturbation analysis and signal propagation to shed light on different regularization effects of SD.

regularization, resnet, stochastic depth

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

A Novel Stochastic Transformer-based Approach for Post-Traumatic Stress Disorder Detection using Audio Recording of Clinical Interviews

Dia, Mamadou, Khodabandelou, Ghazaleh, Othmani, Alice

arXiv.org Artificial IntelligenceMar-28-2024

Post-traumatic stress disorder (PTSD) is a mental disorder that can be developed after witnessing or experiencing extremely traumatic events. PTSD can affect anyone, regardless of ethnicity, or culture. An estimated one in every eleven people will experience PTSD during their lifetime. The Clinician-Administered PTSD Scale (CAPS) and the PTSD Check List for Civilians (PCL-C) interviews are gold standards in the diagnosis of PTSD. These questionnaires can be fooled by the subject's responses. This work proposes a deep learning-based approach that achieves state-of-the-art performances for PTSD detection using audio recordings during clinical interviews. Our approach is based on MFCC low-level features extracted from audio recordings of clinical interviews, followed by deep high-level learning using a Stochastic Transformer. Our proposed approach achieves state-of-the-art performances with an RMSE of 2.92 on the eDAIC dataset thanks to the stochastic depth, stochastic deep learning layers, and stochastic activation function.

activation function, ptsd, transformer, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/CBMS58004.2023.00303

2403.19441

Country: