submodel
Synergy Between the Strong and the Weak: Spiking Neural Networks are Inherently Self-Distillers
Brain-inspired spiking neural networks (SNNs) promise to be a low-power alternative to computationally intensive artificial neural networks (ANNs), although performance gaps persist. Recent studies have improved the performance of SNNs through knowledge distillation, but rely on large teacher models or introduce additional training overhead. In this paper, we show that SNNs can be naturally deconstructed into multiple submodels for efficient self-distillation. We treat each timestep instance of the SNN as a submodel and evaluate its output confidence, thus efficiently identifying the strong and the weak. Based on this strong and weak relationship, we propose two efficient self-distillation schemes: (1) Strong2Weak: During training, the stronger "teacher" guides the weaker "student", effectively improving overall performance.
Optimal Adjustment Sets for Nonparametric Estimation of Weighted Controlled Direct Effect
The weighted controlled direct effect (WCDE) generalizes the standard controlled direct effect (CDE) by averaging over the mediator distribution, providing a robust estimate when treatment effects vary across mediator levels. This makes the WCDE especially relevant in fairness analysis, where it isolates the direct effect of an exposure on an outcome, independent of mediating pathways. This work establishes three fundamental advances for WCDE in observational studies: First, we establish necessary and sufficient conditions for the identifiability of the WCDE, clarifying when it diverges from the CDE. Next, we consider nonparametric estimation of the WCDE and derive its influence function, focusing on the class of regular and asymptotically linear estimators. Lastly, we characterize the optimal covariate adjustment set that minimizes the asymptotic variance, demonstrating how mediator-confounder interactions introduce distinct requirements compared to average treatment effect (ATE) estimation. Using synthetic and real-world data, we validate our theory numerically, showing that the proposed optimal valid adjustment set yields the lowest variance at practical sample sizes. Our results offer a principled framework for efficient estimation of direct effects in complex causal systems, with practical applications in fairness and mediation analysis.
PINN Balls: Scaling Second-Order Methods for PINNs with Domain Decomposition and Adaptive Sampling
Recent advances in Scientific Machine Learning have shown that second-order methods can enhance the training of Physics-Informed Neural Networks (PINNs), making them a suitable alternative to traditional numerical methods for Partial Differential Equations (PDEs). However, second-order methods induce large memory requirements, making them scale poorly with the model size. In this paper, we define a local Mixture of Experts (MoE) combining the parameter-efficiency of ensemble models and sparse coding to enable the use of second-order training. Our model - PINNBALLS - also features a fully learnable domain decomposition structure, achieved through the use of Adversarial Adaptive Sampling (AAS), which adapts the DD to the PDE and its domain. PINNBALLS achieves better accuracy than the state-of-the-art in scientific machine learning, while maintaining invaluable scalability properties and drawing from a sound theoretical background.
Quantifying and Attributing Submodel Uncertainty in Stochastic Simulation Models and Digital Twins
Ghasemloo, Mohammadmahdi, Eckman, David J., Li, Yaxian
Stochastic simulation is widely used to study complex systems composed of various interconnected subprocesses, such as input processes, routing and control logic, optimization routines, and data-driven decision modules. In practice, these subprocesses may be inherently unknown or too computationally intensive to directly embed in the simulation model. Replacing these elements with estimated or learned approximations introduces a form of epistemic uncertainty that we refer to as submodel uncertainty. This paper investigates how submodel uncertainty affects the estimation of system performance metrics. We develop a framework for quantifying submodel uncertainty in stochastic simulation models and extend the framework to digital-twin settings, where simulation experiments are repeatedly conducted with the model initialized from observed system states. Building on approaches from input uncertainty analysis, we leverage bootstrapping and Bayesian model averaging to construct quantile-based confidence or credible intervals for key performance indicators. We propose a tree-based method that decomposes total output variability and attributes uncertainty to individual submodels in the form of importance scores. The proposed framework is model-agnostic and accommodates both parametric and nonparametric submodels under frequentist and Bayesian modeling paradigms. A synthetic numerical experiment and a more realistic digital-twin simulation of a contact center illustrate the importance of understanding how and how much individual submodels contribute to overall uncertainty.
Efficient Meta Neural Heuristic for Multi-Objective Combinatorial Optimization (Appendix) A Model architecture The architecture of the base model in meta-learning is the same as POMO [ 26
Each sublayer adds a skip-connection (ADD) and batch normalization (BN). The decoder sequentially chooses a node according to a probability distribution produced by the node embeddings to construct a solution. The scaled symmetric sampling method is shown in Algorithm 2. The scaled factor The uniform division of the weight space is illustrated as follows. Thus, its approximate Pareto optimal solutions are commonly pursued. V ehicles must serve all the customers and finally return to the depot.