Goto

Collaborating Authors

 Bayesian Learning


RTNinja: A generalized machine learning framework for analyzing random telegraph noise signals in nanoelectronic devices

arXiv.org Artificial Intelligence

Random telegraph noise is a prevalent variability phenomenon in nanoelectronic devices, arising from stochastic carrier exchange at defect sites and critically impacting device reliability and performance. Conventional analysis techniques often rely on restrictive assumptions or manual interventions, limiting their applicability to complex, noisy datasets. Here, we introduce RTNinja, a generalized, fully automated machine learning framework for the unsupervised analysis of random telegraph noise signals. RTNinja deconvolves complex signals to identify the number and characteristics of hidden individual sources without requiring prior knowledge of the system. The framework comprises two modular components: LevelsExtractor, which uses Bayesian inference and model selection to denoise and discretize the signal, and SourcesMapper, which infers source configurations through probabilistic clustering and optimization. To evaluate performance, we developed a Monte Carlo simulator that generates labeled datasets spanning broad signal-to-noise ratios and source complexities; across 7000 such datasets, RTNinja consistently demonstrated high-fidelity signal reconstruction and accurate extraction of source amplitudes and activity patterns. Our results demonstrate that RTNinja offers a robust, scalable, and device-agnostic tool for random telegraph noise characterization, enabling large-scale statistical benchmarking, reliability-centric technology qualification, predictive failure modeling, and device physics exploration in next-generation nanoelectronics.


How to Bridge the Sim-to-Real Gap in Digital Twin-Aided Telecommunication Networks

arXiv.org Artificial Intelligence

Abstract--Training effective artificial intelligence models for telecommunications is challenging due to the scarcity of deployment-specific data. Real data collection is expensive, and available datasets often fail to capture the unique operational conditions and contextual variability of the network environment. Digital twinning provides a potential solution to this problem, as simulators tailored to the current network deployment can generate site-specific data to augment the available training datasets. However, there is a need to develop solutions to bridge the inherent simulation-to-reality (sim-to-real) gap between synthetic and real-world data. This paper reviews recent advances on two complementary strategies: 1) the calibration of digital twins (DTs) through real-world measurements, and 2) the use of sim-to-real gap-aware training strategies to robustly handle residual discrepancies between digital twin-generated and real data. For the latter, we evaluate two conceptually distinct methods that model the sim-to-real gap either at the level of the environment via Bayesian learning or at the level of the training loss via prediction-powered inference. Driven by the continued growth of computing resources and training datasets, artificial intelligence (AI) research is widely considered to be in the scaling era, which is focused on the development of general-purpose models that exhibit emergent capabilities. While this trend has yielded impressive results for many tasks, particularly in the domain of language modeling, it poses unique challenges when applied to engineering domains such as telecommunication networks.


Multiscale guidance of protein structure prediction with heterogeneous cryo-EM data

arXiv.org Artificial Intelligence

Protein structure prediction models are now capable of generating accurate 3D structural hypotheses from sequence alone. However, they routinely fail to capture the conformational diversity of dynamic biomolecular complexes, often requiring heuristic MSA subsampling approaches for generating alternative states. In parallel, cryo-electron microscopy (cryo-EM) has emerged as a powerful tool for imaging near-native structural heterogeneity, but is challenged by arduous pipelines to transform raw experimental data into atomic models. Here, we bridge the gap between these modalities, combining cryo-EM density maps with the rich sequence and biophysical priors learned by protein structure prediction models. Our method, CryoBoltz, guides the sampling trajectory of a pretrained biomolecular structure prediction model using both global and local structural constraints derived from density maps, driving predictions towards conformational states consistent with the experimental data. We demonstrate that this flexible yet powerful inference-time approach allows us to build atomic models into heterogeneous cryo-EM maps across a variety of dynamic biomolecular systems including transporters and antibodies. Code is available at https://github.com/ml-struct-bio/cryoboltz .


The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning

arXiv.org Artificial Intelligence

We propose semantic anchoring, a unified account of how large language models turn pretrained capacity into goal-directed behavior: external structure (in-context examples, retrieval, or light tuning) binds the model's latent patterns to desired targets. Unified Contextual Control Theory (UCCT) formalizes this via anchoring strength $S = ρ_d - d_r - \log k$, where $ρ_d$ measures target cohesion in representation space, $d_r$ measures mismatch from prior knowledge, and $k$ is the anchor budget. UCCT predicts threshold-like performance flips and strictly generalizes in-context learning, reading retrieval and fine-tuning as anchoring variants. Three controlled studies provide evidence. Experiment 1 demonstrates cross-domain anchoring rebinding strong priors in text and vision. Experiment 2 varies representational familiarity via numeral bases (base-10/8/9) at fixed complexity, yielding ordered thresholds and transfer patterns tracking $ρ_d$, $d_r$, and $S$. Experiment 3 establishes a geometry-to-behavior correlate: layer-wise peak anchoring and trajectory area predict few-shot thresholds $θ_{50}$. UCCT offers testable theory and practical metrics for optimizing prompts, retrieval, and tuning.


Accelerated Execution of Bayesian Neural Networks using a Single Probabilistic Forward Pass and Code Generation

arXiv.org Machine Learning

Machine learning models perform well across domains such as diagnostics, weather forecasting, NLP, and autonomous driving, but their limited uncertainty handling restricts use in safety-critical settings. Traditional neural networks often fail to detect out-of-domain (OOD) data and may output confident yet incorrect predictions. Bayesian neural networks (BNNs) address this by providing probabilistic estimates, but incur high computational cost because predictions require sampling weight distributions and multiple forward passes. The Probabilistic Forward Pass (PFP) offers a highly efficient approximation to Stochastic Variational Inference (SVI) by assuming Gaussian-distributed weights and activations, enabling fully analytic uncertainty propagation and replacing sampling with a single deterministic forward pass. We present an end-to-end pipeline for training, compiling, optimizing, and deploying PFP-based BNNs on embedded ARM CPUs. Using the TVM deep learning compiler, we implement a dedicated library of Gaussian-propagating operators for multilayer perceptrons and convolutional neural networks, combined with manual and automated tuning strategies. Ablation studies show that PFP consistently outperforms SVI in computational efficiency, achieving speedups of up to 4200x for small mini-batches. PFP-BNNs match SVI-BNNs on Dirty-MNIST in accuracy, uncertainty estimation, and OOD detection while greatly reducing compute cost. These results highlight the potential of combining Bayesian approximations with code generation to enable efficient BNN deployment on resource-constrained systems.


Data-driven informative priors for Bayesian inference with quasi-periodic data

arXiv.org Machine Learning

Bayesian computational strategies for inference can be inefficient in approximating the posterior distribution in models that exhibit some form of periodicity. This is because the probability mass of the marginal posterior distribution of the parameter representing the period is usually highly concentrated in a very small region of the parameter space. Therefore, it is necessary to provide as much information as possible to the inference method through the parameter prior distribution. We intend to show that it is possible to construct a prior distribution from the data by fitting a Gaussian process (GP) with a periodic kernel. More specifically, we want to show that it is possible to approximate the marginal posterior distribution of the hyperparameter corresponding to the period in the kernel. Subsequently, this distribution can be used as a prior distribution for the inference method. We use an adaptive importance sampling method to approximate the posterior distribution of the hyperparameters of the GP. Then, we use the marginal posterior distribution of the hyperparameter related to the periodicity in order to construct a prior distribution for the period of the parametric model. This workflow is empirical Bayes, implemented as a modular (cut) transfer of a GP posterior for the period to the parametric model. We applied the proposed methodology to both synthetic and real data. We approximated the posterior distribution of the period of the GP kernel and then passed it forward as a posterior-as-prior with no feedback. Finally, we analyzed its impact on the marginal posterior distribution.


Support Vector Machine Classifier with Rescaled Huberized Pinball Loss

arXiv.org Machine Learning

Support vector machines are widely used in machine learning classification tasks, but traditional SVM models suffer from sensitivity to outliers and instability in resampling, which limits their performance in practical applications. To address these issues, this paper proposes a novel rescaled Huberized pinball loss function with asymmetric, non-convex, and smooth properties. Based on this loss function, we develop a corresponding SVM model called RHPSVM (Rescaled Huberized Pinball Loss Support Vector Machine). Theoretical analyses demonstrate that RHPSVM conforms to Bayesian rules, has a strict generalization error bound, a bounded influence function, and controllable optimality conditions, ensuring excellent classification accuracy, outlier insensitivity, and resampling stability. Additionally, RHPSVM can be extended to various advanced SVM variants by adjusting parameters, enhancing its flexibility. We transform the non-convex optimization problem of RHPSVM into a series of convex subproblems using the concave-convex procedure (CCCP) and solve it with the ClipDCD algorithm, which is proven to be convergent. Experimental results on simulated data, UCI datasets, and small-sample crop leaf image classification tasks show that RHPSVM outperforms existing SVM models in both noisy and noise-free scenarios, especially in handling high-dimensional small-sample data.


On the Effect of Regularization on Nonparametric Mean-Variance Regression

arXiv.org Machine Learning

Uncertainty quantification is vital for decision-making and risk assessment in machine learning. Mean-variance regression models, which predict both a mean and residual noise for each data point, provide a simple approach to uncertainty quantification. However, overparameterized mean-variance models struggle with signal-to-noise ambiguity, deciding whether prediction targets should be attributed to signal (mean) or noise (variance). At one extreme, models fit all training targets perfectly with zero residual noise, while at the other, they provide constant, uninformative predictions and explain the targets as noise. We observe a sharp phase transition between these extremes, driven by model regularization. Empirical studies with varying regularization levels illustrate this transition, revealing substantial variability across repeated runs. To explain this behavior, we develop a statistical field theory framework, which captures the observed phase transition in alignment with experimental results. This analysis reduces the regularization hyperparameter search space from two dimensions to one, significantly lowering computational costs. Experiments on UCI datasets and the large-scale ClimSim dataset demonstrate robust calibration performance, effectively quantifying predictive uncertainty.


Bayesian-based Online Label Shift Estimation with Dynamic Dirichlet Priors

arXiv.org Machine Learning

Label shift, a prevalent challenge in supervised learning, arises when the class prior distribution of test data differs from that of training data, leading to significant degradation in classifier performance. To accurately estimate the test priors and enhance classification accuracy, we propose a Bayesian framework for label shift estimation, termed Full Maximum A Posterior Label Shift (FMAPLS), along with its online version, online-FMAPLS. Leveraging batch and online Expectation-Maximization (EM) algorithms, these methods jointly and dynamically optimize Dirichlet hyperparameters $\boldsymbolα$ and class priors $\boldsymbolπ$, thereby overcoming the rigid constraints of the existing Maximum A Posterior Label Shift (MAPLS) approach. Moreover, we introduce a linear surrogate function (LSF) to replace gradient-based hyperparameter updates, yielding closed-form solutions that reduce computational complexity while retaining asymptotic equivalence. The online variant substitutes the batch E-step with a stochastic approximation, enabling real-time adaptation to streaming data. Furthermore, our theoretical analysis reveals a fundamental trade-off between online convergence rate and estimation accuracy. Extensive experiments on CIFAR100 and ImageNet datasets under shuffled long-tail and Dirichlet test priors demonstrate that FMAPLS and online-FMAPLS respectively achieve up to 40% and 12% lower KL divergence and substantial improvements in post-shift accuracy over state-of-the-art baselines, particularly under severe class imbalance and distributional uncertainty. These results confirm the robustness, scalability, and suitability of the proposed methods for large-scale and dynamic learning scenarios.


What If They Took the Shot? A Hierarchical Bayesian Framework for Counterfactual Expected Goals

arXiv.org Artificial Intelligence

This study develops a hierarchical Bayesian framework that integrates expert domain knowledge to quantify player-specific effects in expected goals (xG) estimation, addressing a limitation of standard models that treat all players as identical finishers. Using 9,970 shots from StatsBomb's 2015-16 data and Football Manager 2017 ratings, we combine Bayesian logistic regression with informed priors to stabilise player-level estimates, especially for players with few shots. The hierarchical model reduces posterior uncertainty relative to weak priors and achieves strong external validity: hierarchical and baseline predictions correlate at R2 = 0.75, while an XGBoost benchmark validated against StatsBomb xG reaches R2 = 0.833. The model uncovers interpretable specialisation profiles, including one-on-one finishing (Aguero, Suarez, Belotti, Immobile, Martial), long-range shooting (Pogba), and first-touch execution (Insigne, Salah, Gameiro). It also identifies latent ability in underperforming players such as Immobile and Belotti. The framework supports counterfactual "what-if" analysis by reallocating shots between players under identical contexts. Case studies show that Sansone would generate +2.2 xG from Berardi's chances, driven largely by high-pressure situations, while Vardy-Giroud substitutions reveal strong asymmetry: replacing Vardy with Giroud results in a large decline (about -7 xG), whereas the reverse substitution has only a small effect (about -1 xG). This work provides an uncertainty-aware tool for player evaluation, recruitment, and tactical planning, and offers a general approach for domains where individual skill and contextual factors jointly shape performance.