AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Neural Information Processing SystemsAug-12-2025, 22:23:54 GMT

Tractable Bayesian Network Structure Learning with Bounded Vertex Cover Number

Both learning and inference tasks on Bayesian networks are NP-hard in general. Bounded tree-width Bayesian networks have recently received a lot of attention as a way to circumvent this complexity issue; however, while inference on bounded tree-width networks is tractable, the learning problem remains NP-hard even for tree-width~2. In this paper, we propose bounded vertex cover number Bayesian networks as an alternative to bounded tree-width networks. In particular, we show that both inference and learning can be done in polynomial time for any fixed vertex cover number bound $k$, in contrast to the general and bounded tree-width cases; on the other hand, we also show that learning problem is W[1]-hard in parameter $k$. Furthermore, we give an alternative way to learn bounded vertex cover number Bayesian networks using integer linear programming (ILP), and show this is feasible in practice.

bounded vertex cover number, name change, tractable bayesian network structure learning, (3 more...)

Industry: Education > Focused Education > Special Education (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Information Processing SystemsAug-12-2025, 21:56:45 GMT

On-the-Job Learning with Bayesian Decision Theory

Our goal is to deploy a high-accuracy system starting with zero training examples. We consider an "on-the-job" setting, where as inputs arrive, we use real-time crowdsourcing to resolve uncertainty where needed and output our prediction when confident. As the model improves over time, the reliance on crowdsourcing queries decreases. We cast our setting as a stochastic game based on Bayesian decision theory, which allows us to balance latency, cost, and accuracy objectives in a principled way. Computing the optimal policy is intractable, so we develop an approximation based on Monte Carlo Tree Search. We tested our approach on three datasets-- named-entity recognition, sentiment classification, and image classification. On the NER task we obtained more than an order of magnitude reduction in cost compared to full human annotation, while boosting performance relative to the expert provided labels. We also achieve a 8% F1 improvement over having a single human label the whole set, and a 28% F1 improvement over online learning.

Neural Information Processing SystemsAug-12-2025, 21:48:33 GMT

Learning Bayesian Networks with Thousands of Variables

We present a method for learning Bayesian networks from data sets containingthousands of variables without the need for structure constraints. Our approachis made of two parts. The first is a novel algorithm that effectively explores thespace of possible parent sets of a node.

electronic proceedings, learning bayesian network, name change, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Information Processing SystemsAug-12-2025, 21:41:24 GMT

Fast Classification Rates for High-dimensional Gaussian Generative Models

We consider the problem of binary classification when the covariates conditioned on the each of the response values follow multivariate Gaussian distributions. We focus on the setting where the covariance matrices for the two conditional distributions are the same. The corresponding generative model classifier, derived via the Bayes rule, also called Linear Discriminant Analysis, has been shown to behave poorly in high-dimensional settings. We present a novel analysis of the classification error of any linear discriminant approach given conditional Gaussian models. This allows us to compare the generative model classifier, other recently proposed discriminative approaches that directly learn the discriminant function, and then finally logistic regression which is another classical discriminative model classifier. As we show, under a natural sparsity assumption, and letting $s$ denote the sparsity of the Bayes classifier, $p$ the number of covariates, and $n$ the number of samples, the simple ($\ell_1$-regularized) logistic regression classifier achieves the fast misclassification error rates of $O\left(\frac{s \log p}{n}\right)$, which is much better than the other approaches, which are either inconsistent under high-dimensional settings, or achieve a slower rate of $O\left(\sqrt{\frac{s \log p}{n}}\right)$.

classifier, fast classification rate, high-dimensional gaussian generative model, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)

arXiv.org Machine LearningAug-12-2025

Uncertainty-Driven Reliability: Selective Prediction and Trustworthy Deployment in Modern Machine Learning

Rabanser, Stephan

Machine learning (ML) systems are increasingly deployed in high-stakes domains where reliability is paramount. This thesis investigates how uncertainty estimation can enhance the safety and trustworthiness of ML, focusing on selective prediction -- where models abstain when confidence is low. We first show that a model's training trajectory contains rich uncertainty signals that can be exploited without altering its architecture or loss. By ensembling predictions from intermediate checkpoints, we propose a lightweight, post-hoc abstention method that works across tasks, avoids the cost of deep ensembles, and achieves state-of-the-art selective prediction performance. Crucially, this approach is fully compatible with differential privacy (DP), allowing us to study how privacy noise affects uncertainty quality. We find that while many methods degrade under DP, our trajectory-based approach remains robust, and we introduce a framework for isolating the privacy-uncertainty trade-off. Next, we then develop a finite-sample decomposition of the selective classification gap -- the deviation from the oracle accuracy-coverage curve -- identifying five interpretable error sources and clarifying which interventions can close the gap. This explains why calibration alone cannot fix ranking errors, motivating methods that improve uncertainty ordering. Finally, we show that uncertainty signals can be adversarially manipulated to hide errors or deny service while maintaining high accuracy, and we design defenses combining calibration audits with verifiable inference. Together, these contributions advance reliable ML by improving, evaluating, and safeguarding uncertainty estimation, enabling models that not only make accurate predictions -- but also know when to say "I do not know".

artificial intelligence, inductive learning, selective classification training dynamic, (18 more...)

arXiv.org Machine Learning

2508.07556

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
Health & Medicine > Diagnostic Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(5 more...)

Taherpour, Amirhossein, Taherpour, Abbas, Khattab, Tamer

Adaptive Learning for IRS-Assisted Wireless Networks: Securing Opportunistic Communications Against Byzantine Eavesdroppers

We propose a joint learning framework for Byzantine-resilient spectrum sensing and secure intelligent reflecting surface (IRS)--assisted opportunistic access under channel state information (CSI) uncertainty. The sensing stage performs logit-domain Bayesian updates with trimmed aggregation and attention-weighted consensus, and the base station (BS) fuses network beliefs with a conservative minimum rule, preserving detection accuracy under a bounded number of Byzantine users. Conditioned on the sensing outcome, we pose downlink design as sum mean-squared error (MSE) minimization under transmit-power and signal-leakage constraints and jointly optimize the BS precoder, IRS phase shifts, and user equalizers. With partial (or known) CSI, we develop an augmented-Lagrangian alternating algorithm with projected updates and provide provable sublinear convergence, with accelerated rates under mild local curvature. With unknown CSI, we perform constrained Bayesian optimization (BO) in a geometry-aware low-dimensional latent space using Gaussian process (GP) surrogates; we prove regret bounds for a constrained upper confidence bound (UCB) variant of the BO module, and demonstrate strong empirical performance of the implemented procedure. Simulations across diverse network conditions show higher detection probability at fixed false-alarm rate under adversarial attacks, large reductions in sum MSE for honest users, strong suppression of eavesdropper signal power, and fast convergence. The framework offers a practical path to secure opportunistic communication that adapts to CSI availability while coherently coordinating sensing and transmission through joint learning.

artificial intelligence, constraint, machine learning, (17 more...)

2508.08206

Country: North America > United States (0.94)

Genre: Research Report (1.00)

Industry:

Government > Tax (0.85)
Government > Regional Government > North America Government > United States Government (0.85)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Neha, Samiha Afaf, Bhuiyan, Abir Ahammed, Khan, Md. Ishrak

ProteoKnight: Convolution-based phage virion protein classification and uncertainty analysis

Proteins are essential macromolecules responsible for the structural and functional mechanisms of living things. Phage structural proteins are a subclass of the protein family corresponding to the bacteriophages, which are the most prevalent kind of biological creatures [1]. The PVPs are mainly associated with building structural constituents of the bacteriophage [2], such as the baseplate and head as shown in Figure 1, enabling efficient host-phage binding for genome transfer. The amino acid sequences, particularly those involved in structure synthesis, exhibit significant diversity among phages and their respective groups [3]. Consequently, characterizing these sequences becomes a challenging yet crucial task, as the shortage of annotations for phage proteins has emerged as a hindrance in numerous research studies focused on phage genomics [4]. With the advent of high-throughput sequencing technology, rapid additions of sequences are being observed in standard biological databases [6] as shown in Figure 1, giving rise to the need for proper annotation of the sequences. Enhancement of annotations for the phage family is vital for exploring effective anti-bacterial drug synthesis [7, 8], disease diagnosis [9], food production [10], bacterial genome remodeling [11], etc. Traditionally, mass spectrometry and protein array techniques [12, 13] were utilized for characterizing these proteins. However, these methods are associated with significant time, labor, and computational expenses [1]. Similarly, alignment-based methods do not always work well with PVP categorization due to the lack of collinearity [14] in viral genomes, resulting from factors such as horizontal gene transfer, high mutation rates, etc., leading to dissimilar sequences 2 Figure 1 Structural constituents of Phage

bioinformatics, machine learning, natural language, (18 more...)

2508.07345

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(5 more...)

A Two-stage Optimization Method for Wide-range Single-electron Quantum Magnetic Sensing

Guo, Shiqian, Liu, Jianqing, Le, Thinh, Dai, Huaiyu

Quantum magnetic sensing based on spin systems has emerged as a new paradigm for detecting ultra-weak magnetic fields with unprecedented sensitivity, revitalizing applications in navigation, geo-localization, biology, and beyond. At the heart of quantum magnetic sensing, from the protocol perspective, lies the design of optimal sensing parameters to manifest and then estimate the underlying signals of interest (SoI). Existing studies on this front mainly rely on adaptive algorithms based on black-box AI models or formula-driven principled searches. However, when the SoI spans a wide range and the quantum sensor has physical constraints, these methods may fail to converge efficiently or optimally, resulting in prolonged interrogation times and reduced sensing accuracy. In this work, we report the design of a new protocol using a two-stage optimization method. In the 1st Stage, a Bayesian neural network with a fixed set of sensing parameters is used to narrow the range of SoI. In the 2nd Stage, a federated reinforcement learning agent is designed to fine-tune the sensing parameters within a reduced search space. The proposed protocol is developed and evaluated in a challenging context of single-shot readout of an NV-center electron spin under a constrained total sensing time budget; and yet it achieves significant improvements in both accuracy and resource efficiency for wide-range D.C. magnetic field estimation compared to the state of the art.

artificial intelligence, machine learning, reinforcement learning, (21 more...)

2506.13469

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Nafar, Aliakbar, Venable, Kristen Brent, Cui, Zijun, Kordjamshidi, Parisa

Extracting Probabilistic Knowledge from Large Language Models for Bayesian Network Parameterization

In this work, we evaluate the potential of Large Language Models (LLMs) in building Bayesian Networks (BNs) by approximating domain expert priors. LLMs have demonstrated potential as factual knowledge bases; however, their capability to generate probabilistic knowledge about real-world events remains understudied. We explore utilizing the probabilistic knowledge inherent in LLMs to derive probability estimates for statements regarding events and their relationships within a BN. Using LLMs in this context allows for the parameterization of BNs, enabling probabilistic modeling within specific domains. Our experiments on eighty publicly available Bayesian Networks, from healthcare to finance, demonstrate that querying LLMs about the conditional probabilities of events provides meaningful results when compared to baselines, including random and uniform distributions, as well as approaches based on next-token generation probabilities. We explore how these LLM-derived distributions can serve as expert priors to refine distributions extracted from data, especially when data is scarce. Overall, this work introduces a promising strategy for automatically constructing Bayesian Networks by combining probabilistic knowledge extracted from LLMs with real-world data. Additionally, we establish the first comprehensive baseline for assessing LLM performance in extracting probabilistic knowledge.

kl divergence, large language model, machine learning, (21 more...)

2505.15918

Country:

North America > United States (0.28)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)