Goto

Collaborating Authors

 Bayesian Learning


Efficient Hate Speech Detection: Evaluating 38 Models from Traditional Methods to Transformers

arXiv.org Artificial Intelligence

The proliferation of hate speech on social media necessitates automated detection systems that balance accuracy with computational efficiency. This study evaluates 38 model configurations in detecting hate speech across datasets ranging from 6.5K to 451K samples. We analyze transformer architectures (e.g., BERT, RoBERTa, Distil-BERT), deep neural networks (e.g., CNN, LSTM, GRU, Hierarchical Attention Networks), and traditional machine learning methods (e.g., SVM, CatBoost, Random Forest). Our results show that transformers, particularly RoBERTa, consistently achieve superior performance with accuracy and F1-scores exceeding 90%. Among deep learning approaches, Hierarchical Attention Networks yield the best results, while traditional methods like CatBoost and SVM remain competitive, achieving F1-scores above 88% with significantly lower computational costs. Additionally, our analysis highlights the importance of dataset characteristics, with balanced, moderately sized unprocessed datasets outperforming larger, preprocessed datasets. These findings offer valuable insights for developing efficient and effective hate speech detection systems.


Proximity-Based Evidence Retrieval for Uncertainty-Aware Neural Networks

arXiv.org Artificial Intelligence

Abstract--This work proposes an evidence-retrieval mechanism for uncertainty-aware decision-making that replaces a single global cutoff with an evidence-conditioned, instance-adaptive criterion. For each test instance, proximal exemplars are retrieved in an embedding space; their predictive distributions are fused via Dempster-Shafer theory. Because the supporting evidences are explicit, decisions are transparent and auditable. Experiments on CIF AR-10/100 with BiT and ViT backbones show higher or comparable uncertainty-aware performance with materially fewer confidently incorrect outcomes and a sustainable review load compared with applying threshold on prediction entropy. Notably, only a few evidences are sufficient to realize these gains; increasing the evidence set yields only modest changes. These results indicate that evidence-conditioned tagging provides a more reliable and interpretable alternative to fixed prediction entropy thresholds for operational uncertainty-aware decision-making. N the landscape of modern artificial intelligence (AI), the pursuit of predictive accuracy has driven neural networks (NNs) to achieve superhuman performance across a multitude of domains. However, in many real-world applications, particularly those with high stakes, a correct prediction is only part of the requirement. This is crucial because most conventional machine learning (ML) models issue single-point predictions. In particular, NNs typically output class probabilities through a softmax layer, which represent only a deterministic point estimate conditioned on the model's fixed parameters and training data. These probabilities reflect the model's relative preference among classes given its fixed state after training. High probability does not necessarily imply that the prediction is reliable. This is where uncertainty quantification (UQ) methods emerges as a critical paradigm.


Physics-based deep kernel learning for parameter estimation in high dimensional PDEs

arXiv.org Artificial Intelligence

Inferring parameters of high-dimensional partial differential equations (PDEs) poses significant computational and inferential challenges, primarily due to the curse of dimensionality and the inherent limitations of traditional numerical methods. This paper introduces a novel two-stage Bayesian framework that synergistically integrates training, physics-based deep kernel learning (DKL) with Hamiltonian Monte Carlo (HMC) to robustly infer unknown PDE parameters and quantify their uncertainties from sparse, exact observations. The first stage leverages physics-based DKL to train a surrogate model, which jointly yields an optimized neural network feature extractor and robust initial estimates for the PDE parameters. In the second stage, with the neural network weights fixed, HMC is employed within a full Bayesian framework to efficiently sample the joint posterior distribution of the kernel hyperparameters and the PDE parameters. Numerical experiments on canonical and high-dimensional inverse PDE problems demonstrate that our framework accurately estimates parameters, provides reliable uncertainty estimates, and effectively addresses challenges of data sparsity and model complexity, offering a robust and scalable tool for diverse scientific and engineering applications.


CrowdAgent: Multi-Agent Managed Multi-Source Annotation System

arXiv.org Artificial Intelligence

High-quality annotated data is a cornerstone of modern Natural Language Processing (NLP). While recent methods begin to leverage diverse annotation sources-including Large Language Models (LLMs), Small Language Models (SLMs), and human experts-they often focus narrowly on the labeling step itself. A critical gap remains in the holistic process control required to manage these sources dynamically, addressing complex scheduling and quality-cost trade-offs in a unified manner. Inspired by real-world crowdsourcing companies, we introduce CrowdAgent, a multi-agent system that provides end-to-end process control by integrating task assignment, data annotation, and quality/cost management. It implements a novel methodology that rationally assigns tasks, enabling LLMs, SLMs, and human experts to advance synergistically in a collaborative annotation workflow. We demonstrate the effectiveness of CrowdAgent through extensive experiments on six diverse multimodal classification tasks. The source code and video demo are available at https://github.com/QMMMS/CrowdAgent.


A Conformal Prediction Framework for Uncertainty Quantification in Physics-Informed Neural Networks

arXiv.org Artificial Intelligence

Physics-Informed Neural Networks (PINNs) have emerged as a powerful framework for solving PDEs, yet existing uncertainty quantification (UQ) approaches for PINNs generally lack rigorous statistical guarantees. This framework calibrates prediction intervals by constructing nonconformity scores on a calibration set, thereby yielding distribution-free uncertainty estimates with rigorous finite-sample coverage guarantees for PINNs. To handle spatial het-eroskedasticity, we further introduce local conformal quantile estimation, enabling spatially adaptive uncertainty bands while preserving theoretical guarantee. Through systematic evaluations on typical PDEs (damped harmonic oscillator, Poisson, Allen-Cahn, and Helmholtz equations) and comprehensive testing across multiple uncertainty metrics, our results demonstrate that the proposed framework achieves reliable calibration and locally adaptive uncertainty intervals, consistently outperforming heuristic UQ approaches. By bridging PINNs with distribution-free UQ, this work introduces a general framework that not only enhances calibration and reliability, but also opens new avenues for uncertainty-aware modeling of complex PDE systems.1. Introduction Physics-Informed Neural Networks (PINNs) have emerged as a versatile framework for solving partial differential equations (PDEs) by embedding physical laws into neural network training [1, 2]. Numerous variants have been developed to enhance accuracy, efficiency, and applicability [3, 4, 5, 6, 7, 8], enabling PINNs to address complex geometries [9, 10], high-dimensional and multiscale problems [11, 12, 13], and inverse formulations [14, 15] within a unified mesh-free paradigm. Applications span fluid mechanics [16, 17], heat transfer [18, 19], and materials science [20, 21]; see [16, 22, 23, 24, 25] for comprehensive reviews.


Learning Discrete Bayesian Networks with Hierarchical Dirichlet Shrinkage

arXiv.org Machine Learning

Discrete Bayesian networks (DBNs) provide a broadly useful framework for modeling dependence structures in multivariate categorical data. There is a vast literature on methods for inferring conditional probabilities and graphical structure in DBNs, but data sparsity and parametric assumptions are major practical issues. In this article, we detail a comprehensive Bayesian framework for learning DBNs. First, we propose a hierarchical prior for the conditional probabilities that enables complicated interactions between parent variables and stability in sparse regimes. We give a novel Markov chain Monte Carlo (MCMC) algorithm utilizing parallel Langevin proposals to generate exact posterior samples, avoiding the pitfalls of variational approximations. Moreover, we verify that the full conditional distribution of the concentration parameters is log-concave under mild conditions, facilitating efficient sampling. We then propose two methods for learning network structures, including parent sets, Markov blankets, and DAGs, from categorical data. The first cycles through individual edges each MCMC iteration, whereas the second updates the entire structure as a single step. We evaluate the accuracy, power, and MCMC performance of our methods on several simulation studies. Finally, we apply our methodology to uncover prognostic network structure from primary breast cancer samples.


Bayesian Parametric Matrix Models: Principled Uncertainty Quantification for Spectral Learning

arXiv.org Machine Learning

Scientific machine learning increasingly uses spectral methods to understand physical systems. Current spectral learning approaches provide only point estimates without uncertainty quantification, limiting their use in safety-critical applications where prediction confidence is essential. Parametric matrix models have emerged as powerful tools for scientific machine learning, achieving exceptional performance by learning governing equations. However, their deterministic nature limits deployment in uncertainty quantification applications. We introduce Bayesian parametric matrix models (B-PMMs), a principled framework that extends PMMs to provide uncertainty estimates while preserving their spectral structure and computational efficiency. B-PMM addresses the fundamental challenge of quantifying uncertainty in matrix eigenvalue problems where standard Bayesian methods fail due to the geometric constraints of spectral decomposition. The theoretical contributions include: (i) adaptive spectral decomposition with regularized matrix perturbation bounds that characterize eigenvalue uncertainty propagation, (ii) structured variational inference algorithms using manifold-aware matrix-variate Gaussian posteriors that respect Hermitian constraints, and (iii) finite-sample calibration guarantees with explicit dependence on spectral gaps and problem conditioning. Experimental validation across matrix dimensions from 5x5 to 500x500 with perfect convergence rates demonstrates that B-PMMs achieve exceptional uncertainty calibration (ECE < 0.05) while maintaining favorable scaling. The framework exhibits graceful degradation under spectral ill-conditioning and provides reliable uncertainty estimates even in near-degenerate regimes. The proposed framework supports robust spectral learning in uncertainty-critical domains and lays the groundwork for broader Bayesian spectral machine learning.


On the Correlation between Individual Fairness and Predictive Accuracy in Probabilistic Models

arXiv.org Artificial Intelligence

We investigate individual fairness in generative probabilistic classifiers by analysing the robustness of posterior inferences to perturbations in private features. Building on established results in robustness analysis, we hypothesise a correlation between robustness and predictive accuracy, specifically, instances exhibiting greater robustness are more likely to be classified accurately. We empirically assess this hypothesis using a benchmark of fourteen datasets with fairness concerns, employing Bayesian networks as the underlying generative models. To address the computational complexity associated with robustness analysis over multiple private features with Bayesian networks, we reformulate the problem as a most probable explanation task in an auxiliary Markov random field. Our experiments confirm the hypothesis about the correlation, suggesting novel directions to mitigate the traditional trade-off between fairness and accuracy.


EmbeddedML: A New Optimized and Fast Machine Learning Library

arXiv.org Artificial Intelligence

Machine learning models and libraries can train datasets of different sizes and perform prediction and classification operations, but machine learning models and libraries cause slow and long training times on large datasets. This article introduces EmbeddedML, a training-time-optimized and mathematically enhanced machine learning library. The speed was increased by approximately times compared to scikit-learn without any loss in terms of accuracy in regression models such as Multiple Linear Regression. Logistic Regression and Support Vector Machines (SVM) algorithms have been mathematically rewritten to reduce training time and increase accuracy in classification models. With the applied mathematical improvements, training time has been reduced by approximately 2 times for SVM on small datasets and by around 800 times on large datasets, and by approximately 4 times for Logistic Regression, compared to the scikit-learn implementation. In summary, the EmbeddedML library offers regression, classification, clustering, and dimensionality reduction algorithms that are mathematically rewritten and optimized to reduce training time.


Toward Ownership Understanding of Objects: Active Question Generation with Large Language Model and Probabilistic Generative Model

arXiv.org Artificial Intelligence

Robots operating in daily life environments must understand object ownership to carry out instructions naturally given by users, such as "Bring me my cup." Without ownership knowledge, a robot cannot determine which object is being referred to when multiple similar objects exist. This problem is especially evident in kitchens, offices, or laboratories, where objects with similar appearances may belong to different individuals. Relying solely on perceptual features such as location or appearance is insufficient because ownership is inherently context-dependent and often determined by social conventions. Therefore, enabling robots to acquire ownership knowledge is a crucial step toward socially appropriate human-robot interaction. To enable robots to learn object ownership in daily life environments, it is essential to implement a question-generation mechanism that efficiently acquires necessary information. However, in real-world environments with large numbers of objects, this is impractical and imposes a heavy burden on users. Although robots can explore the environment to collect visual features of objects, it remains difficult to obtain ownership knowledge because it depends on users and context. Therefore, allowing robots to ask questions based on the current situation enables them to acquire ownership knowl-Saki Hashimoto is the presenter of this paper.