AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)
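As a minimal sketch of what the definition above means in practice, the joint distribution of a Bayesian network factorizes into one conditional per node given its parents in the DAG. The three-node network, the variable names, and every probability below are hypothetical values chosen only for illustration.

```python
from itertools import product

# Hypothetical DAG: Rain -> Sprinkler, Rain -> WetGrass, Sprinkler -> WetGrass.
# Each node maps to the ordered tuple of its parents.
parents = {
    "Rain": (),
    "Sprinkler": ("Rain",),
    "WetGrass": ("Sprinkler", "Rain"),
}

# Conditional probability tables: P(node is True | parent values).
# All numbers are illustrative assumptions, not taken from any dataset.
cpt_true = {
    "Rain": {(): 0.2},
    "Sprinkler": {(True,): 0.01, (False,): 0.4},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.0,
    },
}


def joint_probability(assignment):
    """Return P(assignment) as the product of P(node | parents) over all nodes."""
    prob = 1.0
    for node, pars in parents.items():
        key = tuple(assignment[p] for p in pars)
        p_true = cpt_true[node][key]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


def all_assignments():
    """Enumerate every True/False assignment to the network's variables."""
    names = list(parents)
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))


# Sanity check: the factorized joint sums to one over all assignments.
total = sum(joint_probability(a) for a in all_assignments())
```

Enumerating all assignments is only feasible for tiny networks; it is used here purely to check that the factorization defines a valid distribution.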

Complement or substitute? How AI increases the demand for human skills

Mäkelä, Elina, Stephany, Fabian

arXiv.org Artificial Intelligence, Dec-27-2024

The question of whether AI substitutes or complements human work is central to debates on the future of work. This paper examines the impact of AI on skill demand and compensation in the U.S. economy, analysing 12 million online job vacancies from 2018 to 2023. It investigates internal effects (within-job substitution and complementation) and external effects (across occupations, industries, and regions). Our findings reveal a significant increase in demand for AI-complementary skills, such as digital literacy, teamwork, and resilience, alongside rising wage premiums for these skills in AI roles like Data Scientist. Conversely, substitute skills, including customer service and text review, have declined in both demand and value within AI-related positions. Examining external effects, we find a notable rise in demand for complementary skills in non-AI roles linked to the growth of AI-related jobs in specific industries or regions. At the same time, there is a moderate decline in non-AI roles requiring substitute skills. Overall, AI's complementary effect is up to 50% larger than its substitution effect, resulting in net positive demand for skills. These results, replicated for the UK and Australia, highlight AI's transformative impact on workforce skill requirements. They suggest reskilling efforts should prioritise not only technical AI skills but also complementary skills like ethics and digital literacy.

data mining, machine learning, natural language, (21 more...)

2412.19754

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Education (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(6 more...)

Hybrid Local Causal Discovery

Ling, Zhaolong, Peng, Honghui, Zhang, Yiwen, Zhou, Peng, Wu, Xingyu, Yu, Kui, Wu, Xindong

arXiv.org Artificial Intelligence, Dec-27-2024

Local causal discovery aims to learn and distinguish the direct causes and effects of a target variable from observed data. Existing constraint-based local causal discovery methods use AND or OR rules in constructing the local causal skeleton, but using either rule alone is prone to producing cascading errors in the learned local causal skeleton, which in turn impairs the inference of local causal relationships. On the other hand, directly applying score-based global causal discovery methods to local causal discovery may randomly return incorrect results due to the existence of local equivalence classes. To address these issues, we propose a Hybrid Local Causal Discovery algorithm, called HLCD. Specifically, HLCD first uses a constraint-based approach combined with the OR rule to obtain a candidate skeleton and then employs a score-based method to eliminate redundant portions of the candidate skeleton. Furthermore, during the local causal orientation phase, HLCD distinguishes between V-structures and equivalence classes by comparing the local structure scores of the two, thereby avoiding orientation interference caused by local equivalence classes. We conducted extensive experiments with seven state-of-the-art competitors on 14 benchmark Bayesian network datasets, and the experimental results demonstrate that HLCD significantly outperforms existing local causal discovery algorithms.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2412.19507

Country: Asia > China (0.47)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)

Learning to Forget: Bayesian Time Series Forecasting using Recurrent Sparse Spectrum Signature Gaussian Processes

Tóth, Csaba, Adachi, Masaki, Osborne, Michael A., Oberhauser, Harald

arXiv.org Machine Learning, Dec-27-2024

The signature kernel is a kernel between time series of arbitrary length and comes with strong theoretical guarantees from stochastic analysis. It has found applications in machine learning, such as covariance functions for Gaussian processes. A strength of the underlying signature features is that they provide a structured global description of a time series. However, this property can quickly become a curse when local information is essential and forgetting is required; so far this has only been addressed with ad-hoc methods such as slicing the time series into subsegments. To overcome this, we propose a principled, data-driven approach by introducing a novel forgetting mechanism for signatures. This allows the model to dynamically adapt its context length to focus on more recent information. To achieve this, we revisit the recently introduced Random Fourier Signature Features, and develop Random Fourier Decayed Signature Features (RFDSF) with Gaussian processes (GPs). This results in a scalable Bayesian time series forecasting algorithm with variational inference that processes and transforms a time series into a joint predictive distribution over time steps in one pass using recurrence. For example, it processes a sequence of $10^4$ steps in $\approx 10^{-2}$ seconds using $< 1\,\text{GB}$ of GPU memory. We demonstrate that it outperforms other GP-based alternatives and competes with state-of-the-art probabilistic time series forecasting algorithms.

artificial intelligence, data mining, machine learning, (20 more...)

2412.19727

Country:

Asia (0.67)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry:

Energy > Renewable > Solar (0.68)
Health & Medicine (0.67)
Energy > Power Industry (0.67)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Solving High-dimensional Inverse Problems Using Amortized Likelihood-free Inference with Noisy and Incomplete Data

Zeng, Jice, Wang, Yuanzhe, Tartakovsky, Alexandre M., Barajas-Solano, David

arXiv.org Artificial Intelligence, Dec-26-2024

We present a likelihood-free probabilistic inversion method based on normalizing flows for high-dimensional inverse problems. The proposed method is composed of two complementary networks: a summary network for data compression and an inference network for parameter estimation. The summary network encodes raw observations into a fixed-size vector of summary features, while the inference network generates samples of the approximate posterior distribution of the model parameters based on these summary features. The posterior samples are produced in a deep generative fashion by sampling from a latent Gaussian distribution and passing these samples through an invertible transformation. We construct this invertible transformation by sequentially alternating conditional invertible neural network and conditional neural spline flow layers. The summary and inference networks are trained simultaneously. We apply the proposed method to an inversion problem in groundwater hydrology to estimate the posterior distribution of the log-conductivity field conditioned on spatially sparse time-series observations of the system's hydraulic head responses. The conductivity field is represented with 706 degrees of freedom in the considered problem. A comparison with the likelihood-based iterative ensemble smoother PEST-IES method demonstrates that the proposed method accurately estimates the parameter posterior distribution and the observations' predictive posterior distribution at a fraction of the inference time of PEST-IES.
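The sampling step described in the abstract above (draw from a latent Gaussian, then push the draw through an invertible map conditioned on summary features) can be sketched as follows. This is not the authors' architecture: a single conditional affine layer stands in for the alternating invertible-network and spline-flow layers, and the `conditioner` function, its formulas, and the summary values are hypothetical placeholders for a trained network.

```python
import math
import random


def conditioner(summary):
    """Hypothetical stand-in for a trained network: maps summary features
    to a per-dimension shift and log-scale for a two-parameter model."""
    mu = [sum(summary), summary[0] - summary[1]]
    log_sigma = [0.1 * summary[0], -0.2 * summary[1]]
    return mu, log_sigma


def conditional_affine_forward(z, mu, log_sigma):
    """Map a latent sample z to a parameter sample theta."""
    return [m + math.exp(s) * zi for zi, m, s in zip(z, mu, log_sigma)]


def conditional_affine_inverse(theta, mu, log_sigma):
    """Recover the latent sample z from theta (exact inverse of forward)."""
    return [(t - m) * math.exp(-s) for t, m, s in zip(theta, mu, log_sigma)]


rng = random.Random(0)
summary = [0.5, -1.0]  # illustrative summary features
mu, log_sigma = conditioner(summary)

# Approximate posterior samples: latent Gaussian draws pushed through the flow.
samples = []
for _ in range(1000):
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    samples.append(conditional_affine_forward(z, mu, log_sigma))
```

Invertibility is what allows such a map to be trained by maximum likelihood on the latent side; an affine layer is the simplest map with that property.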

artificial intelligence, bayesian inference, machine learning, (20 more...)

2412.04565

Country: North America > United States > Illinois (0.28)

Genre:

Research Report (1.00)
Workflow (0.68)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Assessing Pre-trained Models for Transfer Learning through Distribution of Spectral Components

Zhang, Tengxue, Shu, Yang, Chen, Xinyang, Long, Yifei, Guo, Chenjuan, Yang, Bin

arXiv.org Artificial Intelligence, Dec-26-2024

Pre-trained model assessment for transfer learning aims to identify the optimal candidate for downstream tasks from a model hub, without the need for time-consuming fine-tuning. Existing advanced works mainly focus on analyzing the intrinsic characteristics of the full set of features extracted by each pre-trained model or how well such features fit the target labels. This paper proposes a novel perspective for pre-trained model assessment through the Distribution of Spectral Components (DISCO). Through singular value decomposition of features extracted from pre-trained models, we investigate different spectral components and observe that they possess distinct transferability, contributing diversely to the fine-tuning performance. Inspired by this, we propose an assessment method based on the distribution of spectral components, which measures the proportions of their corresponding singular values. Pre-trained models with features concentrating on more transferable components are regarded as better choices for transfer learning. We further leverage the labels of downstream data to better estimate the transferability of each spectral component and derive the final assessment criterion. Our proposed method is flexible and can be applied to both classification and regression tasks. We conducted comprehensive experiments across three benchmarks and two tasks, including image classification and object detection, demonstrating that our method achieves state-of-the-art performance in choosing proper pre-trained models from the model hub for transfer learning.

artificial intelligence, machine learning, spectral component, (18 more...)

2412.19085

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Asymptotically Optimal Search for a Change Point Anomaly under a Composite Hypothesis Model

Didi, Liad Lea, Gafni, Tomer, Cohen, Kobi

arXiv.org Machine Learning, Dec-26-2024

We address the problem of searching for a change point in an anomalous process among a finite set of M processes. Specifically, we address a composite hypothesis model in which each process generates measurements following a common distribution with an unknown parameter (vector). This parameter belongs to either a normal or abnormal space depending on the current state of the process. Before the change point, all processes, including the anomalous one, are in a normal state; after the change point, the anomalous process transitions to an abnormal state. Our goal is to design a sequential search strategy that minimizes the Bayes risk by balancing sample complexity and detection accuracy. We propose a deterministic search algorithm with the following notable properties. First, we analytically demonstrate that when the distributions of both normal and abnormal processes are unknown, the algorithm is asymptotically optimal in minimizing the Bayes risk as the error probability approaches zero. In the second setting, where the parameter under the null hypothesis is known, the algorithm achieves asymptotic optimality with improved detection time based on the true normal state. Simulation results are presented to validate the theoretical findings.

artificial intelligence, detection, machine learning, (17 more...)

2412.19392

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Provable Uncertainty Decomposition via Higher-Order Calibration

Ahdritz, Gustaf, Gollakota, Aravind, Gopalan, Parikshit, Peale, Charlotte, Wieder, Udi

arXiv.org Machine Learning, Dec-25-2024

We give a principled method for decomposing the predictive uncertainty of a model into aleatoric and epistemic components with explicit semantics relating them to the real-world data distribution. While many works in the literature have proposed such decompositions, they lack the type of formal guarantees we provide. Our method is based on the new notion of higher-order calibration, which generalizes ordinary calibration to the setting of higher-order predictors that predict mixtures over label distributions at every point. We show how to measure as well as achieve higher-order calibration using access to $k$-snapshots, namely examples where each point has $k$ independent conditional labels. Under higher-order calibration, the estimated aleatoric uncertainty at a point is guaranteed to match the real-world aleatoric uncertainty averaged over all points where the prediction is made. To our knowledge, this is the first formal guarantee of this type that places no assumptions whatsoever on the real-world data distribution. Importantly, higher-order calibration is also applicable to existing higher-order predictors such as Bayesian and ensemble models and provides a natural evaluation metric for such models. We demonstrate through experiments that our method produces meaningful uncertainty decompositions for image classification.
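To make the aleatoric/epistemic split in the abstract above concrete, the sketch below applies the common entropy-based decomposition to a higher-order prediction, that is, a mixture over label distributions such as the members of an ensemble. This particular formula is an assumption chosen for illustration and is not claimed to be the paper's calibrated estimator; the function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def decompose_uncertainty(mixture, weights=None):
    """Split predictive uncertainty for a mixture over label distributions.

    total     = entropy of the averaged prediction
    aleatoric = average entropy of the mixture components
    epistemic = total - aleatoric (disagreement between components)
    """
    n = len(mixture)
    if weights is None:
        weights = [1.0 / n] * n  # uniform mixture, e.g. an equally weighted ensemble
    num_classes = len(mixture[0])
    mean_pred = [
        sum(w * comp[c] for w, comp in zip(weights, mixture))
        for c in range(num_classes)
    ]
    total = entropy(mean_pred)
    aleatoric = sum(w * entropy(comp) for w, comp in zip(weights, mixture))
    return total, aleatoric, total - aleatoric


# Components that agree on a noisy label: the uncertainty is all aleatoric.
agree = decompose_uncertainty([[0.7, 0.3], [0.7, 0.3]])

# Components that are each certain but contradict each other: all epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

The two examples bracket the extremes: identical components leave no epistemic term, while confident but contradictory components leave no aleatoric term.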

artificial intelligence, calibration, machine learning, (19 more...)

2412.18808

Country: Europe (0.45)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

PhyloGen: Language Model-Enhanced Phylogenetic Inference via Graph Structure Generation

Duan, ChenRui, Zang, Zelin, Li, Siyuan, Xu, Yongjie, Li, Stan Z.

arXiv.org Artificial Intelligence, Dec-25-2024

Phylogenetic trees elucidate evolutionary relationships among species, but phylogenetic inference remains challenging due to the complexity of combining continuous (branch lengths) and discrete parameters (tree topology). Traditional Markov Chain Monte Carlo methods face slow convergence and computational burdens. Existing Variational Inference methods, which require pre-generated topologies and typically treat tree structures and branch lengths independently, may overlook critical sequence features, limiting their accuracy and flexibility. We propose PhyloGen, a novel method leveraging a pre-trained genomic language model to generate and optimize phylogenetic trees without dependence on evolutionary models or aligned sequence constraints. PhyloGen views phylogenetic inference as a conditionally constrained tree structure generation problem, jointly optimizing tree topology and branch lengths through three core modules: (i) Feature Extraction, (ii) PhyloTree Construction, and (iii) PhyloTree Structure Modeling. Meanwhile, we introduce a Scoring Function to guide the model towards a more stable gradient descent. We demonstrate the effectiveness and robustness of PhyloGen on eight real-world benchmark datasets. Visualization results confirm PhyloGen provides deeper insights into phylogenetic relationships.

artificial intelligence, machine learning, tree topology, (20 more...)

2412.18827

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bayesian Critique-Tune-Based Reinforcement Learning with Adaptive Pressure for Multi-Intersection Traffic Signal Control

Duan, Wenchang, Gao, Zhenguo, He, Jiwan, Xian, Jinguo

arXiv.org Artificial Intelligence, Dec-25-2024

The Adaptive Traffic Signal Control (ATSC) system is a critical component of intelligent transportation, with the capability to significantly alleviate urban traffic congestion. Although reinforcement learning (RL)-based methods have demonstrated promising performance in achieving ATSC, existing methods are still prone to making unreasonable policies. Therefore, this paper proposes a novel Bayesian Critique-Tune-Based Reinforcement Learning with Adaptive Pressure for multi-intersection signal control (BCT-APLight). In BCT-APLight, the Critique-Tune (CT) framework, a two-layer Bayesian structure, is designed to temper excessive trust in RL policies. Specifically, the Bayesian inference-based Critique Layer provides effective evaluations of the credibility of policies; the Bayesian decision-based Tune Layer fine-tunes policies by minimizing the posterior risks when the evaluations are negative. Meanwhile, an attention-based Adaptive Pressure (AP) mechanism is designed to effectively weight the vehicle queues in each lane, thereby enhancing the rationality of traffic movement representation within the network. Equipped with the CT framework and AP mechanism, BCT-APLight effectively enhances the reasonableness of RL policies. Extensive experiments conducted with a simulator across a range of intersection layouts demonstrate that BCT-APLight is superior to other state-of-the-art (SOTA) methods on seven real-world datasets. Specifically, BCT-APLight decreases average queue length by 9.60% and average waiting time by 15.28%.
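The abstract above states only that the Tune Layer minimizes posterior risk. As a generic illustration of that principle (not the BCT-APLight implementation), the sketch below updates a belief with Bayes' rule and picks the action with the lowest expected loss. The two states, the two actions, and all probabilities and losses are hypothetical.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite set of states: posterior is prior times likelihood, normalized."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]


def min_risk_action(post, loss):
    """Return the action with the smallest posterior expected loss, plus all risks."""
    risks = {
        action: sum(p * l for p, l in zip(post, row))
        for action, row in loss.items()
    }
    return min(risks, key=risks.get), risks


# Hypothetical states: the RL policy's proposed action is [reasonable, unreasonable].
prior = [0.7, 0.3]
# Hypothetical likelihood of the observed traffic evidence under each state.
likelihood = [0.2, 0.9]
post = posterior(prior, likelihood)

# Hypothetical loss of each response under each state.
loss = {
    "keep": [0.0, 1.0],  # keeping an unreasonable action is costly
    "tune": [0.3, 0.1],  # tuning a reasonable action has a small cost
}
best_action, risks = min_risk_action(post, loss)
```

With these made-up numbers the evidence shifts the belief toward the "unreasonable" state, so tuning has the lower posterior risk.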

artificial intelligence, machine learning, reinforcement learning, (20 more...)

2412.16225

Country:

Asia > China (0.48)
North America > United States (0.47)

Genre:

Research Report (1.00)
Overview (0.68)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Machine Learning and Deep Learning Techniques used in Cybersecurity and Digital Forensics: a Review

Fattahi, Jaouhar

arXiv.org Artificial Intelligence, Dec-24-2024

In the fast-paced realms of cybersecurity and digital forensics, machine learning (ML) and deep learning (DL) have emerged as game-changing technologies that introduce methods to identify, stop, and analyze cyber risks. This review presents an overview of the ML and DL approaches used in these fields, showcasing their advantages, drawbacks, and possibilities. It covers a range of AI techniques used in spotting intrusions in systems and classifying malware to prevent cybersecurity attacks, detect anomalies, and enhance resilience. This study concludes by highlighting areas where further research is needed and suggesting ways to create transparent and scalable ML and DL solutions that are suited to the evolving landscape of cybersecurity and digital forensics.

algorithm, artificial intelligence, machine learning, (17 more...)

2501.0325

Country:

North America > Canada > Quebec > Montreal (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin (0.04)
(13 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
