AITopics

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Information Processing SystemsJan-18-2025, 08:27:55 GMT

Fine-Grained Zero-Shot Learning with DNA as Side Information

Fine-grained zero-shot learning task requires some form of side-information to transfer discriminative information from seen to unseen classes. As manually annotated visual attributes are extremely costly and often impractical to obtain for a large number of classes, in this study we use DNA as a side information for the first time for fine-grained zero-shot classification of species. Mitochondrial DNA plays an important role as a genetic marker in evolutionary biology and has been used to achieve near perfect accuracy in species classification of living organisms. We implement a simple hierarchical Bayesian model that uses DNA information to establish the hierarchy in the image space and employs local priors to define surrogate classes for unseen ones. On the benchmark CUB dataset we show that DNA can be equally promising, yet in general a more accessible alternative than word vectors as a side information.

fine-grained zero-shot learning, information, side information, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.42)

Neural Information Processing SystemsJan-18-2025, 05:40:06 GMT

Gaussian Process Volatility Model

The prediction of time-changing variances is an important task in the modeling of financial data. Standard econometric models are often limited as they assume rigid functional relationships for the evolution of the variance. Moreover, functional parameters are usually learned by maximum likelihood, which can lead to overfitting. To address these problems we introduce GP-Vol, a novel non-parametric model for time-changing variances based on Gaussian Processes. This new model can capture highly flexible functional relationships for the variances.

functional relationship, gaussian process volatility model, variance, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Dissertation Machine Learning in Materials Science -- A case study in Carbon Nanotube field effect transistors

Tan, Shulin

Carbon Nanotube has long been seen as a promising candidate for high-performance electronic material, yet its unique 1D structure leads to challenges in device fabrication. Many processing approaches have been proposed to produce better performing CNTFETs and this explosion of data needs an efficient way to explore.

artificial intelligence, machine learning, natural language, (18 more...)

2501.14813

Country: North America > United States > California (0.29)

Genre: Research Report > Experimental Study (0.45)

Industry:

Semiconductors & Electronics (1.00)
Materials (0.75)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Unfolding Tensors to Identify the Graph in Discrete Latent Bipartite Graphical Models

Gu, Yuqi

We use a tensor unfolding technique to prove a new identifiability result for discrete bipartite graphical models, which have a bipartite graph between an observed and a latent layer. This model family includes popular models such as Noisy-Or Bayesian networks for medical diagnosis and Restricted Boltzmann Machines in machine learning. These models are also building blocks for deep generative models. Our result on identifying the graph structure enjoys the following nice properties. First, our identifiability proof is constructive, in which we innovatively unfold the population tensor under the model into matrices and inspect the rank properties of the resulting matrices to uncover the graph. This proof itself gives a population-level structure learning algorithm that outputs both the number of latent variables and the bipartite graph. Second, we allow various forms of nonlinear dependence among the variables, unlike many continuous latent variable graphical models that rely on linearity to show identifiability. Third, our identifiability condition is interpretable, only requiring each latent variable to connect to at least two "pure" observed variables in the bipartite graph. The new result not only brings novel advances in algebraic statistics, but also has useful implications for these models' trustworthy applications in scientific disciplines and interpretable machine learning.

artificial intelligence, latent variable, machine learning, (20 more...)

2501.10897

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.54)

Industry:

Health & Medicine > Diagnostic Medicine (0.48)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Adaptive Target Localization under Uncertainty using Multi-Agent Deep Reinforcement Learning with Knowledge Transfer

Alagha, Ahmed, Mizouni, Rabeb, Singh, Shakti, Bentahar, Jamal, Otrok, Hadi

Target localization is a critical task in sensitive applications, where multiple sensing agents communicate and collaborate to identify the target location based on sensor readings. Existing approaches investigated the use of Multi-Agent Deep Reinforcement Learning (MADRL) to tackle target localization. Nevertheless, these methods do not consider practical uncertainties, like false alarms when the target does not exist or when it is unreachable due to environmental complexities. To address these drawbacks, this work proposes a novel MADRL-based method for target localization in uncertain environments. The proposed MADRL method employs Proximal Policy Optimization to optimize the decision-making of sensing agents, which is represented in the form of an actor-critic structure using Convolutional Neural Networks. The observations of the agents are designed in an optimized manner to capture essential information in the environment, and a team-based reward functions is proposed to produce cooperative agents. The MADRL method covers three action dimensionalities that control the agents' mobility to search the area for the target, detect its existence, and determine its reachability. Using the concept of Transfer Learning, a Deep Learning model builds on the knowledge from the MADRL model to accurately estimating the target location if it is unreachable, resulting in shared representations between the models for faster learning and lower computational complexity. Collectively, the final combined model is capable of searching for the target, determining its existence and reachability, and estimating its location accurately. The proposed method is tested using a radioactive target localization environment and benchmarked against existing methods, showing its efficacy.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

doi: 10.1016/j.iot.2024.101447

2501.10924

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Fierro, Juan Manuel Liscano, Hortua, Hector J.

Enhancing Diagnostic in 3D COVID-19 Pneumonia CT-scans through Explainable Uncertainty Bayesian Quantification

Accurately classifying COVID-19 pneumonia in 3D CT scans remains a significant challenge in the field of medical image analysis. Although deterministic neural networks have shown promising results in this area, they provide only point estimates outputs yielding poor diagnostic in clinical decision-making. In this paper, we explore the use of Bayesian neural networks for classifying COVID-19 pneumonia in 3D CT scans providing uncertainties in their predictions. We compare deterministic networks and their Bayesian counterpart, enhancing the decision-making accuracy under uncertainty information. Remarkably, our findings reveal that lightweight architectures achieve the highest accuracy of 96\% after developing extensive hyperparameter tuning. Furthermore, the Bayesian counterpart of these architectures via Multiplied Normalizing Flow technique kept a similar performance along with calibrated uncertainty estimates. Finally, we have developed a 3D-visualization approach to explain the neural network outcomes based on SHAP values. We conclude that explainability along with uncertainty quantification will offer better clinical decisions in medical image analysis, contributing to ongoing efforts for improving the diagnosis and treatment of COVID-19 pneumonia.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2501.1077

Country:

South America > Colombia > Bogotá D.C. > Bogotá (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Raj, Amogh, Gudumotou, Carol Eunice, Bun, Sakol, Srinivasa, Keerthana, Sarshar, Arash

Deep Operator Networks for Bayesian Parameter Estimation in PDEs

arXiv.org Machine LearningJan-18-2025

We present a novel framework combining Deep Operator Networks (DeepONets) with Physics-Informed Neural Networks (PINNs) to solve partial differential equations (PDEs) and estimate their unknown parameters. By integrating data-driven learning with physical constraints, our method achieves robust and accurate solutions across diverse scenarios. Bayesian training is implemented through variational inference, allowing for comprehensive uncertainty quantification for both aleatoric and epistemic uncertainties. This ensures reliable predictions and parameter estimates even in noisy conditions or when some of the physical equations governing the problem are missing. The framework demonstrates its efficacy in solving forward and inverse problems, including the 1D unsteady heat equation and 2D reaction-diffusion equations, as well as regression tasks with sparse, noisy observations. This approach provides a computationally efficient and generalizable method for addressing uncertainty quantification in PDE surrogate modeling.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Machine Learning

2501.10684

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Neural Information Processing SystemsJan-17-2025, 23:02:10 GMT

Mesoscopic modeling of hidden spiking neurons

Can we use spiking neural networks (SNN) as generative models of multi-neuronal recordings, while taking into account that most neurons are unobserved? Modeling the unobserved neurons with large pools of hidden spiking neurons leads to severely underconstrained problems that are hard to tackle with maximum likelihood estimation. In this work, we use coarse-graining and mean-field approximations to derive a bottom-up, neuronally-grounded latent variable model (neuLVM), where the activity of the unobserved neurons is reduced to a low-dimensional mesoscopic description. In contrast to previous latent variable models, neuLVM can be explicitly mapped to a recurrent, multi-population SNN, giving it a transparent biological interpretation. We show, on synthetic spike trains, that a few observed neurons are sufficient for neuLVM to perform efficient model inversion of large SNNs, in the sense that it can recover connectivity parameters, infer single-trial latent population activity, reproduce ongoing metastable dynamics, and generalize when subjected to perturbations mimicking optogenetic stimulation.

latent variable model, mesoscopic modeling, neuron, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.64)

Neural Information Processing SystemsJan-17-2025, 21:33:08 GMT

Conformal Bayesian Computation

We develop scalable methods for producing conformal Bayesian predictive intervals with finite sample calibration guarantees. Bayesian posterior predictive distributions, p(y \mid x), characterize subjective beliefs on outcomes of interest, y, conditional on predictors, x . Bayesian prediction is well-calibrated when the model is true, but the predictive intervals may exhibit poor empirical coverage when the model is misspecified, under the so called {\cal{M}} -open perspective. In contrast, conformal inference provides finite sample frequentist guarantees on predictive confidence intervals without the requirement of model fidelity. Using'add-one-in' importance sampling, we show that conformal Bayesian predictive intervals are efficiently obtained from re-weighted posterior samples of model parameters.

conformal bayesian computation, conformal bayesian predictive interval, predictive interval

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.75)