AITopics

Score distillation sampling has been pivotal for integrating diffusion models into the generation of complex visuals. Despite impressive results, it suffers from mode collapse and a lack of diversity. To cope with this challenge, we leverage the gradient flow interpretation of score distillation to propose Repulsive Score Distillation (RSD). In particular, we propose a variational framework based on repulsion of an ensemble of particles that promotes diversity. Using a variational approximation that incorporates a coupling among particles, the repulsion appears as a simple regularization that allows interaction of particles based on their relative pairwise similarity, measured, e.g., via radial basis kernels. We design RSD for both unconstrained and constrained sampling scenarios. For constrained sampling, we focus on inverse problems in the latent space, which leads to an augmented variational formulation that strikes a good balance between compute, quality, and diversity. Our extensive experiments for text-to-image generation and inverse problems demonstrate that RSD achieves a superior trade-off between diversity and quality compared with state-of-the-art alternatives.
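The abstract describes repulsion as an interaction between particles driven by pairwise similarity under a radial basis kernel. The following is a minimal sketch of that general idea only, not the paper's RSD update: the function name `rbf_repulsion`, the `bandwidth` parameter, and the choice of the negative kernel gradient as the repulsion direction are assumptions made for illustration.

```python
import numpy as np

def rbf_repulsion(particles, bandwidth=1.0):
    """Return the RBF kernel matrix and one repulsion vector per particle.

    particles: array of shape (n, d), one row per particle.
    The kernel is k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 * bandwidth^2)).
    The repulsion for particle i is the negative gradient, with respect to
    x_i, of sum_j k(x_i, x_j), which points away from nearby particles.
    (Hypothetical helper; not taken from the paper.)
    """
    diffs = particles[:, None, :] - particles[None, :, :]   # (n, n, d): x_i - x_j
    sq_dists = np.sum(diffs ** 2, axis=-1)                  # (n, n)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth ** 2))     # (n, n)
    # Weight each displacement by the similarity and sum over neighbours j.
    repulsion = np.sum(kernel[:, :, None] * diffs, axis=1) / bandwidth ** 2
    return kernel, repulsion

# Two particles on the x-axis: each is pushed away from the other.
points = np.array([[0.0, 0.0], [1.0, 0.0]])
kernel, repulsion = rbf_repulsion(points)
```

Because the kernel decays with distance, particles that are already far apart barely interact, so a term of this shape mainly separates near-duplicate samples.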

diffusion model, inverse problem, repulsion, (14 more...)

2406.16683

Country:

North America > United States (0.28)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.50)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Sun, Yuyang, Lei, Jingyu, Kosmas, Panagiotis

Exploring Biomarker Relationships in Both Type 1 and Type 2 Diabetes Mellitus Through a Bayesian Network Analysis Approach

Understanding the complex relationships of biomarkers in diabetes is pivotal for advancing treatment strategies, a pressing need in diabetes research. This study applies Bayesian network structure learning to analyze the Shanghai Type 1 and Type 2 diabetes mellitus datasets, revealing complex relationships among key diabetes-related biomarkers. The constructed Bayesian network presented notable predictive accuracy, particularly for Type 2 diabetes mellitus, with root mean squared error (RMSE) of 18.23 mg/dL, as validated through leave-one-domain experiments and Clarke error grid analysis. This study not only elucidates the intricate dynamics of diabetes through a deeper understanding of biomarker interplay but also underscores the significant potential of integrating data-driven and knowledge-driven methodologies in the realm of personalized diabetes management. Such an approach paves the way for more custom and effective treatment strategies, marking a notable advancement in the field.
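The RMSE quoted in the abstract is expressed in the same unit as the glucose readings (mg/dL). A minimal sketch of how such a figure is computed, using made-up values rather than data from the study; the function name `rmse` is hypothetical:

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Illustrative glucose values in mg/dL (not from the paper).
example_error = rmse([100.0, 120.0], [110.0, 110.0])
```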

bayesian network, biomarker, dataset, (16 more...)

2406.1709

Country:

Asia > China > Shanghai > Shanghai (0.25)
North America > United States > California > Orange County > Irvine (0.14)
Europe > United Kingdom > England > Greater London > London (0.05)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Datta, Jyotishka, Polson, Nicholas G.

Bayesian Deep ICE

Deep Independent Component Estimation (DICE) has many applications in modern-day machine learning as a feature engineering extraction method. We provide a novel latent variable representation of independent component analysis that enables both point estimates via expectation-maximization (EM) and full posterior sampling via Markov Chain Monte Carlo (MCMC) algorithms. Our methodology also applies to flow-based methods for nonlinear feature extraction. We discuss how to implement conditional posteriors and envelope-based methods for optimization. Through this representation hierarchy, we unify a number of hitherto disjoint estimation procedures. We illustrate our methodology and algorithms on a numerical example. Finally, we conclude with directions for future research.

algorithm, mackay, source distribution, (13 more...)

2406.17058

Country:

North America > United States > Virginia (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Beyond Demographics: Aligning Role-playing LLM-based Agents Using Human Belief Networks

Chuang, Yun-Shiuan, Studdiford, Zach, Nirunwiroj, Krirk, Goyal, Agam, Frigo, Vincent V., Yang, Sijia, Shah, Dhavan, Hu, Junjie, Rogers, Timothy T.

Creating human-like large language model (LLM) agents is crucial for faithful social simulation. Having LLMs role-play based on demographic information sometimes improves human likeness but often does not. This study assessed whether LLM alignment with human behavior can be improved by integrating information from empirically-derived human belief networks. Using data from a human survey, we estimated a belief network encompassing 18 topics loading on two non-overlapping latent factors. We then seeded LLM-based agents with an opinion on one topic, and assessed the alignment of its expressed opinions on remaining test topics with corresponding human data. Role-playing based on demographic information alone did not align LLM and human opinions, but seeding the agent with a single belief greatly improved alignment for topics related in the belief network, and not for topics outside the network. These results suggest a novel path for human-LLM belief alignment in work seeking to simulate and understand patterns of belief distributions in society.

agent, alignment, belief network, (14 more...)

2406.17232

Country:

Africa > Kenya (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Oregon (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Ricci, Federica Zoe, Sudderth, Erik B., Lee, Jaylen, Peters, Megan A. K., Vannucci, Marina, Guindani, Michele

Bayesian temporal biclustering with applications to multi-subject neuroscience studies

We consider the problem of analyzing multivariate time series collected on multiple subjects, with the goal of identifying groups of subjects exhibiting similar trends in their recorded measurements over time as well as time-varying groups of associated measurements. To this end, we propose a Bayesian model for temporal biclustering featuring nested partitions, where a time-invariant partition of subjects induces a time-varying partition of measurements. Our approach allows for data-driven determination of the number of subject and measurement clusters as well as estimation of the number and location of changepoints in measurement partitions. To efficiently perform model fitting and posterior estimation with Markov Chain Monte Carlo, we derive a blocked update of measurements' cluster-assignment sequences. We illustrate the performance of our model in two applications to functional magnetic resonance imaging data and to an electroencephalogram dataset. The results indicate that the proposed model can combine information from potentially many subjects to discover a set of interpretable, dynamic patterns. Experiments on simulated data compare the estimation performance of the proposed model against ground-truth values and other statistical methods, showing that it performs well at identifying ground-truth subject and measurement clusters even when no subject or time dependence is present.

partition, probability, sequence, (17 more...)

2406.17131

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Riverside County > Riverside (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

van Meegen, Alexander, Sompolinsky, Haim

Coding schemes in neural networks learning classification tasks

arXiv.org Machine Learning, Jun-24-2024

Neural networks possess the crucial ability to generate meaningful representations of task-dependent features. Indeed, with appropriate scaling, supervised learning in neural networks can result in strong, task-dependent feature learning. However, the nature of the emergent representations, which we call the 'coding scheme', is still unclear. To understand the emergent coding scheme, we investigate fully-connected, wide neural networks learning classification tasks using the Bayesian framework where learning shapes the posterior distribution of the network weights. Consistent with previous findings, our analysis of the feature learning regime (also known as 'non-lazy', 'rich', or 'mean-field' regime) shows that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity. In linear networks, an analog coding scheme of the task emerges. Despite the strong representations, the mean predictor is identical to the lazy case. In nonlinear networks, spontaneous symmetry breaking leads to either redundant or sparse coding schemes. Our findings highlight how network properties such as scaling of weights and neuronal nonlinearity can profoundly influence the emergent representations.

neuron, posterior, readout weight, (16 more...)

2406.16689

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine Learning, Jun-24-2024

Greedy equivalence search for nonparametric graphical models

Aragam, Bryon

One of the hallmark achievements of the theory of graphical models and Bayesian model selection is the celebrated greedy equivalence search (GES) algorithm due to Chickering and Meek. GES is known to consistently estimate the structure of directed acyclic graph (DAG) models in various special cases including Gaussian and discrete models, which are in particular curved exponential families. A general theory that covers general nonparametric DAG models, however, is missing. Here, we establish the consistency of greedy equivalence search for general families of DAG models that satisfy smoothness conditions on the Markov factorization, and hence may not be curved exponential families, or even parametric. The proof leverages recent advances in nonparametric Bayes to construct a test for comparing misspecified DAG models that avoids arguments based on the Laplace approximation. Nonetheless, when the Laplace approximation is valid and a consistent scoring function exists, we recover the classical result. As a result, we obtain a general consistency theorem for GES applied to general DAG models.

algorithm, consistency, graphical model, (13 more...)

2406.17228

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.91)

arXiv.org Machine Learning, Jun-24-2024

Towards Bayesian Data Selection

Rodemann, Julian

A wide range of machine learning algorithms iteratively add data to the training sample. Examples include semi-supervised learning, active learning, multi-armed bandits, and Bayesian optimization. We embed this kind of data addition into decision theory by framing data selection as a decision problem. This paves the way for finding Bayes-optimal selections of data. For the illustrative case of self-training in semi-supervised learning, we derive the respective Bayes criterion. We further show that deploying this criterion mitigates the issue of confirmation bias by empirically assessing our method for generalized linear models, semi-parametric generalized additive models, and Bayesian neural networks on simulated and real-world data.

learning, selection, semi-supervised learning, (15 more...)

2406.1256

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Kirk, Paul D. W., Rao, Jackie

VICatMix: variational Bayesian clustering and variable selection for discrete biomedical data

arXiv.org Machine Learning, Jun-23-2024

Effective clustering of biomedical data is crucial in precision medicine, enabling accurate stratification of patients or samples. However, the growth in availability of high-dimensional categorical data, including 'omics data, necessitates computationally efficient clustering algorithms. We present VICatMix, a variational Bayesian finite mixture model designed for the clustering of categorical data. The use of variational inference (VI) in its training allows the model to outperform competitors in terms of efficiency, while maintaining high accuracy. VICatMix furthermore performs variable selection, enhancing its performance on high-dimensional, noisy data. The proposed model incorporates summarisation and model averaging to mitigate poor local optima in VI, allowing for improved estimation of the true number of clusters simultaneously with feature saliency. We demonstrate the performance of VICatMix with both simulated and real-world data, including applications to datasets from The Cancer Genome Atlas (TCGA), showing its use in cancer subtyping and driver gene discovery. We demonstrate VICatMix's utility in integrative cluster analysis with different 'omics datasets, enabling the discovery of novel subtypes. Availability: VICatMix is freely available as an R package, incorporating C++ for faster computation, at https://github.com/j-ackierao/VICatMix.

issn, variable selection, vicatmix, (14 more...)

2406.16227

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Hematology (0.93)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Elahi, Muhammad Qasim, Wei, Lai, Kocaoglu, Murat, Ghasemi, Mahsa

Adaptive Online Experimental Design for Causal Discovery

arXiv.org Artificial Intelligence, Jun-22-2024

Causal discovery aims to uncover cause-and-effect relationships encoded in causal graphs by leveraging observational data, interventional data, or their combination. The majority of existing causal discovery methods are developed assuming infinite interventional data. We focus on data interventional efficiency and formalize causal discovery from the perspective of online learning, inspired by pure exploration in bandit problems. A graph separating system, consisting of interventions that cut every edge of the graph at least once, is sufficient for learning causal graphs when infinite interventional data is available, even in the worst case. We propose a track-and-stop causal discovery algorithm that adaptively selects interventions from the graph separating system via allocation matching and learns the causal graph based on sampling history. Given any desired confidence value, the algorithm determines a termination condition and runs until it is met. We analyze the algorithm to establish a problem-dependent upper bound on the expected number of required interventional samples. Our proposed algorithm outperforms existing methods in simulations across various randomly generated causal graphs. It achieves higher accuracy, measured by the structural Hamming distance (SHD) between the learned causal graph and the ground truth, with significantly fewer samples.
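The accuracy metric named in the abstract, the structural Hamming distance, can be sketched as follows. This assumes one common convention in which a missing edge, an extra edge, and a reversed edge each count as one error; the abstract does not say which convention the authors use, and the function name and adjacency-matrix encoding are illustrative.

```python
import numpy as np

def structural_hamming_distance(learned, truth):
    """Count node pairs whose edge status differs between two directed graphs.

    learned, truth: square 0/1 adjacency matrices where entry [i, j] = 1
    means an edge i -> j. A reversed edge is counted once, not twice.
    (Assumed convention; other definitions count a reversal as two errors.)
    """
    learned = np.asarray(learned, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if learned.shape != truth.shape or learned.shape[0] != learned.shape[1]:
        raise ValueError("adjacency matrices must be square and of equal shape")
    differs = learned != truth
    # A pair {i, j} is wrong if either direction differs between the graphs.
    pair_differs = differs | differs.T
    # Count each unordered pair once by looking only above the diagonal.
    return int(np.triu(pair_differs, k=1).sum())

# Ground truth: 0 -> 1 -> 2. Learned: 1 -> 0 (reversed), 1 -> 2, plus extra 0 -> 2.
truth_graph = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
learned_graph = [[0, 0, 1], [1, 0, 1], [0, 0, 0]]
distance = structural_hamming_distance(learned_graph, truth_graph)
```

Counting over unordered pairs keeps a single orientation mistake from being penalized as heavily as two independent edge errors.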

adaptive online experimental design, algorithm, intervention, (11 more...)

2405.11548

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Michigan (0.04)

Genre:

Research Report > Experimental Study (0.67)
Research Report > Strength High (0.46)

Industry: Education (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)