Interpretable Generative and Discriminative Learning for Multimodal and Incomplete Clinical Data
Belenguer-Llorens, Albert, Sevilla-Salcedo, Carlos, Mourao-Miranda, Janaina, Gómez-Verdejo, Vanessa
Real-world clinical problems are often characterized by multimodal data, usually associated with incomplete views and limited sample sizes in their cohorts, posing significant limitations for machine learning algorithms. In this work, we propose a Bayesian approach designed to efficiently handle these challenges while providing interpretable solutions. Our approach integrates (1) a generative formulation to capture cross-view relationships with a semi-supervised strategy, and (2) a discriminative task-oriented formulation to identify relevant information for specific downstream objectives. This dual generative-discriminative formulation offers both general understanding and task-specific insights; thus, it provides automatic imputation of missing views while enabling robust inference across different data sources. The potential of this approach becomes evident when applied to multimodal clinical data, where our algorithm is able to capture and disentangle the complex interactions among biological, psychological, and sociodemographic modalities.
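As a toy illustration of the cross-view imputation idea (not the authors' Bayesian model), the sketch below generates two views from a shared latent factor and fills in a missing view with a cross-view regression fitted on complete cases; all dimensions, noise levels, and the regression stand-in are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multimodal data: two views driven by a shared 2-D latent factor.
n, d_latent, d1, d2 = 500, 2, 5, 4
Z = rng.normal(size=(n, d_latent))
W1 = rng.normal(size=(d_latent, d1))
W2 = rng.normal(size=(d_latent, d2))
X1 = Z @ W1 + 0.1 * rng.normal(size=(n, d1))
X2 = Z @ W2 + 0.1 * rng.normal(size=(n, d2))

# Pretend view 2 is missing for the last 100 subjects.
obs = slice(0, 400)
mis = slice(400, 500)

# A cross-view regression learned on complete cases stands in for the
# generative model's posterior-mean imputation of the missing view.
B, *_ = np.linalg.lstsq(X1[obs], X2[obs], rcond=None)
X2_hat = X1[mis] @ B

rmse = np.sqrt(np.mean((X2_hat - X2[mis]) ** 2))
baseline = np.sqrt(np.mean((X2[mis] - X2[obs].mean(axis=0)) ** 2))
print(rmse < baseline)  # cross-view imputation beats a mean-fill baseline
```

Because both views share the latent factor, the observed view carries most of the information needed to reconstruct the missing one; a full Bayesian treatment would additionally propagate imputation uncertainty.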
Incorporating Expert Knowledge into Bayesian Causal Discovery of Mixtures of Directed Acyclic Graphs
Björkman, Zachris, Loría, Jorge, Wharrie, Sophie, Kaski, Samuel
Bayesian causal discovery benefits from prior information elicited from domain experts, and in heterogeneous domains prior knowledge is especially needed. However, prior elicitation approaches to date have assumed a single causal graph and hence are not suited to heterogeneous domains. We propose a causal elicitation strategy for heterogeneous settings, based on Bayesian experimental design (BED) principles, and a variational mixture structure learning (VaMSL) method -- extending the earlier differentiable Bayesian structure learning (DiBS) method -- to iteratively infer mixtures of causal Bayesian networks (CBNs). We construct an informative graph prior incorporating elicited expert feedback in the inference of mixtures of CBNs. Our proposed method successfully produces a set of alternative causal models (mixture components or clusters), and achieves an improved structure learning performance on heterogeneous synthetic data when informed by a simulated expert. Finally, we demonstrate that our approach is capable of capturing complex distributions in a breast cancer database.
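As a generic sketch of the BED principle invoked here (an assumption about the setup, not necessarily the paper's exact criterion), a standard choice selects the expert query that maximizes expected information gain about the posterior over structures:

```latex
% Expected-information-gain design over the posterior on structures G,
% where \xi indexes candidate expert queries and y the expert's answer:
\xi^{*} \;=\; \arg\max_{\xi}\;
\mathbb{E}_{y \sim p(y \mid \xi)}
\big[\, H\!\left[p(G)\right] \;-\; H\!\left[p(G \mid y,\, \xi)\right] \,\big]
```

In the mixture setting, $p(G)$ would be replaced by a distribution over mixtures of causal graphs, making each query informative about cluster structure as well as edges.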
A Concise Mathematical Description of Active Inference in Discrete Time
van Oostrum, Jesse, Langer, Carlotta, Ay, Nihat
Active inference is a theory that describes the behavior (action selection mechanism) of an agent in an environment. We aim to present a concise mathematical description of the theory so that a reader interested in the mathematical details can quickly find what they are looking for. We have paid special attention to choosing notation that is more in line with standard mathematical texts and is also descriptive, in the sense that dependencies are made explicit. The aim of this paper is not to justify the theory or convince the reader that this is the right theory. The paper is divided into a main text and an appendix. The main text aims to present a clear and simple picture of active inference in discrete time that is accessible to people new to the topic. It is further subdivided into an inference part, which assumes the existence of a generative model, a learning part, in which we discuss how the agent can learn this model, and an example, illustrating the action selection mechanism. In the appendix the more subtle details and derivations are discussed. This part is aimed at people who have already studied the active inference literature but struggle to make sense of the mathematical details.
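For orientation, one common discrete-time form of the expected free energy that drives action selection is the following (the notation here is a generic choice and may differ from the paper's):

```latex
% Expected free energy of a policy \pi, summed over future time steps \tau:
G(\pi) \;=\; \sum_{\tau} \mathbb{E}_{q(o_\tau,\, s_\tau \mid \pi)}
\big[\, \ln q(s_\tau \mid \pi) \;-\; \ln p(o_\tau,\, s_\tau) \,\big]
% Policies are then selected via a softmax with precision \gamma:
\qquad q(\pi) \;=\; \sigma\!\big( -\gamma\, G(\pi) \big)
```

Here $q$ is the agent's approximate posterior under the generative model $p$; lower expected free energy makes a policy more probable.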
Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck
Martini, K. Michael, Nemenman, Ilya
The Symmetric Information Bottleneck (SIB), an extension of the more familiar Information Bottleneck, is a dimensionality reduction technique that simultaneously compresses two random variables to preserve information between their compressed versions. We introduce the Generalized Symmetric Information Bottleneck (GSIB), which explores different functional forms of the cost of such simultaneous reduction. We then explore the dataset size requirements of such simultaneous compression. We do this by deriving bounds and root-mean-squared estimates of statistical fluctuations of the involved loss functions. We show that, in typical situations, the simultaneous GSIB compression requires qualitatively less data to achieve the same errors compared to compressing variables one at a time. We suggest that this is an example of a more general principle that simultaneous compression is more data efficient than independent compression of each of the input variables.
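For reference, the SIB objective being generalized is often written as a Lagrangian of the following form (a standard formulation, not necessarily the paper's exact notation; $\beta_X$, $\beta_Y$ are compression trade-off parameters):

```latex
% Symmetric IB: maximize relevance between the two compressed variables
% while penalizing the two compression costs:
\max_{q(z_X \mid x),\; q(z_Y \mid y)} \;\;
I(Z_X;\, Z_Y) \;-\; \beta_X\, I(X;\, Z_X) \;-\; \beta_Y\, I(Y;\, Z_Y)
```

The GSIB, as described in the abstract, replaces the linear compression penalties with more general functional forms of these mutual-information costs.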
Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability
Chen, Wei-Rui, Adebara, Ife, Doan, Khai Duy, Liao, Qisheng, Abdul-Mageed, Muhammad
Recently, ChatGPT has emerged as a powerful NLP tool that can carry out several tasks. However, the range of languages ChatGPT can handle remains largely a mystery. In this work, we investigate ChatGPT's language identification abilities. For this purpose, we compile Babel-670, a benchmark comprising $670$ languages representing $23$ language families. Languages in Babel-670 run the gamut between the very high-resource to the very low-resource and are spoken in five continents. We then study ChatGPT's (both GPT-3.5 and GPT-4) ability to (i) identify both language names and language codes (ii) under both zero- and few-shot conditions (iii) with and without provision of a label set. When compared to smaller finetuned language identification tools, we find that ChatGPT lags behind. Our empirical analysis suggests that ChatGPT still requires substantial improvement before it can sufficiently serve diverse language communities.
From Graph Generation to Graph Classification
The graph classification task is to assign a discrete class label to an input graph. The dominant approach for neural graph classification is to compute an embedding for the input graph and perform the final classification in embedding space. The successful graph coarsening approach aggregates graph structural information at successively lower resolutions until a final embedding is obtained. Another direction for graph learning, so far unrelated, is graph generation. A graph generative model (GGM) aims to generate realistic graphs, often by sampling from a distribution over graphs. GGMs include the graph Variational Auto-Encoder (GVAE), auto-regressive methods, and most recently graph diffusion models.
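The embed-then-classify pipeline described above can be caricatured with a hand-built two-dimensional embedding (degree statistics standing in for a learned coarsening) and a nearest-centroid classifier; the graph families and the embedding are illustrative assumptions, not any method from the paper:

```python
import numpy as np

# Toy embedding-based graph classification: map each graph to a fixed-size
# embedding and classify in embedding space. Real methods learn the
# embedding, e.g. by successive graph coarsening.

def cycle_graph(n):
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return A

def star_graph(n):
    A = np.zeros((n, n), dtype=int)
    A[0, 1:] = A[1:, 0] = 1
    return A

def embed(A):
    deg = A.sum(axis=1)
    return np.array([deg.mean(), deg.std()])  # crude 2-D graph embedding

# Nearest-centroid classification in embedding space (class 0: cycles,
# class 1: stars), fitted on one example of each size-6 graph.
train = [(cycle_graph(6), 0), (star_graph(6), 1)]
centroids = {y: embed(A) for A, y in train}

def classify(A):
    e = embed(A)
    return min(centroids, key=lambda y: np.linalg.norm(e - centroids[y]))

print(classify(cycle_graph(9)), classify(star_graph(9)))  # → 0 1
```

The point of the toy is only that classification happens entirely in embedding space; a GGM-based approach would instead exploit a generative model over graphs for the same task.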
Convergence for score-based generative modeling with polynomial complexity
Lee, Holden, Lu, Jianfeng, Tan, Yixin
Score-based generative modeling (SGM) is a highly successful approach for learning a probability distribution from data and generating further samples. We prove the first polynomial convergence guarantees for the core mechanic behind SGM: drawing samples from a probability density $p$ given a score estimate (an estimate of $\nabla \ln p$) that is accurate in $L^2(p)$. Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality. Our guarantee works for any smooth distribution and depends polynomially on its log-Sobolev constant. Using our guarantee, we give a theoretical analysis of score-based generative modeling, which transforms white-noise input into samples from a learned data distribution given score estimates at different noise scales. Our analysis gives theoretical grounding to the observation that an annealed procedure is required in practice to generate good samples, as our proof depends essentially on using annealing to obtain a warm start at each step. Moreover, we show that a predictor-corrector algorithm gives better convergence than using either portion alone.
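The annealed sampling mechanic analyzed in this work can be sketched with Langevin dynamics on a toy target whose score is known in closed form; the Gaussian target, noise schedule, and step sizes below are illustrative assumptions, with the exact score standing in for a learned score estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: standard normal. Data ~ N(0, 1) perturbed by N(0, sigma^2) noise
# has density N(0, 1 + sigma^2), so the exact perturbed score is known.
# A learned score network would replace this function in practice.
def score(x, sigma):
    return -x / (1.0 + sigma ** 2)

# Annealed Langevin dynamics: run Langevin steps at decreasing noise
# scales, using samples from each level to warm-start the next.
x = rng.normal(scale=10.0, size=5000)      # deliberately bad initialization
sigmas = [5.0, 2.0, 1.0, 0.5, 0.1, 0.0]    # annealing schedule
for sigma in sigmas:
    eps = 0.1 * (1.0 + sigma ** 2)         # step size scaled to noise level
    for _ in range(100):
        x = x + 0.5 * eps * score(x, sigma) + np.sqrt(eps) * rng.normal(size=x.shape)

print(round(x.mean(), 2), round(x.std(), 2))  # close to (0, 1)
```

The warm start provided by each noise level is exactly the role annealing plays in the paper's analysis: without it, Langevin dynamics started far from the target would mix slowly.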
Accelerating Stochastic Probabilistic Inference
Recently, Stochastic Variational Inference (SVI) has become increasingly attractive thanks to its ability to find good posterior approximations of probabilistic models. It optimizes the variational objective with stochastic optimization, following noisy estimates of the natural gradient. However, almost all state-of-the-art SVI algorithms are based on first-order optimization and often suffer from poor convergence rates. In this paper, we bridge the gap between second-order methods and stochastic variational inference by proposing a second-order stochastic variational inference approach. In particular, we first derive the Hessian matrix of the variational objective. We then devise two numerical schemes to implement second-order SVI efficiently. Thorough empirical evaluations on both synthetic and real datasets back up both the effectiveness and efficiency of the proposed approach.
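For concreteness, a second-order update of the kind described might look as follows (a standard Newton-style sketch under generic notation, not necessarily the paper's exact scheme):

```latex
% Variational objective (ELBO) with variational parameters \lambda:
\mathcal{L}(\lambda) \;=\; \mathbb{E}_{q_\lambda(z)}
\big[\, \log p(x,\, z) \;-\; \log q_\lambda(z) \,\big]
% Stochastic Newton-style ascent with noisy estimates
% \hat{g}_t \approx \nabla_\lambda \mathcal{L}(\lambda_t) and
% \hat{H}_t \approx \nabla^2_\lambda \mathcal{L}(\lambda_t):
\lambda_{t+1} \;=\; \lambda_t \;-\; \rho_t\, \hat{H}_t^{-1}\, \hat{g}_t
```

When $\mathcal{L}$ is locally concave, $\hat{H}_t$ is negative definite and the step ascends the ELBO; the practical schemes in the paper would correspond to efficient ways of forming or applying $\hat{H}_t^{-1}$.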