joint analysis
$\mathtt{emuflow}$: Normalising Flows for Joint Cosmological Analysis
Mootoovaloo, Arrykrishna, García-García, Carlos, Alonso, David, Ruiz-Zapatero, Jaime
Given the growth in the variety and precision of astronomical datasets of interest for cosmology, the best cosmological constraints are invariably obtained by combining data from different experiments. At the likelihood level, one complication in doing so is the need to marginalise over large-dimensional parameter models describing the data of each experiment. These include both the relatively small number of cosmological parameters of interest and a large number of "nuisance" parameters. Sampling over the joint parameter space for multiple experiments can thus become a very computationally expensive operation. This can be significantly simplified if one could sample directly from the marginal cosmological posterior distribution of preceding experiments, depending only on the common set of cosmological parameters. In this paper, we show that this can be achieved by emulating marginal posterior distributions via normalising flows. The resulting trained normalising flow models can be used to efficiently combine cosmological constraints from independent datasets without increasing the dimensionality of the parameter space under study. We show that the method is able to accurately describe the posterior distribution of real cosmological datasets, as well as the joint distribution of different datasets, even when significant tension exists between experiments. The resulting joint constraints can be obtained in a fraction of the time it would take to combine the same datasets at the level of their likelihoods. We construct normalising flow models for a set of public cosmological datasets of general interests and make them available, together with the software used to train them, and to exploit them in cosmological parameter inference.
Joint Analysis of Single-Cell Data across Cohorts with Missing Modalities
Arriola, Marianne, Pan, Weishen, Zhou, Manqi, Zhang, Qiannan, Su, Chang, Wang, Fei
Joint analysis of multi-omic single-cell data across cohorts has significantly enhanced the comprehensive analysis of cellular processes. However, most of the existing approaches for this purpose require access to samples with complete modality availability, which is impractical in many real-world scenarios. In this paper, we propose (Single-Cell Cross-Cohort Cross-Category) integration, a novel framework that learns unified cell representations under domain shift without requiring full-modality reference samples. Our generative approach learns rich cross-modal and cross-domain relationships that enable imputation of these missing modalities. Through experiments on real-world multi-omic datasets, we demonstrate that offers a robust solution to single-cell tasks such as cell type clustering, cell type classification, and feature imputation.
Joint Analysis of Time-Evolving Binary Matrices and Associated Documents
We consider problems for which one has incomplete binary matrices that evolve with time (e.g., the votes of legislators on particular legislation, with each year characterized by a different such matrix). An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling.
Modelling stellar activity with Gaussian process regression networks
Camacho, J. D., Faria, J. P., Viana, P. T. P.
Stellar photospheric activity is known to limit the detection and characterisation of extra-solar planets. In particular, the study of Earth-like planets around Sun-like stars requires data analysis methods that can accurately model the stellar activity phenomena affecting radial velocity (RV) measurements. Gaussian Process Regression Networks (GPRNs) offer a principled approach to the analysis of simultaneous time-series, combining the structural properties of Bayesian neural networks with the non-parametric flexibility of Gaussian Processes. Using HARPS-N solar spectroscopic observations encompassing three years, we demonstrate that this framework is capable of jointly modelling RV data and traditional stellar activity indicators. Although we consider only the simplest GPRN configuration, we are able to describe the behaviour of solar RV data at least as accurately as previously published methods. We confirm the correlation between the RV and stellar activity time series reaches a maximum at separations of a few days, and find evidence of non-stationary behaviour in the time series, associated with an approaching solar activity minimum.
Joint Analysis of Time-Evolving Binary Matrices and Associated Documents
Wang, Eric, Liu, Dehong, Silva, Jorge, Carin, Lawrence, Dunson, David B.
We consider problems for which one has incomplete binary matrices that evolve with time (e.g., the votes of legislators on particular legislation, with each year characterized by a different such matrix). An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling.
Joint analysis of clinical risk factors and 4D cardiac motion for survival prediction using a hybrid deep learning network
Jin, Shihao, Savioli, Nicolò, de Marvao, Antonio, Dawes, Timothy JW, Gandy, Axel, Rueckert, Daniel, O'Regan, Declan P
In this work, a novel approach is proposed for joint analysis of high dimensional time-resolved cardiac motion features obtained from segmented cardiac MRI and low dimensional clinical risk factors to improve survival prediction in heart failure. Different methods are evaluated to find the optimal way to insert conventional covariates into deep prediction networks. Correlation analysis between autoencoder latent codes and covariate features is used to examine how these predictors interact. We believe that similar approaches could also be used to introduce knowledge of genetic variants to such survival networks to improve outcome prediction by jointly analysing cardiac motion traits with inheritable risk factors.
Tensor-Based Fusion of EEG and FMRI to Understand Neurological Changes in Schizophrenia
Acar, Evrim, Levin-Schwartz, Yuri, Calhoun, Vince D., Adalı, Tülay
Neuroimaging modalities such as functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) provide information about neurological functions in complementary spatiotemporal resolutions; therefore, fusion of these modalities is expected to provide better understanding of brain activity. In this paper, we jointly analyze fMRI and multi-channel EEG signals collected during an auditory oddball task with the goal of capturing brain activity patterns that differ between patients with schizophrenia and healthy controls. Rather than selecting a single electrode or matricizing the third-order tensor that can be naturally used to represent multi-channel EEG signals, we preserve the multi-way structure of EEG data and use a coupled matrix and tensor factorization (CMTF) model to jointly analyze fMRI and EEG signals. Our analysis reveals that (i) joint analysis of EEG and fMRI using a CMTF model can capture meaningful temporal and spatial signatures of patterns that behave differently in patients and controls, and (ii) these differences and the interpretability of the associated components increase by including multiple electrodes from frontal, motor and parietal areas, but not necessarily by including all electrodes in the analysis.
Joint Analysis of Time-Evolving Binary Matrices and Associated Documents
Wang, Eric, Liu, Dehong, Silva, Jorge, Carin, Lawrence, Dunson, David B.
We consider problems for which one has incomplete binary matrices that evolve with time (e.g., the votes of legislators on particular legislation, with each year characterized by a different such matrix). An objective of such analysis is to infer structure and inter-relationships underlying the matrices, here defined by latent features associated with each axis of the matrix. In addition, it is assumed that documents are available for the entities associated with at least one of the matrix axes. By jointly analyzing the matrices and documents, one may be used to inform the other within the analysis, and the model offers the opportunity to predict matrix values (e.g., votes) based only on an associated document (e.g., legislation). The research presented here merges two areas of machine-learning that have previously been investigated separately: incomplete-matrix analysis and topic modeling. The analysis is performed from a Bayesian perspective, with efficient inference constituted via Gibbs sampling. The framework is demonstrated by considering all voting data and available documents (legislation) during the 220-year lifetime of the United States Senate and House of Representatives.