Plotting

 Ćiprijanović, Aleksandra


SIDDA: SInkhorn Dynamic Domain Adaptation for Image Classification with Equivariant Neural Networks

arXiv.org Artificial Intelligence

Modern neural networks (NNs) often do not generalize well in the presence of a "covariate shift"; that is, in situations where the training and test data distributions differ, but the conditional distribution of classification labels remains unchanged. In such cases, NN generalization can be reduced to a problem of learning more domain-invariant features. Domain adaptation (DA) methods include a range of techniques aimed at achieving this; however, these methods have struggled with the need for extensive hyperparameter tuning, which then incurs significant computational costs. In this work, we introduce SIDDA, an out-of-the-box DA training algorithm built upon the Sinkhorn divergence, that can achieve effective domain alignment with minimal hyperparameter tuning and computational overhead. We demonstrate the efficacy of our method on multiple simulated and real datasets of varying complexity, including simple shapes, handwritten digits, and real astronomical observations. SIDDA is compatible with a variety of NN architectures, and it works particularly well in improving classification accuracy and model calibration when paired with equivariant neural networks (ENNs). We find that SIDDA enhances the generalization capabilities of NNs, achieving up to a $\approx40\%$ improvement in classification accuracy on unlabeled target data. We also study the efficacy of DA on ENNs with respect to the varying group orders of the dihedral group $D_N$, and find that the model performance improves as the degree of equivariance increases. Finally, we find that SIDDA enhances model calibration on both source and target data--achieving over an order of magnitude improvement in the ECE and Brier score. SIDDA's versatility, combined with its automated approach to domain alignment, has the potential to advance multi-dataset studies by enabling the development of highly generalizable models.


Neural Network Prediction of Strong Lensing Systems with Domain Adaptation and Uncertainty Quantification

arXiv.org Artificial Intelligence

Modeling strong gravitational lenses is computationally expensive for the complex data from modern and next-generation cosmic surveys. Deep learning has emerged as a promising approach for finding lenses and predicting lensing parameters, such as the Einstein radius. Mean-variance Estimators (MVEs) are a common approach for obtaining aleatoric (data) uncertainties from a neural network prediction. However, neural networks have not been demonstrated to perform well on out-of-domain target data successfully -- e.g., when trained on simulated data and applied to real, observational data. In this work, we perform the first study of the efficacy of MVEs in combination with unsupervised domain adaptation (UDA) on strong lensing data. The source domain data is noiseless, and the target domain data has noise mimicking modern cosmology surveys. We find that adding UDA to MVE increases the accuracy on the target data by a factor of about two over an MVE model without UDA. Including UDA also permits much more well-calibrated aleatoric uncertainty predictions. Advancements in this approach may enable future applications of MVE models to real observational data.


DeepUQ: Assessing the Aleatoric Uncertainties from two Deep Learning Methods

arXiv.org Artificial Intelligence

Assessing the quality of aleatoric uncertainty estimates from uncertainty quantification (UQ) deep learning methods is important in scientific contexts, where uncertainty is physically meaningful and important to characterize and interpret exactly. We systematically compare aleatoric uncertainty measured by two UQ techniques, Deep Ensembles (DE) and Deep Evidential Regression (DER). Our method focuses on both zero-dimensional (0D) and two-dimensional (2D) data, to explore how the UQ methods function for different data dimensionalities. We investigate uncertainty injected on the input and output variables and include a method to propagate uncertainty in the case of input uncertainty so that we can compare the predicted aleatoric uncertainty to the known values. We experiment with three levels of noise. The aleatoric uncertainty predicted across all models and experiments scales with the injected noise level. However, the predicted uncertainty is miscalibrated to $\rm{std}(\sigma_{\rm al})$ with the true uncertainty for half of the DE experiments and almost all of the DER experiments. The predicted uncertainty is the least accurate for both UQ methods for the 2D input uncertainty experiment and the high-noise level. While these results do not apply to more complex data, they highlight that further research on post-facto calibration for these methods would be beneficial, particularly for high-noise and high-dimensional settings.


Domain-Adaptive Neural Posterior Estimation for Strong Gravitational Lens Analysis

arXiv.org Artificial Intelligence

Modeling strong gravitational lenses is prohibitively expensive for modern and next-generation cosmic survey data. Neural posterior estimation (NPE), a simulation-based inference (SBI) approach, has been studied as an avenue for efficient analysis of strong lensing data. However, NPE has not been demonstrated to perform well on out-of-domain target data -- e.g., when trained on simulated data and then applied to real, observational data. In this work, we perform the first study of the efficacy of NPE in combination with unsupervised domain adaptation (UDA). The source domain is noiseless, and the target domain has noise mimicking modern cosmology surveys. We find that combining UDA and NPE improves the accuracy of the inference by 1-2 orders of magnitude and significantly improves the posterior coverage over an NPE model without UDA. We anticipate that this combination of approaches will help enable future applications of NPE models to real observational data.


Domain Adaptive Graph Neural Networks for Constraining Cosmological Parameters Across Multiple Data Sets

arXiv.org Artificial Intelligence

Deep learning models have been shown to outperform methods that rely on summary statistics, like the power spectrum, in extracting information from complex cosmological data sets. However, due to differences in the subgrid physics implementation and numerical approximations across different simulation suites, models trained on data from one cosmological simulation show a drop in performance when tested on another. Similarly, models trained on any of the simulations would also likely experience a drop in performance when applied to observational data. Training on data from two different suites of the CAMELS hydrodynamic cosmological simulations, we examine the generalization capabilities of Domain Adaptive Graph Neural Networks (DA-GNNs). By utilizing GNNs, we capitalize on their capacity to capture structured scale-free cosmological information from galaxy distributions. Moreover, by including unsupervised domain adaptation via Maximum Mean Discrepancy (MMD), we enable our models to extract domain-invariant features. We demonstrate that DA-GNN achieves higher accuracy and robustness on cross-dataset tasks (up to $28\%$ better relative error and up to almost an order of magnitude better $\chi^2$). Using data visualizations, we show the effects of domain adaptation on proper latent space data alignment. This shows that DA-GNNs are a promising method for extracting domain-independent cosmological information, a vital step toward robust deep learning for real cosmic survey data.


Neural Inference of Gaussian Processes for Time Series Data of Quasars

arXiv.org Artificial Intelligence

The study of quasar light curves poses two problems: inference of the power spectrum and interpolation of an irregularly sampled time series. A baseline approach to these tasks is to interpolate a time series with a Damped Random Walk (DRW) model, in which the spectrum is inferred using Maximum Likelihood Estimation (MLE). However, the DRW model does not describe the smoothness of the time series, and MLE faces many problems in terms of optimization and numerical precision. In this work, we introduce a new stochastic model that we call $\textit{Convolved Damped Random Walk}$ (CDRW). This model introduces a concept of smoothness to a DRW, which enables it to describe quasar spectra completely. We also introduce a new method of inference of Gaussian process parameters, which we call $\textit{Neural Inference}$. This method uses the powers of state-of-the-art neural networks to improve the conventional MLE inference technique. In our experiments, the Neural Inference method results in significant improvement over the baseline MLE (RMSE: $0.318 \rightarrow 0.205$, $0.464 \rightarrow 0.444$). Moreover, the combination of both the CDRW model and Neural Inference significantly outperforms the baseline DRW and MLE in interpolating a typical quasar light curve ($\chi^2$: $0.333 \rightarrow 0.998$, $2.695 \rightarrow 0.981$). The code is published on GitHub.


DeepAdversaries: Examining the Robustness of Deep Learning Models for Galaxy Morphology Classification

arXiv.org Artificial Intelligence

Data processing and analysis pipelines in cosmological survey experiments introduce data perturbations that can significantly degrade the performance of deep learning-based models. Given the increased adoption of supervised deep learning methods for processing and analysis of cosmological survey data, the assessment of data perturbation effects and the development of methods that increase model robustness are increasingly important. In the context of morphological classification of galaxies, we study the effects of perturbations in imaging data. In particular, we examine the consequences of using neural networks when training on baseline data and testing on perturbed data. We consider perturbations associated with two primary sources: 1) increased observational noise as represented by higher levels of Poisson noise and 2) data processing noise incurred by steps such as image compression or telescope errors as represented by one-pixel adversarial attacks. We also test the efficacy of domain adaptation techniques in mitigating the perturbation-driven errors. We use classification accuracy, latent space visualizations, and latent space distance to assess model robustness. Without domain adaptation, we find that processing pixel-level errors easily flip the classification into an incorrect class and that higher observational noise makes the model trained on low-noise data unable to classify galaxy morphologies. On the other hand, we show that training with domain adaptation improves model robustness and mitigates the effects of these perturbations, improving the classification accuracy by 23% on data with higher observational noise. Domain adaptation also increases by a factor of ~2.3 the latent space distance between the baseline and the incorrectly classified one-pixel perturbed image, making the model more robust to inadvertent perturbations.