Contardo, Gabriella
Detecting Localized Density Anomalies in Multivariate Data via Coin-Flip Statistics
Springer, Sebastian, Scaffidi, Andre, Autenrieth, Maximilian, Contardo, Gabriella, Laio, Alessandro, Trotta, Roberto, Haario, Heikki
Detecting localized density differences in multivariate data is a crucial task in computational science. Such anomalies can indicate a critical system failure, lead to a groundbreaking scientific discovery, or reveal unexpected changes in data distribution. We introduce EagleEye, an anomaly detection method to compare two multivariate datasets with the aim of identifying local density anomalies, namely over- or under-densities affecting only localised regions of the feature space. Anomalies are detected by modelling, for each point, the ordered sequence of its neighbours' membership labels as a coin-flipping process and monitoring deviations from the expected behaviour of such a process. A unique advantage of our method is its ability to provide an accurate, entirely unsupervised estimate of the local signal purity. We demonstrate its effectiveness through experiments on both synthetic and real-world datasets. In synthetic data, EagleEye accurately detects anomalies in multiple dimensions even when they affect a tiny fraction of the data. When applied to a challenging resonant anomaly detection benchmark task in simulated Large Hadron Collider data, EagleEye successfully identifies particle decay events present in just 0.3% of the dataset. In global temperature data, EagleEye uncovers previously unidentified, geographically localised changes in temperature fields that occurred in the most recent years. Thanks to its key advantages of conceptual simplicity, computational efficiency, trivial parallelisation, and scalability, EagleEye is widely applicable across many fields.
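A minimal sketch of the coin-flip neighbour test described in the abstract, simplified to a single binomial test at a fixed neighbourhood size rather than monitoring the full ordered sequence of labels; this is not the authors' implementation, and the function name, parameters, and toy data below are illustrative only. As written it flags over-densities in the test set; the symmetric test with the roles of the two datasets swapped would flag under-densities.

# Illustrative sketch, not the EagleEye implementation.
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import binom

def coin_flip_anomaly_scores(reference, test, k=50):
    """Score each test point by asking whether its k nearest neighbours in the
    pooled data contain more test-set members than a fair 'coin flip'
    (binomial) model would predict, signalling a local over-density."""
    pooled = np.vstack([reference, test])
    labels = np.concatenate([np.zeros(len(reference)), np.ones(len(test))])
    p = len(test) / len(pooled)            # expected probability of a 'test' neighbour
    tree = cKDTree(pooled)

    scores = np.empty(len(test))
    for i, x in enumerate(test):
        # k + 1 neighbours because the query point is returned as its own neighbour
        _, idx = tree.query(x, k=k + 1)
        n_test = labels[idx[1:]].sum()     # number of test-set neighbours
        # survival function: probability of >= n_test test-set neighbours by chance
        scores[i] = binom.sf(n_test - 1, k, p)
    return scores                          # small values flag localized over-densities

# Toy usage: a small over-density injected into the test sample.
rng = np.random.default_rng(0)
ref = rng.normal(size=(5000, 2))
tst = np.vstack([rng.normal(size=(4950, 2)),
                 rng.normal(loc=2.5, scale=0.1, size=(50, 2))])
print(coin_flip_anomaly_scores(ref, tst, k=50).min())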
Cosmology with Persistent Homology: Parameter Inference via Machine Learning
Calles, Juan, Yip, Jacky H. T., Contardo, Gabriella, Noreña, Jorge, Rouhiainen, Adam, Shiu, Gary
Building upon [2308.02636], this article investigates the potential constraining power of persistent homology for cosmological parameters and primordial non-Gaussianity amplitudes in a likelihood-free inference pipeline. We evaluate the ability of persistence images (PIs) to infer parameters, compared to the combined Power Spectrum and Bispectrum (PS/BS), and we compare two types of models: neural-based and tree-based. PIs consistently lead to better predictions compared to the combined PS/BS when the parameters can be constrained (i.e., for $\{\Omega_{\rm m}, \sigma_8, n_{\rm s}, f_{\rm NL}^{\rm loc}\}$). PIs perform particularly well for $f_{\rm NL}^{\rm loc}$, showing the promise of persistent homology in constraining primordial non-Gaussianity. Our results show that combining PIs with PS/BS provides only marginal gains, indicating that the PS/BS contains little extra or complementary information to the PIs. Finally, we provide a visualization of the most important topological features for $f_{\rm NL}^{\rm loc}$ and for $\Omega_{\rm m}$. This reveals that clusters and voids (0-cycles and 2-cycles) are most informative for $\Omega_{\rm m}$, while $f_{\rm NL}^{\rm loc}$ uses the filaments (1-cycles) in addition to the other two types of topological features.
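An illustrative sketch of the tree-based regression route mentioned above: flattened persistence images as features and one cosmological parameter as the target, with per-pixel feature importances reshaped back to image space in the spirit of the importance maps discussed in the abstract. The arrays, shapes, and model settings are placeholders and not the paper's pipeline; computing the PIs themselves is assumed to happen elsewhere.

# Illustrative sketch with placeholder data, not the paper's pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_sims, side = 500, 20                                  # e.g. 20x20 persistence images
persistence_images = rng.random((n_sims, side * side))  # placeholder flattened PIs
f_nl_loc = rng.normal(size=n_sims)                      # placeholder target parameter

X_train, X_test, y_train, y_test = train_test_split(
    persistence_images, f_nl_loc, test_size=0.2, random_state=0)

# One regressor per parameter; only the f_NL^loc target is shown here.
model = GradientBoostingRegressor(n_estimators=100, max_depth=3)
model.fit(X_train, y_train)
print("R^2 on held-out simulations:", model.score(X_test, y_test))

# Reshape per-pixel importances into image space to see which regions of the
# persistence image drive the prediction.
importance_map = model.feature_importances_.reshape(side, side)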
Meta-Learning for Anomaly Classification with Set Equivariant Networks: Application in the Milky Way
Oladosu, Ademola, Xu, Tony, Ekfeldt, Philip, Kelly, Brian A., Cranmer, Miles, Ho, Shirley, Price-Whelan, Adrian M., Contardo, Gabriella
We present a new meta-learning approach for supervised anomaly classification / one-class classification using set equivariant networks. We focus our experiments on an astronomy application. Our problem setting is composed of a set of classification tasks. Each task has a (small) set of positive, labeled examples and a larger set of unlabeled examples. We expect the positive instances (the 'anomalies') to be much rarer than the negative ones (the 'normal' class). We propose a novel use of equivariant networks for this setting. Specifically, we use Deep Sets, which was developed for point clouds and unordered sets and is equivariant to permutation. We propose to consider the set of positive examples of a given task as a 'point cloud'. The key idea is that the network directly takes as input the set of positive examples in addition to the current example to classify. This allows the model to predict at test time on new tasks using only positive labeled examples (i.e., the 'one-class classification' setting) by design, potentially without retraining. However, the model is trained in a meta-learning regime on a dataset of several tasks with full supervision (positive and negative labels). This setup is motivated by our target application on stellar streams. Streams are groups of stars sharing specific properties in various features. For a detected stream, we can determine a set of stars that likely belong to the stream. We aim to characterize the membership of all other nearby stars. We build a meta-dataset of simulated streams injected onto real data and evaluate on unseen synthetic streams and one known stream. Our experiments show encouraging results that motivate further exploration of equivariant networks for anomaly or 'one-class' classification in a meta-learning regime.
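A minimal PyTorch sketch of the idea described above: a Deep Sets-style encoder summarizes the set of positive examples with a permutation-invariant pooling, and each query star is classified conditioned on that summary. Layer sizes, names, and the toy data are illustrative and do not reproduce the paper's exact architecture or training setup.

# Illustrative sketch, not the paper's exact model.
import torch
import torch.nn as nn

class SetConditionedClassifier(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        # phi: embeds each positive example independently (applied element-wise)
        self.phi = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        # rho: classifies the query given the pooled, permutation-invariant set summary
        self.rho = nn.Sequential(nn.Linear(hidden + n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, positives, query):
        # positives: (set_size, n_features); query: (batch, n_features)
        summary = self.phi(positives).mean(dim=0)             # invariant pooling over the set
        summary = summary.expand(query.shape[0], -1)           # broadcast to the query batch
        return self.rho(torch.cat([summary, query], dim=1))    # membership logits

# Example forward pass on random data for a single "task":
model = SetConditionedClassifier(n_features=6)
positives = torch.randn(30, 6)      # known likely stream members (positive examples)
query_stars = torch.randn(128, 6)   # nearby stars whose membership we want to score
logits = model(positives, query_stars)

At meta-training time, one such forward pass would be made per task with full supervision; at test time, only the positive set of the new task is needed, which is what allows prediction on unseen streams without retraining.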