Goto

Collaborating Authors

 Cerino, Franco


Building Machine Learning Challenges for Anomaly Detection in Science

arXiv.org Artificial Intelligence

Scientific discoveries are often made by finding a pattern or object that was not predicted by the known rules of science. Oftentimes, these anomalous events or objects that do not conform to the norms are an indication that the rules of science governing the data are incomplete, and something new needs to be present to explain these unexpected outliers. The challenge of finding anomalies can be confounding since it requires codifying a complete knowledge of the known scientific behaviors and then projecting these known behaviors on the data to look for deviations. When utilizing machine learning, this presents a particular challenge since we require that the model not only understands scientific data perfectly but also recognizes when the data is inconsistent and out of the scope of its trained behavior. In this paper, we present three datasets aimed at developing machine learning-based anomaly detection for disparate scientific domains covering astrophysics, genomics, and polar science. We present the different datasets along with a scheme to make machine learning challenges around the three datasets findable, accessible, interoperable, and reusable (FAIR). Furthermore, we present an approach that generalizes to future machine learning challenges, enabling the possibility of large, more compute-intensive challenges that can ultimately lead to scientific discovery.


Hyperparameter optimization of hp-greedy reduced basis for gravitational wave surrogates

arXiv.org Artificial Intelligence

In a previous work we introduced, in the context of gravitational wave science, an initial study on an automated domain-decomposition approach for reduced basis through hp-greedy refinement. The approach constructs local reduced bases of lower dimensionality than global ones, with the same or higher accuracy. These ``light'' local bases should imply both faster evaluations when predicting new waveforms and faster data analysis, in particular faster statistical inference (the forward and inverse problems, respectively). In this approach, however, we have previously found important dependence on several hyperparameters, which do not appear in global reduced basis. This naturally leads to the problem of hyperparameter optimization (HPO), which is the subject of this paper. We tackle the problem through a Bayesian optimization, and show its superiority when compared to grid or random searches. We find that for gravitational waves from the collision of two spinning but non-precessing black holes, for the same accuracy, local hp-greedy reduced bases with HPO have a lower dimensionality of up to $4 \times$ for the cases here studied, depending on the desired accuracy. This factor should directly translate in a parameter estimation speedup, for instance. Such acceleration might help in the near real-time requirements for electromagnetic counterparts of gravitational waves from compact binary coalescences. In addition, we find that the Bayesian approach used in this paper for HPO is two orders of magnitude faster than, for example, a grid search, with about a $100 \times$ acceleration. The code developed for this project is available as open source from public repositories.


An automated parameter domain decomposition approach for gravitational wave surrogates using hp-greedy refinement

arXiv.org Artificial Intelligence

We introduce hp-greedy, a refinement approach for building gravitational wave surrogates as an extension of the standard reduced basis framework. Our proposal is data-driven, with a domain decomposition of the parameter space, local reduced basis, and a binary tree as the resulting structure, which are obtained in an automated way. When compared to the standard global reduced basis approach, the numerical simulations of our proposal show three salient features: i) representations of lower dimension with no loss of accuracy, ii) a significantly higher accuracy for a fixed maximum dimensionality of the basis, in some cases by orders of magnitude, and iii) results that depend on the reduced basis seed choice used by the refinement algorithm. We first illustrate the key parts of our approach with a toy model and then present a more realistic use case of gravitational waves emitted by the collision of two spinning, non-precessing black holes. We discuss performance aspects of hp-greedy, such as overfitting with respect to the depth of the tree structure, and other hyperparameter dependences. As two direct applications of the proposed hp-greedy refinement, we envision: i) a further acceleration of statistical inference, which might be complementary to focused reduced-order quadratures, and ii) the search of gravitational waves through clustering and nearest neighbors.