Active Anomaly Detection for time-domain discoveries

Ishida, Emille E. O., Kornilov, Matwey V., Malanchev, Konstantin L., Pruzhinskaya, Maria V., Volnova, Alina A., Korolev, Vladimir S., Mondon, Florian, Sreejith, Sreevarsha, Malancheva, Anastasia, Das, Shubhomoy

arXiv.org Machine Learning 

We present the first application of adaptive machine learning to the identification of anomalies in a data set of non-periodic astronomical light curves. The method follows an active learning strategy in which highly informative objects are selected to be labelled by an expert. This new information is then used to improve the machine learning model, allowing its accuracy to evolve with each newly labelled object. For anomaly detection, the algorithm aims to maximize the number of real anomalies presented to the expert by slightly modifying the decision boundary of a traditional isolation forest in each iteration. As a proof of concept, we apply the Active Anomaly Discovery (AAD) algorithm to light curves from the Open Supernova Catalog and compare its results to those of a static Isolation Forest (IF). For both methods, we visually inspected the objects with the highest 2% of anomaly scores. We show that AAD was able to identify 80% more true anomalies than IF. This result is the first evidence that AAD algorithms can play a central role in the search for new physics in the era of large-scale sky surveys.
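
The feedback loop described above can be illustrated with a minimal sketch. It is not the AAD algorithm itself: to stay dependency-free, the isolation forest is swapped for an ensemble of per-feature deviation scores, and the expert is simulated by a hypothetical `oracle` function. The names `component_scores`, `active_loop`, `budget`, and `lr`, as well as the synthetic data, are assumptions made for illustration only.

```python
import random
import statistics


def component_scores(data):
    """One crude detector per feature: the absolute z-score of each value."""
    columns = list(zip(*data))
    means = [statistics.fmean(c) for c in columns]
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]  # avoid division by zero
    return [[abs(x - m) / s for x, m, s in zip(row, means, stdevs)]
            for row in data]


def active_loop(data, oracle, budget, lr=0.5):
    """Show the expert the top-scoring unlabelled object, then reweight.

    Returns the indices confirmed as anomalies and the final weights.
    """
    scores = component_scores(data)
    n_features = len(scores[0])
    weights = [1.0 / n_features] * n_features  # start as a static, uniform ensemble
    unlabelled = set(range(len(data)))
    found = []

    def total_score(i):
        return sum(w * z for w, z in zip(weights, scores[i]))

    for _ in range(min(budget, len(data))):
        best = max(unlabelled, key=total_score)  # most anomalous under current weights
        unlabelled.remove(best)
        is_anomaly = oracle(best)  # simulated expert label
        if is_anomaly:
            found.append(best)
        # Reward the detectors that flagged a confirmed anomaly and
        # penalise those that flagged an uninteresting object.
        sign = 1.0 if is_anomaly else -1.0
        norm = sum(scores[best]) or 1.0
        weights = [max(1e-6, w + sign * lr * z / norm)
                   for w, z in zip(weights, scores[best])]
        total = sum(weights)
        weights = [w / total for w in weights]
    return found, weights


# Synthetic stand-in for light-curve features (purely illustrative).
rng = random.Random(0)
data = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
true_anomalies = set(range(10))
for i in true_anomalies:
    data[i][0] += 6.0          # interesting objects deviate in feature 0
for i in range(10, 30):
    data[i][1] = 8.0           # uninteresting outliers deviate in feature 1


def oracle(i):
    """Hypothetical expert: confirms only the injected anomalies."""
    return i in true_anomalies


found, weights = active_loop(data, oracle, budget=20)
```

In this sketch, each rejected candidate lowers the weight of the detectors that promoted it, so later queries should drift toward the kind of object the expert confirms. A static baseline corresponds to ranking once with the initial uniform weights and never updating them.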