Ishida, Emille E. O., Kornilov, Matwey V., Malanchev, Konstantin L., Pruzhinskaya, Maria V., Volnova, Alina A., Korolev, Vladimir S., Mondon, Florian, Sreejith, Sreevarsha, Malancheva, Anastasia, Das, Shubhomoy
We present the first application of adaptive machine learning to the identification of anomalies in a data set of non-periodic astronomical light curves. The method follows an active learning strategy in which highly informative objects are selected to be labelled. This new information is subsequently used to improve the machine learning model, allowing its accuracy to evolve with the addition of every new classification. For the case of anomaly detection, the algorithm aims to maximize the number of real anomalies presented to the expert by slightly modifying the decision boundary of a traditional isolation forest in each iteration. As a proof of concept, we apply the Active Anomaly Discovery (AAD) algorithm to light curves from the Open Supernova Catalog and compare its results to those of a static Isolation Forest (IF). For both methods, we visually inspected the objects with the top 2% of anomaly scores. We show that AAD was able to identify 80% more true anomalies than IF. This result is the first evidence that AAD algorithms can play a central role in the search for new physics in the era of large-scale sky surveys.
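The static baseline described above (score every object with an isolation forest, then inspect the highest-scoring 2%) can be sketched as follows. This is a minimal illustration using scikit-learn's `IsolationForest`, not the authors' pipeline and not the AAD algorithm itself; the function name `top_anomaly_candidates`, the synthetic feature matrix, and all parameter values are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def top_anomaly_candidates(features, fraction=0.02, seed=0):
    """Return row indices of the `fraction` most anomalous objects, most anomalous first.

    Hypothetical helper: `fraction=0.02` mirrors the 2% inspection budget in the text.
    """
    forest = IsolationForest(n_estimators=200, random_state=seed)
    forest.fit(features)
    # score_samples is higher for more "normal" points, so negate it
    # to obtain a score that grows with abnormality.
    anomaly_score = -forest.score_samples(features)
    n_select = max(1, int(round(fraction * len(features))))
    return np.argsort(anomaly_score)[::-1][:n_select]


# Synthetic stand-in for light-curve features (assumed data, not the catalogue).
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(490, 4))
outliers = rng.normal(10.0, 1.0, size=(10, 4))  # rows 490..499 are injected anomalies
features = np.vstack([normal, outliers])

candidates = top_anomaly_candidates(features)
print("Objects to show the expert:", candidates)
```

An active variant would feed the expert's verdict on each candidate back into the model before selecting the next one; this sketch stops at the static ranking.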
Future Connected and Automated Vehicles (CAV), and more generally Intelligent Transportation Systems (ITS), will form a highly interconnected system. Such a paradigm is referred to as the Internet of Vehicles (herein Internet of CAVs) and is a prerequisite to orchestrate traffic flows in cities. For optimal decision making and supervision, traffic centres will have access to suitably anonymized CAV mobility information. Safe and secure operations will then be contingent on early detection of anomalies. In this paper, a novel unsupervised learning model based on a deep autoencoder is proposed to detect self-reported location anomalies in CAVs, using vehicle locations and the Received Signal Strength Indicator (RSSI) as features. Quantitative experiments on simulation datasets show that the proposed approach is effective and robust in detecting self-reported location anomalies.
Some of the popular anomaly detection techniques are density-based techniques (k-nearest neighbor, local outlier factor), subspace and correlation-based outlier detection, one-class support vector machines, replicator neural networks, cluster analysis-based outlier detection, deviations from association rules and frequent itemsets, fuzzy logic based outlier detection, and ensemble techniques. RapidMiner provides an integrated environment for machine learning, data mining, text mining, predictive analytics and business analytics. Scikit-learn is an open source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
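Two of the techniques listed above, local outlier factor and one-class support vector machines, are available in scikit-learn. The sketch below applies both to a small synthetic data set; the data, the three injected outliers, and the parameter values (`n_neighbors`, `contamination`, `nu`) are assumptions chosen for illustration rather than recommended settings.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Synthetic data: a Gaussian cloud plus three far-away points (rows 200..202).
rng = np.random.default_rng(42)
inliers = rng.normal(0.0, 1.0, size=(200, 2))
injected = np.array([[6.0, 6.0], [-7.0, 5.0], [8.0, -6.0]])
X = np.vstack([inliers, injected])

# Density-based: local outlier factor compares each point's local density
# with that of its neighbours. fit_predict returns -1 for outliers, +1 for inliers.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.02)
lof_labels = lof.fit_predict(X)

# Boundary-based: a one-class SVM learns a region enclosing most of the data.
# nu is an upper bound on the fraction of training points left outside it.
svm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
svm_labels = svm.fit(X).predict(X)

lof_flagged = np.flatnonzero(lof_labels == -1)
svm_flagged = np.flatnonzero(svm_labels == -1)
print("LOF flagged rows:", lof_flagged)
print("One-class SVM flagged rows:", svm_flagged)
```

The two detectors need not agree: LOF reacts to local density contrasts, while the one-class SVM draws a single global boundary, so comparing their flagged sets is often informative.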
Most of these methods are designed to identify examples that are individually anomalous, i.e. Pr(example | normal) is vanishingly small. An area of anomaly detection that has received comparatively less attention is the case where one cannot determine with certainty that any single example is anomalous. These "group anomalies" instead manifest as overdensities in the probability density of the data and occur naturally in a variety of applications ranging from computer security to pandemic detection, including many other scientific, industrial, and financial applications. Group anomaly detection is also central to fundamental physics. In particular, most data analyses at the Large Hadron Collider (LHC) can be seen as group anomaly detection. Nearly all of these approaches are supervised and rely strongly on a particular anomaly hypothesis. Following the Nobel Prize-winning discovery of the Higgs boson in 2012 (which was a benchmark for group anomaly detection), there is increasing urgency in searching for new phenomena beyond the Standard Model (BSM) of particle physics, for which there is ample indirect evidence (e.g.
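The idea of a group anomaly as an overdensity can be illustrated with a toy sideband comparison: no single value is unusual, but a window contains more values than its neighbourhood predicts. The sketch below is a simplified illustration under stated assumptions (a locally flat background, a Gaussian approximation to the Poisson significance, synthetic data); it is not the procedure of any particular LHC analysis, and the function name `window_excess` is hypothetical.

```python
import math
import random


def window_excess(values, lo, hi, width):
    """Compare the count in [lo, hi) with a background estimate from two sidebands.

    Sidebands are [lo - width, lo) and [hi, hi + width).
    Returns (observed count, expected count, approximate significance z).
    """
    observed = sum(lo <= v < hi for v in values)
    left = sum(lo - width <= v < lo for v in values)
    right = sum(hi <= v < hi + width for v in values)
    # Assumption: background is locally flat, so sideband counts scale with width.
    expected = (left + right) * (hi - lo) / (2.0 * width)
    # Rough Gaussian approximation to the Poisson significance of the excess.
    z = (observed - expected) / math.sqrt(expected) if expected > 0 else float("inf")
    return observed, expected, z


random.seed(1)
# Smooth background plus a small localized excess. Every "signal" value lies
# well inside the background range, so none is individually anomalous.
background = [random.uniform(0.0, 100.0) for _ in range(10000)]
signal = [random.gauss(50.0, 1.0) for _ in range(300)]
data = background + signal

observed, expected, z = window_excess(data, 48.0, 52.0, 10.0)
print(f"observed={observed}, expected={expected:.1f}, z={z:.1f}")
```

In practice the window position is scanned rather than fixed, and the significance must then be corrected for the number of windows tried.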
An important task in exploring and analyzing real-world data sets is to detect unusual and interesting phenomena. In this paper, we study the group anomaly detection problem. Unlike traditional anomaly detection research that focuses on data points, our goal is to discover anomalous aggregated behaviors of groups of points. For this purpose, we propose the Flexible Genre Model (FGM). FGM is designed to characterize data groups at both the point level and the group level so as to detect various types of group anomalies.