Randomized PCA Forest for Outlier Detection
Rajabinasab, Muhammad, Pakdaman, Farhad, Gabbouj, Moncef, Schneider-Kamp, Peter, Zimek, Arthur
--We propose a novel unsupervised outlier detection method based on Randomized Principal Component Analysis (PCA). Inspired by the performance of Randomized PCA (RPCA) Forest in approximate K-Nearest Neighbor (KNN) search, we develop a novel unsupervised outlier detection method that utilizes RPCA Forest for outlier detection. Experimental results showcase the superiority of the proposed approach compared to the classical and state-of-the-art methods in performing the outlier detection task on several datasets while performing competitively on the rest. The extensive analysis of the proposed method reflects it high generalization power and its computational efficiency, highlighting it as a good choice for unsupervised outlier detection. An outlier, as defined by Hawkins [18], is "an observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism." Similarly, Barnett and Lewis [3] describe it as "an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data." Outlier detection is the process of identifying such outliers, i.e., the data points which differ from the rest of the data. It is one of the most important and fundamental tasks in data mining and machine learning with applications in intrusion detection [20], fault detection [37], fraud detection [7] and others [11], [13], [27]. In recent years, many methods have been proposed to carry out the outlier detection task [1], [9], [10], [23], [42]. Despite the demonstration of promising results, further studies show that these results might be limited only to specific instances of the problem (e.g., a limited selection of datasets, a specific kind of outliers, etc.) [6].
Aug-25-2025
- Country:
- Europe
- Denmark > Southern Denmark (0.04)
- Finland (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- North America > United States
- Wisconsin (0.04)
- Europe
- Genre:
- Research Report
- Experimental Study (0.67)
- New Finding (0.93)
- Research Report
- Industry:
- Health & Medicine > Therapeutic Area (0.94)
- Law Enforcement & Public Safety (0.68)
- Technology: