Dimensionality-Aware Outlier Detection: Theoretical and Experimental Analysis
Anderberg, Alastair; Bailey, James; Campello, Ricardo J. G. B.; Houle, Michael E.; Marques, Henrique O.; Radovanović, Miloš; Zimek, Arthur
arXiv.org Artificial Intelligence
We present a nonparametric method for outlier detection that takes full account of local variations in intrinsic dimensionality within the dataset. Using the theory of Local Intrinsic Dimensionality (LID), our 'dimensionality-aware' outlier detection method, DAO, is derived as an estimator of an asymptotic local expected density ratio involving the query point and a close neighbor drawn at random. The dimensionality-aware behavior of DAO is due to its use of local estimation of LID values in a theoretically justified way. Through comprehensive experimentation on more than 800 synthetic and real datasets, we show that DAO significantly outperforms three popular and important benchmark outlier detection methods: Local Outlier Factor (LOF), Simplified LOF, and kNN.
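To make two ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' DAO implementation. It shows the kNN baseline score (distance to the k-th nearest neighbor) and a standard maximum-likelihood (Hill-type) estimator of LID computed from neighbor distances. The function names `knn_distances`, `knn_outlier_score`, and `lid_mle` are hypothetical, and the code assumes Euclidean distance, no duplicate points, and a dataset small enough for a dense pairwise distance matrix.

```python
import numpy as np

def knn_distances(data, k):
    """Sorted distances from each point to its k nearest neighbors (self excluded)."""
    diffs = data[:, None, :] - data[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    return np.sort(dists, axis=1)[:, :k]

def knn_outlier_score(data, k):
    """kNN baseline: the distance to the k-th nearest neighbor (larger = more outlying)."""
    return knn_distances(data, k)[:, -1]

def lid_mle(data, k):
    """Hill-type MLE of local intrinsic dimensionality: -1 / mean(log(r_i / r_k)).

    Assumes all neighbor distances are positive (no duplicate points).
    """
    r = knn_distances(data, k)
    log_ratios = np.log(r / r[:, -1:])  # r_k is the largest of the k distances
    return -1.0 / log_ratios.mean(axis=1)
```

On data sampled uniformly from a two-dimensional region, `lid_mle` should return values scattered around 2, while `knn_outlier_score` assigns its largest value to a point lying far from all others. How DAO combines local LID estimates with distances is specified in the paper itself and is not reproduced here.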
Jan-9-2024
- Country:
- Europe > Denmark > Southern Denmark (0.04)
- Europe > Serbia > Vojvodina > South Bačka District > Novi Sad (0.04)
- North America > United States > New Jersey > Essex County > Newark (0.04)
- Oceania > Australia > New South Wales > Callaghan (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Technology: