Shapley Values of Reconstruction Errors of PCA for Explaining Anomaly Detection

Takeishi, Naoya

arXiv.org Machine Learning 

We present a method to compute the Shapley values of reconstruction errors of principal component analysis (PCA), which is particularly useful in explaining the results of anomaly detection based on PCA. Because features are usually correlated when PCA-based anomaly detection is applied, care must be taken in computing a value function for the Shapley values. We utilize the probabilistic view of PCA, particularly its conditional distribution, to exactly compute a value function for the Shapley values. We also present numerical examples, which suggest that the Shapley values are more advantageous for explaining detected anomalies than the raw reconstruction errors of each feature.

Anomaly detection based on machine learning has been actively studied and now plays an important role in various industrial applications such as fraud detection in finance [1], intrusion detection [2], and fault detection of mechanical systems [3]. To date, many types of anomaly detection algorithms have been proposed, based on different assumptions and technical principles (see, e.g., [4]-[6]).
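The idea summarized in the abstract can be illustrated with a small sketch. It is not the author's implementation: it assumes that the value function for a feature subset S is the expected reconstruction error when the features outside S are drawn from the Gaussian conditional distribution of a probabilistic PCA model given the observed features in S. The function names (`fit_ppca`, `value`, `shapley_values`), the synthetic data, and the brute-force enumeration over subsets (feasible only for a handful of features) are all assumptions made for illustration.

```python
import itertools
import math
import numpy as np


def fit_ppca(X, k):
    """Fit PCA with k components; return mean, residual projector B, PPCA covariance C."""
    mu = X.mean(axis=0)
    Xc = X - mu
    cov = Xc.T @ Xc / X.shape[0]
    eigval, eigvec = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    U = eigvec[:, :k]                              # top-k principal directions
    sigma2 = eigval[k:].mean()                     # ML noise variance (requires k < d)
    d = X.shape[1]
    B = np.eye(d) - U @ U.T                        # reconstruction error is x^T B x
    C = U @ np.diag(eigval[:k] - sigma2) @ U.T + sigma2 * np.eye(d)
    return mu, B, C


def value(x, S, B, C):
    """Assumed value function: E[x^T B x | x_S] under x ~ N(0, C), x already centered."""
    d = x.shape[0]
    S = list(S)
    Sc = [j for j in range(d) if j not in S]
    if not Sc:                                     # all features observed
        return float(x @ B @ x)
    if not S:                                      # nothing observed
        return float(np.trace(B @ C))
    C_SS = C[np.ix_(S, S)]
    C_cS = C[np.ix_(Sc, S)]
    C_cc = C[np.ix_(Sc, Sc)]
    gain = np.linalg.solve(C_SS, C_cS.T).T         # C_cS C_SS^{-1}
    m = x.copy()
    m[Sc] = gain @ x[S]                            # conditional mean of unobserved features
    Sigma = C_cc - gain @ C_cS.T                   # conditional covariance
    return float(m @ B @ m + np.trace(B[np.ix_(Sc, Sc)] @ Sigma))


def shapley_values(x, B, C):
    """Exact Shapley values by enumerating all subsets (exponential in d)."""
    d = x.shape[0]
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                phi[i] += weight * (value(x, S + (i,), B, C) - value(x, S, B, C))
    return phi


# Synthetic correlated data: 5 features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
mu, B, C = fit_ppca(X, k=2)

x = X[0] - mu
x[3] += 3.0                                        # inject an anomaly into feature 3
phi = shapley_values(x, B, C)
raw = (B @ x) ** 2                                 # per-feature raw reconstruction errors
print("Shapley values:", np.round(phi, 3))
print("Raw errors:    ", np.round(raw, 3))
```

By the efficiency property of Shapley values, the entries of `phi` sum to the reconstruction error of `x` minus the expected reconstruction error under the model. Comparing `phi` with `raw` for the perturbed feature gives a rough feel for the contrast the abstract draws between the two kinds of explanation.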