KACDP: A Highly Interpretable Credit Default Prediction Model

Liu, Kun, Zhao, Jin

arXiv.org Artificial Intelligence 

Individual credit risk prediction has become a crucial part of risk management in financial institutions. Accurate default prediction can help institutions both reduce losses and improve the utilization of funds, thereby enhancing their competitiveness in the market [1] [2]. With the rapid development of financial technology, machine learning and deep learning techniques are increasingly applied to credit risk assessment. However, existing methods show clear limitations when dealing with high-dimensional and nonlinear data, the most prominent of which are poor interpretability and transparency [3].

Traditional credit risk prediction methods fall into two main categories: statistical models and machine learning models. Statistical models, typified by logistic regression [4], are simple and easy to use, but their relatively strict assumptions make it difficult for them to capture nonlinear relationships in complex data. Machine learning models, such as Random Forest (RF) [5], Support Vector Machine (SVM) [6], and Extreme Gradient Boosting (XGBoost) [7], handle high-dimensional data relatively well, but they are hard to interpret and do not provide a clear, transparent decision-making process. Deep learning models, such as the Multi-Layer Perceptron (MLP) [8] and the Recurrent Neural Network (RNN) [9], have strong expressive power, but their black-box nature leaves them severely lacking in transparency and interpretability, which is a major obstacle in the strictly regulated financial industry [10].