In credit risk analytics, current Bankruptcy Prediction Models (BPMs) struggle with (a) the limited availability of comprehensive, real-world datasets and (b) extreme class imbalance in the data (i.e., very few samples for the minority class), which degrades the performance of the prediction model. Moreover, little research has compared the relative performance of well-known BPMs on public datasets while addressing the class imbalance problem. In this work, we apply eight classes of well-known BPMs, as suggested by a review of decades of literature, to a new public dataset, the Freddie Mac Single-Family Loan-Level Dataset, and resample the minority class (i.e., add synthetic minority samples) to tackle class imbalance. Additionally, we apply some recent AI techniques (e.g., tree-based ensemble techniques) that demonstrate potentially better results on models trained with resampled data. Furthermore, from the analysis of 19 years (1999-2017) of data, we discover that models behave differently when presented with sudden changes in the economy (e.g., a global financial crisis) that result in abrupt fluctuations in the national default rate. In summary, this study should help practitioners and researchers choose an appropriate model for data that contains class imbalance and spans various economic stages.
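The resampling step mentioned above (adding synthetic minority samples) can be illustrated with a minimal sketch. The abstract does not say which resampling method is used, so the SMOTE-style interpolation below is an assumption, and the function name `oversample_minority`, its parameters, and the toy data are hypothetical.

```python
import random

def oversample_minority(minority, n_new, k=3, seed=0):
    """SMOTE-style sketch (an assumed technique, not confirmed by the source):
    create n_new synthetic points by interpolating between a minority sample
    and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])))
        neighbour = minority[rng.choice(others[:k])]
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: two features per loan, 3 defaults vs 9 non-defaults.
defaults = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9]]
non_defaults = [[0.1 * i, 0.05 * i] for i in range(9)]
new_points = oversample_minority(defaults, n_new=len(non_defaults) - len(defaults))
balanced_defaults = defaults + new_points
```

In this sketch each synthetic point lies on the segment between two real minority samples, so the minority class grows without exact duplication. In practice, resampling would normally be applied to the training split only, so that no synthetic points leak into the evaluation data.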
Online Peer-to-Peer Lending (P2PL) systems connect lenders and borrowers directly, making it convenient to borrow and lend money without intermediaries such as banks. Many recommendation systems have been developed to help lenders achieve higher interest rates and avoid loans that default. However, there has not been much research on recommendation systems that help borrowers make wise decisions. On P2PL platforms, borrowers can apply either for bidding loans, where the interest rate is determined by lenders bidding on a loan, or for traditional loans, where the P2PL platform determines the interest rate. Borrowers of different grades (which reflect their creditworthiness) get different interest rates via these two mechanisms. Hence, it is essential to determine which type of loan a borrower should apply for. In this paper, we build a recommendation system that recommends to any new borrower the type of loan they should apply for. Using our recommendation system, any borrower can achieve a lower interest rate with a higher likelihood of getting funded.
This work provides a review of this literature. The motivation for this summary arose from our companion paper Ruf and Wang. There we continue the discussions of this note; in particular, of potentially problematic data leakage when training ANNs to historic financial data. This paper is organised in the following way. Section 2 features Table 1, a summary of the literature that concerns the use of ANNs for nonparametric pricing (and hedging) of options. Section 3 provides a list of recommended papers from Table 1. Section 4 provides an overview of related work where ANNs are applied in the context of option pricing and hedging, but not necessarily as nonparametric estimation tools. Section 5 briefly discusses various regularisation techniques used in the reviewed literature.
This paper considers improved forecasting in possibly nonlinear dynamic settings with high-dimensional predictors ("big data" environments). To overcome the curse of dimensionality and manage data and model complexity, we examine shrinkage estimation of a deep neural net with skip-layer connections, trained by back-propagation. We expressly include both linear and nonlinear components. This is a high-dimensional learning approach that includes both sparsity (L1) and smoothness (L2) penalties, allowing high dimensionality and nonlinearity to be accommodated in one step. The approach selects significant predictors as well as the topology of the neural network. We estimate optimal values of the shrinkage hyperparameters with a gradient-based optimization technique, resulting in robust predictions with improved reproducibility; reproducibility has been an issue in some approaches. The approach is statistically interpretable and unravels some of the network structure that is commonly left as a black box. An additional advantage is that the nonlinear part tends to get pruned if the underlying process is linear. In an application to forecasting equity returns, the proposed approach captures nonlinear dynamics between equities to enhance forecast performance. It offers an appreciable improvement over current univariate and multivariate models, as measured by RMSE and actual portfolio performance.
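As an illustration of how linear skip-layer terms, nonlinear hidden units, and the two penalties can fit into a single objective, one possible formulation is sketched below. The notation is assumed rather than taken from the paper, and a single hidden layer is shown for brevity although the abstract refers to a deep network: $x_t$ are the predictors, $y_t$ the target, $g$ an activation function, $H$ the number of hidden units, and $\theta = (\beta, \gamma, w)$ collects all weights.

```latex
% Assumed notation (not from the source): x_t predictors, y_t target,
% g an activation function, H hidden units, \theta all weights.
\begin{align}
f(x_t;\theta) &= \underbrace{x_t^\top \beta}_{\text{linear skip-layer part}}
  + \underbrace{\sum_{j=1}^{H} \gamma_j\, g\!\left(x_t^\top w_j\right)}_{\text{nonlinear part}}, \\
\hat{\theta} &= \arg\min_{\theta}\; \frac{1}{T}\sum_{t=1}^{T}\bigl(y_t - f(x_t;\theta)\bigr)^2
  + \lambda_1 \lVert \theta \rVert_1 + \lambda_2 \lVert \theta \rVert_2^2 .
\end{align}
```

Under this sketch, if the L1 penalty shrinks every $\gamma_j$ to zero, then $f(x_t;\theta)$ reduces to the linear term $x_t^\top \beta$, which is consistent with the remark that the nonlinear part tends to get pruned when the underlying process is linear. The hyperparameters $\lambda_1$ and $\lambda_2$ correspond to the shrinkage hyperparameters that the abstract says are tuned by gradient-based optimization.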