The GE Compliance Checker: A Generic Tool for Assessing Mortgage Loan Resale Requirements

AAAI Conferences

This paper describes the GE Compliance Checker [GECCO], a knowledge-based application for use in the home mortgage industry. GECCO is a tool for automating the information-intensive processes of underwriting and reselling mortgage loans. GECCO was initially designed and deployed for one business component of GE Capital Mortgage Corporation [GECMC], and then successfully integrated into three other GECMC businesses. Its first application was for third-party underwriting. This was followed by the use of GECCO in wholesale pricing and registration, and in direct loan origination. Most recently, GECCO has evolved into a commercial product offered for purchase to mortgage lenders. GECCO has significantly improved the underwriting and resale process: quality control has become much more effective, adding consistency, completeness and robustness to the decision-making process; the quantity of loans processed has increased; customer service has been enhanced; and a once "subjective" process has now been standardized. The successful use of AI has also permeated GECMC business application software to the extent that AI has become a requirement rather than a remote technology used in an isolated application.

Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination

Machine Learning

The increasing impact of algorithmic decisions on people's lives compels us to scrutinize their fairness and, in particular, the disparate impacts that ostensibly color-blind algorithms can have on different groups. Examples include credit decisioning, hiring, advertising, criminal justice, personalized medicine, and targeted policymaking, where in some cases legislative or regulatory frameworks for fairness exist and define specific protected classes. In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data. This is particularly a problem in lending and healthcare. We consider the use of an auxiliary dataset, such as the US census, that includes class labels but not decisions or outcomes. We show that a variety of common disparity measures are generally unidentifiable aside from some unrealistic cases, providing a new perspective on the documented biases of popular proxy-based methods. We provide exact characterizations of the sharpest-possible partial identification set of disparities either under no assumptions or when we incorporate mild smoothness constraints. We further provide optimization-based algorithms for computing and visualizing these sets, which enable reliable and robust assessments -- an important tool when disparity assessment can have far-reaching policy implications. We demonstrate this in two case studies with real data: mortgage lending and personalized medicine dosing.
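To give a feel for why disparities are only partially identified when decisions and class labels come from separate datasets, here is a minimal sketch. It is not the paper's algorithm: it only applies the standard Fréchet-Hoeffding bounds to a single stratum, and the function names and the example rates are hypothetical.

```python
def frechet_bounds(p_outcome, p_class):
    """Bounds on P(outcome=1, class=1) when only the two marginals are known.

    p_outcome: share of positive decisions (from the primary dataset).
    p_class:   share of the protected class (from the auxiliary dataset).
    """
    lower = max(0.0, p_outcome + p_class - 1.0)
    upper = min(p_outcome, p_class)
    return lower, upper


def conditional_rate_bounds(p_outcome, p_class):
    """Bounds on P(outcome=1 | class=1); requires p_class > 0."""
    if p_class <= 0.0:
        raise ValueError("p_class must be positive")
    lower, upper = frechet_bounds(p_outcome, p_class)
    return lower / p_class, upper / p_class


# Hypothetical stratum: 70% of applicants approved, 40% in the protected class.
low, high = conditional_rate_bounds(0.7, 0.4)
print(f"approval rate within the class lies in [{low:.2f}, {high:.2f}]")
```

Even with both marginals known exactly, the within-class approval rate in this hypothetical stratum can be anywhere in a wide interval, which is the kind of ambiguity the abstract refers to.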

Investigating bankruptcy prediction models in the presence of extreme class imbalance and multiple stages of economy

Machine Learning

In the area of credit risk analytics, current Bankruptcy Prediction Models (BPMs) struggle with (a) the availability of comprehensive and real-world data sets and (b) the presence of extreme class imbalance in the data (i.e., very few samples for the minority class) that degrades the performance of the prediction model. Moreover, little research has compared the relative performance of well-known BPMs on public datasets addressing the class imbalance problem. In this work, we apply eight classes of well-known BPMs, as suggested by a review of decades of literature, to a new public dataset, the Freddie Mac Single-Family Loan-Level Dataset, with resampling (i.e., adding synthetic minority samples) of the minority class to tackle class imbalance. Additionally, we apply some recent AI techniques (e.g., tree-based ensemble techniques) that demonstrate potentially better results on models trained with resampled data. In addition, from the analysis of 19 years (1999-2017) of data, we discover that models behave differently when presented with sudden changes in the economy (e.g., a global financial crisis) resulting in abrupt fluctuations in the national default rate. In summary, this study should aid practitioners/researchers in determining the appropriate model with respect to data that contains a class imbalance and various economic stages.
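The idea of adding synthetic minority samples can be sketched as follows. This is an illustrative simplification, not the resampling procedure used in the study: it interpolates between two randomly chosen minority samples rather than between nearest neighbours, and the function name and toy feature vectors are hypothetical.

```python
import random


def oversample_minority(minority, n_new, seed=0):
    """Create n_new synthetic samples by linear interpolation.

    Each synthetic point lies on the segment between two distinct
    minority samples picked at random (a simplification: SMOTE-style
    methods would pick a nearest neighbour instead).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


# Toy minority class (e.g. defaulted loans) with two numeric features.
defaults = [[1.0, 2.0], [3.0, 6.0], [2.0, 4.0]]
augmented = defaults + oversample_minority(defaults, n_new=5)
print(len(augmented))
```

Resampling of this kind would normally be applied to the training split only, so that evaluation still reflects the original class imbalance.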

Automating the Underwriting of Insurance Applications

AI Magazine

An end-to-end system was created at Genworth Financial to automate the underwriting of long-term care (LTC) and life insurance applications. Relying heavily on artificial intelligence techniques, the system has been in production since December 2002 and, as of 2004, completely automates the underwriting of 19 percent of the LTC applications. A fuzzy logic rules engine encodes the underwriter guidelines and an evolutionary algorithm optimizes the engine's performance. Finally, a natural language parser is used to improve the coverage of the underwriting system.

Designing Quality into Expert Systems: A Case Study in Automated Insurance Underwriting

AAAI Conferences

It can be difficult to design and develop artificial intelligence systems to meet specific quality standards. Often, AI systems are designed to be "as good as possible" rather than to meet particular targets. Using the Design for Six Sigma quality methodology, an automated insurance underwriting expert system was designed, developed, and fielded. Using this methodology resulted in meeting the high quality expectations required for deployment.