Goto

Collaborating Authors

 Kampffmeyer, Michael


Deep Generative Models for Reject Inference in Credit Scoring

arXiv.org Machine Learning

Credit scoring models based on accepted applications may be biased and their consequences can have a statistical and economic impact. Reject inference is the process of attempting to infer the creditworthiness status of the rejected applications. In this research, we use deep generative models to develop two new semi-supervised Bayesian models for reject inference in credit scoring, in which we model the data generating process to be dependent on a Gaussian mixture. The goal is to improve the classification accuracy in credit scoring models by adding reject applications. Our proposed models infer the unknown creditworthiness of the rejected applications by exact enumeration of the two possible outcomes of the loan (default or non-default). The efficient stochastic gradient optimization technique used in deep generative models makes our models suitable for large data sets. Finally, the experiments in this research show that our proposed models perform better than classical and alternative machine learning models for reject inference in credit scoring.


Learning Latent Representations of Bank Customers With The Variational Autoencoder

arXiv.org Machine Learning

Learning data representations that reflect the customers' creditworthiness can improve marketing campaigns, customer relationship management, data and process management or the credit risk assessment in retail banks. In this research, we adopt the Variational Autoencoder (VAE), which has the ability to learn latent representations that contain useful information. We show that it is possible to steer the latent representations in the latent space of the VAE using the Weight of Evidence and forming a specific grouping of the data that reflects the customers' creditworthiness. Our proposed method learns a latent representation of the data, which shows a well-defied clustering structure capturing the customers' creditworthiness. These clusters are well suited for the aforementioned banks' activities.


Deep Divergence-Based Approach to Clustering

arXiv.org Machine Learning

A promising direction in deep learning research consists in learning representations and simultaneously discovering cluster structure in unlabeled data by optimizing a discriminative loss function. As opposed to supervised deep learning, this line of research is in its infancy, and how to design and optimize suitable loss functions to train deep neural networks for clustering is still an open question. Our contribution to this emerging field is a new deep clustering network that leverages the discriminative power of information-theoretic divergence measures, which have been shown to be effective in traditional clustering. We propose a novel loss function that incorporates geometric regularization constraints, thus avoiding degenerate structures of the resulting clustering partition. Experiments on synthetic benchmarks and real datasets show that the proposed network achieves competitive performance with respect to other state-of-the-art methods, scales well to large datasets, and does not require pre-training steps.


Recurrent Deep Divergence-based Clustering for simultaneous feature learning and clustering of variable length time series

arXiv.org Machine Learning

The task of clustering unlabeled time series and sequences entails a particular set of challenges, namely to adequately model temporal relations and variable sequence lengths. If these challenges are not properly handled, the resulting clusters might be of suboptimal quality. As a key solution, we present a joint clustering and feature learning framework for time series based on deep learning. For a given set of time series, we train a recurrent network to represent, or embed, each time series in a vector space such that a divergence-based clustering loss function can discover the underlying cluster structure in an end-to-end manner. Unlike previous approaches, our model inherently handles multivariate time series of variable lengths and does not require specification of a distance-measure in the input space. On a diverse set of benchmark datasets we illustrate that our proposed Recurrent Deep Divergence-based Clustering approach outperforms, or performs comparable to, previous approaches.


Reinforced Auto-Zoom Net: Towards Accurate and Fast Breast Cancer Segmentation in Whole-slide Images

arXiv.org Artificial Intelligence

Convolutional neural networks have led to significant breakthroughs in the domain of medical image analysis. However, the task of breast cancer segmentation in whole-slide images (WSIs) is still underexplored. WSIs are large histopathological images with extremely high resolution. Constrained by the hardware and field of view, using high-magnification patches can slow down the inference process and using low-magnification patches can cause the loss of information. In this paper, we aim to achieve two seemingly conflicting goals for breast cancer segmentation: accurate and fast prediction. We propose a simple yet efficient framework Reinforced Auto-Zoom Net (RAZN) to tackle this task. Motivated by the zoom-in operation of a pathologist using a digital microscope, RAZN learns a policy network to decide whether zooming is required in a given region of interest. Because the zoom-in action is selective, RAZN is robust to unbalanced and noisy ground truth labels and can efficiently reduce overfitting. We evaluate our method on a public breast cancer dataset. RAZN outperforms both single-scale and multi-scale baseline approaches, achieving better accuracy at low inference cost.


Uncertainty and Interpretability in Convolutional Neural Networks for Semantic Segmentation of Colorectal Polyps

arXiv.org Machine Learning

Convolutional Neural Networks (CNNs) are propelling advances in a range of different computer vision tasks such as object detection and object segmentation. Their success has motivated research in applications of such models for medical image analysis. If CNN-based models are to be helpful in a medical context, they need to be precise, interpretable, and uncertainty in predictions must be well understood. In this paper, we develop and evaluate recent advances in uncertainty estimation and model interpretability in the context of semantic segmentation of polyps from colonoscopy images. We evaluate and enhance several architectures of Fully Convolutional Networks (FCNs) for semantic segmentation of colorectal polyps and provide a comparison between these models. Our highest performing model achieves a 76.06\% mean IOU accuracy on the EndoScene dataset, a considerable improvement over the previous state-of-the-art.


Geometric Generalization Based Zero-Shot Learning Dataset Infinite World: Simple Yet Powerful

arXiv.org Machine Learning

Raven's Progressive Matrices are one of the widely used tests in evaluating the human test taker's fluid intelligence. Analogously, this paper introduces geometric generalization based zero-shot learning tests to measure the rapid learning ability and the internal consistency of deep generative models. Our empirical research analysis on state-of-the-art generative models discern their ability to generalize concepts across classes. In the process, we introduce Infinite World, an evaluable, scalable, multi-modal, light-weight dataset and Zero-Shot Intelligence Metric ZSI. The proposed tests condenses human-level spatial and numerical reasoning tasks to its simplistic geometric forms. The dataset is scalable to a theoretical limit of infinity, in numerical features of the generated geometric figures, image size and in quantity. We systematically analyze state-of-the-art model's internal consistency, identify their bottlenecks and propose a pro-active optimization method for few-shot and zero-shot learning.


Segment-Based Credit Scoring Using Latent Clusters in the Variational Autoencoder

arXiv.org Machine Learning

Identifying customer segments in retail banking portfolios with different risk profiles can improve the accuracy of credit scoring. The Variational Autoencoder (VAE) has shown promising results in different research domains, and it has been documented the powerful information embedded in the latent space of the VAE. We use the VAE and show that transforming the input data into a meaningful representation, it is possible to steer configurations in the latent space of the VAE. Specifically, the Weight of Evidence (WoE) transformation encapsulates the propensity to fall into financial distress and the latent space in the VAE preserves this characteristic in a well-defined clustering structure. These clusters have considerably different risk profiles and therefore are suitable not only for credit scoring but also for marketing and customer purposes. This new clustering methodology offers solutions to some of the challenges in the existing clustering algorithms, e.g., suggests the number of clusters, assigns cluster labels to new customers, enables cluster visualization, scales to large datasets, captures non-linear relationships among others. Finally, for portfolios with a large number of customers in each cluster, developing one classifier model per cluster can improve the credit scoring assessment.


Deep Kernelized Autoencoders

arXiv.org Machine Learning

In this paper we introduce the deep kernelized autoencoder, a neural network model that allows an explicit approximation of (i) the mapping from an input space to an arbitrary, user-specified kernel space and (ii) the back-projection from such a kernel space to input space. The proposed method is based on traditional autoencoders and is trained through a new unsupervised loss function. During training, we optimize both the reconstruction accuracy of input samples and the alignment between a kernel matrix given as prior and the inner products of the hidden representations computed by the autoencoder. Kernel alignment provides control over the hidden representation learned by the autoencoder. Experiments have been performed to evaluate both reconstruction and kernel alignment performance. Additionally, we applied our method to emulate kPCA on a denoising task obtaining promising results.