Collaborating Authors


We're Still Awash In Paper: Insights Into How Industry Digitization Is Enabling AI


The idea of the paperless office has been around for decades, yet few companies have achieved anything close to it. With all the technological advances we've had, why is it still so hard for companies to move away from their people- and paper-based processes? It may come as little surprise that organizations are still awash in paper. However, companies looking to gain insights and extract value from their data need to get that data into a state where computers can process it. To do that, companies need to digitize and digitalize their information and processes.

Reducing bias in AI-based financial services


Artificial intelligence (AI) presents an opportunity to transform how we allocate credit and risk, and to create fairer, more inclusive systems. AI's ability to avoid the traditional credit reporting and scoring system that helps perpetuate existing bias makes it a rare, if not unique, opportunity to alter the status quo. However, AI can easily go in the other direction to exacerbate existing bias, creating cycles that reinforce biased credit allocation while making discrimination in lending even harder to find. Will we unlock the positive, worsen the negative, or maintain the status quo by embracing new technology? This paper proposes a framework to evaluate the impact of AI in consumer lending. The goal is to incorporate new data and harness AI to expand credit to consumers who need it on better terms than are currently provided. It builds on our existing system's dual goals of pricing financial services based on the true risk the individual consumer poses while aiming to prevent discrimination (e.g., race, gender, DNA, marital status, etc.).

Stikkum Announces Enhanced Version of Its Mortgage Retention Alert & Automation Platform


Platform Updates Address Mortgage Lending Industry Challenges of Customer Loyalty and Engagement, While Demonstrating the Company's Commitment to Innovation Within the Mortgage Industry. Stikkum, a leading technology innovator in the mortgage client retention space, announced the launch of its latest version of its mortgage retention alert and automation platform. The platform enhancements strengthen the way mortgage brokers and bank loan officers can reconnect, contact, and engage existing mortgage client relationships. Based on extensive market research and customer feedback, the company has expanded its platform to accelerate provider growth by addressing key challenges plaguing the industry. "Since Stikkum is designed specifically for the mortgage industry, we prioritize staying on top of off-market trends and incorporating customer insights to make dynamic solutions that help our customers achieve success," said Stikkum Managing Partner Jeff Londres.

Ticker: Bejeweled, Centipede among hall of famers; Mortgages at all-time low; 1.5M new unemployment applications

Boston Herald

The World Video Game Hall of Fame inducted Bejeweled, Centipede, King's Quest and Minecraft in a virtual ceremony Thursday that recognized their influence on the industry and the gamers who have spent tens of billions of hours playing them. The hall of fame's Class of 2020 was chosen from a field of 12 finalists that also included Frogger, Goldeneye 007, Guitar Hero, NBA Jam, Nokia Snake, Super Smash Bros. Melee, Uncharted 2, and Where in the World is Carmen Sandiego? Long-term U.S. mortgage rates fell this week as the benchmark 30-year home loan reached a new all-time low. Mortgage buyer Freddie Mac reported Thursday that the average rate on the key 30-year loan declined to 3.13% from 3.21% last week. It was the lowest level since Freddie began tracking average rates in 1971.

Custom DU: A Web-Based Business User-Driven Automated Underwriting System

AI Magazine

Custom DU is an automated underwriting system that enables mortgage lenders to build their own business rules that facilitate assessing borrower eligibility for different mortgage products. Developed by Fannie Mae, Custom DU has been used since 2004 by several lenders to automate the underwriting of numerous mortgage products. Custom DU uses rule specification language techniques and a web-based, user-friendly interface for implementing business rules that represent business policy. By means of the user interface, lenders can also customize their underwriting findings reports, test the rules that they have defined, and publish changes to business rules on a real-time basis, all without any software modifications. The user interface enforces structure and consistency, enabling business users to focus on their underwriting guidelines when converting their business policy to rules.

Announcing the First ODSC Europe 2020 Virtual Conference Speakers


ODSC's first virtual conference is a wrap, and now we've started planning for our next one, the ODSC Europe 2020 Virtual Conference from September 17th to the 19th. We're thrilled to announce the first group of expert speakers to join. During the event, speakers will cover topics such as NLP, machine learning, quant finance, deep learning, data visualization, data science for good, image classification, transfer learning, recommendation systems, and much, much more. Dr. Jiahong Zhong is the Head of Data Science at Zopa LTD, which facilitates peer-to-peer lending and is one of the United Kingdom's earliest fintech companies. Before joining Zopa, Zhong worked as a researcher on the Large Hadron Collider Project at CERN, focusing on statistics, distributed computing, and data analysis.

XtracTree for Regulator Validation of Bagging Methods Used in Retail Banking

Artificial Intelligence

Bootstrap aggregation, known as bagging, is one of the most popular ensemble methods used in machine learning (ML). An ensemble method is a supervised ML method that combines multiple hypotheses to form a single hypothesis used for prediction. A bagging algorithm combines multiple classifiers modelled on different sub-samples of the same data set to build one large classifier. Large retail banks now use ML algorithms, including decision trees and random forests, to optimize retail banking activities. However, AI researchers at banks face a strong challenge from their own model validation department as well as from national financial regulators. Each proposed ML model has to be validated, and clear rules for every algorithm-based decision have to be established. In this context, we propose XtracTree, an algorithm that is capable of effectively converting an ML bagging classifier, such as a decision tree or a random forest, into simple "if-then" rules satisfying the requirements of model validation. Our algorithm is also capable of highlighting the decision path for each individual sample or a group of samples, addressing any concern from the regulators regarding the ML "black box". We use a public loan data set from Kaggle to illustrate the usefulness of our approach. Our experiments indicate that, using XtracTree, we are able to ensure a better understanding of our model, leading to easier model validation by national financial regulators and the internal model validation department.
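To make the "if-then" idea concrete, here is a minimal sketch of turning a single decision tree into one rule per root-to-leaf path, and of reading off the decision path for one sample. This is not the XtracTree algorithm itself, which the abstract does not specify; the tree, the feature names (`income`, `debt_ratio`), the thresholds, and the helper functions `extract_rules` and `decision_path` are all hypothetical and chosen only for illustration.

```python
# Hypothetical hand-built decision tree. Feature names and thresholds are
# invented for illustration; they do not come from the paper or any data set.
TREE = {
    "feature": "income",
    "threshold": 40000,
    "left": {"label": "reject"},           # branch taken when income <= 40000
    "right": {
        "feature": "debt_ratio",
        "threshold": 0.35,
        "left": {"label": "approve"},      # debt_ratio <= 0.35
        "right": {"label": "reject"},      # debt_ratio > 0.35
    },
}


def extract_rules(node, conditions=()):
    """Return one 'if ... then ...' string per root-to-leaf path."""
    if "label" in node:
        premise = " and ".join(conditions) if conditions else "true"
        return [f"if {premise} then {node['label']}"]
    feature, threshold = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + (f"{feature} <= {threshold}",))
    right = extract_rules(node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


def decision_path(node, sample):
    """Return the conditions one sample satisfies, plus the leaf label."""
    path = []
    while "label" not in node:
        feature, threshold = node["feature"], node["threshold"]
        if sample[feature] <= threshold:
            path.append(f"{feature} <= {threshold}")
            node = node["left"]
        else:
            path.append(f"{feature} > {threshold}")
            node = node["right"]
    return path, node["label"]


RULES = extract_rules(TREE)
for rule in RULES:
    print(rule)
```

For a bagged ensemble such as a random forest, the same traversal could be applied to each tree in turn; how the resulting rule sets are then combined is a design choice that this sketch leaves open.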

Abstracting Fairness: Oracles, Metrics, and Interpretability

Machine Learning

It is well understood that classification algorithms, for example, for deciding on loan applications, cannot be evaluated for fairness without taking context into account. We examine what can be learned from a fairness oracle equipped with an underlying understanding of "true" fairness. The oracle takes as input a (context, classifier) pair satisfying an arbitrary fairness definition, and accepts or rejects the pair according to whether the classifier satisfies the underlying fairness truth. Our principal conceptual result is an extraction procedure that learns the underlying truth; moreover, the procedure can learn an approximation to this truth given access to a weak form of the oracle. Since every "truly fair" classifier induces a coarse metric, in which those receiving the same decision are at distance zero from one another and those receiving different decisions are at distance one, this extraction process provides the basis for ensuring a rough form of metric fairness, also known as individual fairness. Our principal technical result is a higher fidelity extractor under a mild technical constraint on the weak oracle's conception of fairness. Our framework permits the scenario in which many classifiers, with differing outcomes, may all be considered fair. Our results have implications for interpretability -- a highly desired but poorly defined property of classification systems that endeavors to permit a human arbiter to reject classifiers deemed to be "unfair" or illegitimately derived.
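The coarse metric described above can be written down directly. The following is a minimal sketch that assumes a classifier is any function returning a decision; `coarse_metric`, `is_pseudometric`, `toy_classifier`, and the score threshold of 600 are hypothetical names and values, not taken from the paper.

```python
def coarse_metric(classifier):
    """Build d(x, y): 0 if the classifier decides x and y alike, else 1."""
    def distance(x, y):
        return 0 if classifier(x) == classifier(y) else 1
    return distance


def is_pseudometric(distance, points):
    """Check identity, symmetry, and the triangle inequality on a finite set."""
    for x in points:
        if distance(x, x) != 0:
            return False
        for y in points:
            if distance(x, y) != distance(y, x):
                return False
            for z in points:
                if distance(x, z) > distance(x, y) + distance(y, z):
                    return False
    return True


def toy_classifier(applicant):
    # Hypothetical decision rule: approve when the score is at least 600.
    return applicant["score"] >= 600


APPLICANTS = [{"score": 580}, {"score": 610}, {"score": 720}]
d = coarse_metric(toy_classifier)

print(d(APPLICANTS[1], APPLICANTS[2]))  # same decision, so distance 0
print(d(APPLICANTS[0], APPLICANTS[1]))  # different decisions, so distance 1
```

Because two distinct individuals can sit at distance zero, this is strictly a pseudometric, which is why the check above tests `d(x, x) == 0` rather than requiring zero distance only for identical points.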

Predicting Performance of Asynchronous Differentially-Private Learning

Machine Learning

We consider training machine learning models using training data located on multiple private and geographically scattered servers with different privacy settings. Due to the distributed nature of the data, communicating with all collaborating private data owners simultaneously may prove challenging or altogether impossible. In this paper, we develop differentially-private asynchronous algorithms for collaboratively training machine-learning models on multiple private datasets. The asynchronous nature of the algorithms implies that a central learner interacts with the private data owners one-on-one whenever they are available for communication, without needing to aggregate query responses to construct gradients of the entire fitness function. Therefore, the algorithm efficiently scales to many data owners. We define the cost of privacy as the difference between the fitness of a privacy-preserving machine-learning model and the fitness of a machine-learning model trained in the absence of privacy concerns. We prove that we can forecast the performance of the proposed privacy-preserving asynchronous algorithms. We demonstrate that the cost of privacy has an upper bound that is inversely proportional to the combined size of the training datasets squared and the sum of the privacy budgets squared. We validate the theoretical results with experiments on financial and medical datasets. The experiments illustrate that collaboration among more than 10 data owners with at least 10,000 records with privacy budgets greater than or equal to 1 results in a superior machine-learning model in comparison to a model trained in isolation on only one of the datasets, illustrating the value of collaboration and the cost of privacy. The number of collaborating datasets can be lowered if the privacy budget is higher.

Fair Bandit Learning with Delayed Impact of Actions

Machine Learning

Algorithmic fairness has been studied mostly in a static setting, where the implicit assumption is that the frequencies of historically made decisions do not impact the problem structure in the future. However, the capability of people in a certain group to pay back a loan might depend, for example, on how frequently that group's loan applications have been approved historically. If banks keep rejecting loan applications from people in a disadvantaged group, it could create a feedback loop and further damage the chance of getting loans for people in that group. This challenge has been noted in several recent works but is under-explored in a more generic sequential learning setting. In this paper, we formulate this delayed and long-term impact of actions within the context of multi-armed bandits (MAB). We generalize the classical bandit setting to encode the dependency of this action "bias" due to the history of the learning. Our goal is to learn to maximize the collected utilities over time while satisfying fairness constraints imposed over arms' utilities, which again depend on the decision they have received. We propose an algorithm that achieves a regret of $\tilde{\mathcal{O}}(KT^{2/3})$ and show a matching regret lower bound of $\Omega(KT^{2/3})$, where $K$ is the number of arms and $T$ denotes the learning horizon. Our results complement the bandit literature by adding techniques to deal with actions with long-term impacts and have implications in designing fair algorithms.
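The feedback loop described above can be illustrated with a small deterministic simulation. This is a toy sketch, not the algorithm proposed in the paper: it assumes a hypothetical impact model in which an arm's expected reward grows linearly with that arm's share of past pulls, and it compares a greedy policy with a simple round-robin policy. The function names, the base rewards, and the sensitivities are all invented for illustration.

```python
def expected_reward(base, sensitivity, pulls, total_pulls):
    """Hypothetical impact model: reward rises with the arm's share of past pulls."""
    share = pulls / total_pulls if total_pulls else 0.0
    return base + sensitivity * share


def simulate(policy, bases, sensitivities, horizon):
    """Run a policy on expected rewards only (no noise); return pulls and final means."""
    pulls = [0] * len(bases)
    for t in range(horizon):
        means = [expected_reward(b, s, n, t)
                 for b, s, n in zip(bases, sensitivities, pulls)]
        pulls[policy(t, means)] += 1
    final_means = [expected_reward(b, s, n, horizon)
                   for b, s, n in zip(bases, sensitivities, pulls)]
    return pulls, final_means


def greedy(t, means):
    # Always pick the arm that currently looks best (first arm wins ties).
    return max(range(len(means)), key=means.__getitem__)


def round_robin(t, means):
    # Ignore the current means and give every arm an equal share of pulls.
    return t % len(means)


# Arm 1 starts lower but responds more strongly to being selected (assumed values).
BASES = [0.5, 0.4]
SENSITIVITIES = [0.1, 0.4]

greedy_pulls, greedy_means = simulate(greedy, BASES, SENSITIVITIES, 100)
fair_pulls, fair_means = simulate(round_robin, BASES, SENSITIVITIES, 100)
print(greedy_pulls, greedy_means)
print(fair_pulls, fair_means)
```

Under these assumed numbers the greedy policy never selects arm 1, so its expected reward stays at its starting value, while equal treatment raises it above arm 0. The sketch only demonstrates why history-dependent rewards break the static assumption; it says nothing about the regret bounds stated in the abstract.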