It is hard to overestimate the role of e-commerce in a world where most communication happens on the web and our virtual environment is full of advertisements for attractive products and services. Meanwhile, many criminals are trying to take advantage of it, using scams and malware to compromise users' data. According to the statistics, the level of e-commerce fraud is high. With e-commerce sales estimated to reach $630 billion (or more) in 2020, an estimated $16 billion will be lost to fraud. Amazon accounts for almost a third of all e-commerce deals in the United States, and its sales increase by about 15% to 20% each year. From 2018 to 2019, e-commerce spending increased by 57%, marking the third time in U.S. history that the money spent shopping online exceeded the amount spent in brick-and-mortar stores. Crowe UK and the Centre for Counter Fraud Studies (CCFS) created Europe's most complete database of information on fraud, with data from more than 1,300 enterprises in almost every economic field.
Credit card fraud is an ongoing problem for almost all industries in the world, and it costs the global economy millions of dollars each year. A number of studies, completed or in progress, therefore aim to detect this kind of fraud. These studies generally use rule-based or novel artificial intelligence approaches to find suitable solutions. The ultimate goal of this paper is to summarize state-of-the-art approaches to fraud detection using artificial intelligence and machine learning techniques. While summarizing, we categorize the common problems that almost all research works encounter, namely imbalanced datasets, real-time working scenarios, and feature engineering challenges, and identify general approaches to solving them. The imbalanced dataset problem arises because legitimate transactions far outnumber fraudulent ones. Feature engineering is essential because the features obtained from industry are limited, so deriving new features and reshaping the dataset is crucial. Adapting the detection system to real-time scenarios is also a challenge, since the number of credit card transactions in a limited time period is very high. In addition, we discuss how evaluation metrics and machine learning methods differ across studies.
INTRODUCTION
The number of cashless transactions is at its highest point since the beginning of the digital era, and it is most likely to increase in the future.
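The link between the imbalanced dataset problem and the choice of evaluation metrics can be made concrete with a small sketch. The data below is hypothetical (1,000 transactions, 5 of them fraudulent), and the helper names are illustrative rather than taken from any of the surveyed works. It shows why plain accuracy is a poor metric when legitimate transactions dominate: a model that never flags fraud still scores very high accuracy while catching nothing.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the fraud class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical, heavily imbalanced data: 5 frauds among 1,000 transactions.
y_true = [1] * 5 + [0] * 995

# A trivial "model" that labels every transaction as legitimate.
always_legit = [0] * 1000
baseline = metrics(y_true, always_legit)

# A model that catches 4 of the 5 frauds at the cost of 10 false alarms.
detector = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 985
detected = metrics(y_true, detector)

print(baseline)
print(detected)
```

The trivial baseline reaches high accuracy with zero recall, whereas the second model has slightly lower accuracy but actually detects fraud. This is one reason studies on imbalanced data tend to report precision, recall, or F1 rather than accuracy alone.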
Fraud is a billion-dollar business that grows rapidly every year, and thousands of people fall victim to it. Fraud always includes a false statement, misrepresentation, or deceitful conduct. Common varieties of fraud offenses include identity theft, insurance fraud, credit/debit card fraud, and mail fraud. The PwC global economic crime survey of 2018 (PwC, 2018) found that about half of the 7,200 surveyed enterprises had already experienced fraud of some kind. This is an increase compared to the PwC survey conducted in 2016 (PwC, 2016), in which slightly more than a third of the organizations surveyed had experienced economic crime.
Machine learning and data mining techniques have been used extensively to detect credit card fraud. However, most studies consider credit card transactions as isolated events and not as a sequence of transactions. In this framework, we model a sequence of credit card transactions from three different perspectives, namely (i) whether the sequence contains a fraud, (ii) whether the sequence is obtained by fixing the card-holder or the payment terminal, and (iii) whether it is a sequence of spent amounts or of elapsed times between the current and previous transactions. Combinations of the three binary perspectives give eight sets of sequences from the (training) set of transactions. Each of these sets of sequences is modelled with a Hidden Markov Model (HMM). Each HMM associates a likelihood with a transaction given its sequence of previous transactions. These likelihoods are used as additional features in a Random Forest classifier for fraud detection. Our multiple-perspective HMM-based approach offers automated feature engineering to model temporal correlations, improving the effectiveness of the classification task, and increases the detection of fraudulent transactions when combined with the state-of-the-art expert-based feature engineering strategy for credit card fraud detection. Extending previous work, we show that this approach goes beyond e-commerce transactions and provides robust feature engineering across different datasets, hyperparameters and classifiers. Moreover, we compare strategies to deal with structural missing values.
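The feature construction described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes observations have already been discretized into symbols (for example amount buckets 0 = low, 1 = medium, 2 = high), and it uses hand-set, hypothetical HMM parameters where a real system would train them on the corresponding set of sequences. The names `PERSPECTIVES`, `perspective_combinations` and `forward_log_likelihood` are illustrative.

```python
import math
from itertools import product

# The three binary perspectives named in the text; the labels are illustrative.
PERSPECTIVES = {
    "history": ("genuine", "fraudulent"),
    "entity": ("card_holder", "terminal"),
    "signal": ("amount", "time_delta"),
}


def perspective_combinations():
    """Enumerate the 2 x 2 x 2 = 8 combinations, one HMM per combination."""
    return list(product(*PERSPECTIVES.values()))


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete sequence under an HMM (scaled forward pass).

    start[s]    : probability of starting in hidden state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    if not obs:
        return 0.0
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states))
                * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


# Hypothetical two-state HMMs over amount buckets, hand-set for illustration.
GENUINE_AMOUNT_HMM = {
    "start": [0.8, 0.2],
    "trans": [[0.9, 0.1], [0.5, 0.5]],
    "emit": [[0.70, 0.25, 0.05], [0.20, 0.50, 0.30]],
}
FRAUD_AMOUNT_HMM = {
    "start": [0.5, 0.5],
    "trans": [[0.5, 0.5], [0.2, 0.8]],
    "emit": [[0.30, 0.30, 0.40], [0.05, 0.15, 0.80]],
}

# Window of a card-holder's recent amount buckets, ending at the current one.
window = [2, 2, 2]
hmm_features = {
    "ll_genuine_card_holder_amount": forward_log_likelihood(window, **GENUINE_AMOUNT_HMM),
    "ll_fraudulent_card_holder_amount": forward_log_likelihood(window, **FRAUD_AMOUNT_HMM),
}
print(perspective_combinations())
print(hmm_features)
```

In a full pipeline, one such log-likelihood per combination (eight in total) would be appended to each transaction's existing feature vector before training the classifier, which is a Random Forest in the text. Only two of the eight HMMs are shown here to keep the sketch short.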
Fraud volumes have recently increased by 7.3% for U.S. e-commerce and retail merchants. It seems only natural that most e-commerce fraud happens at the payment stage: identity theft, phishing, account theft and others. While fraudsters have become more sophisticated, so has fraud prevention. Identity theft, the most common fraud scenario, can have effects of varying scope. Someone who steals your information to commit fraud can use it however they please, from a credit card swindle to tax reports and medical services.