Su
In this paper, we propose to use machine learning to automate Medicare fraud detection. By cross-checking the Medicare payment database against the provider exclusion database, we build datasets with millions of service providers, of which only a handful are convicted fraudulent providers. An essential challenge is that the resulting dataset is extremely imbalanced, which makes it difficult to learn accurate classifiers for fraud detection. To tackle this challenge, we first use feature engineering to design effective features that take into account the difference between each service provider and its group cohort. At the instance level, we also use a synthetic instance generation approach to create additional positive samples and alleviate the data imbalance.
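The two ideas named above can be illustrated with a minimal sketch. It is not the paper's implementation: the field names (`specialty`, `avg_payment`), the toy data, and both helper functions are hypothetical, and the abstract does not say which synthetic generation method is used. Here a simplified SMOTE-style interpolation between two randomly chosen real positives stands in for it.

```python
import random
from collections import defaultdict
from statistics import mean


def cohort_difference(providers, group_key, feature):
    """Return each provider's feature value minus the mean of its cohort."""
    by_group = defaultdict(list)
    for p in providers:
        by_group[p[group_key]].append(p[feature])
    cohort_mean = {g: mean(vals) for g, vals in by_group.items()}
    return [p[feature] - cohort_mean[p[group_key]] for p in providers]


def synthesize_positives(positives, n_new, rng):
    """Create n_new synthetic points, each on the segment between two real positives.

    Simplification (assumption): pairs are drawn at random rather than
    from each point's nearest neighbours, as full SMOTE would do.
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(positives, 2)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([x + t * (y - x) for x, y in zip(a, b)])
    return synthetic


# Hypothetical toy data: providers grouped by specialty.
providers = [
    {"specialty": "cardiology", "avg_payment": 100.0},
    {"specialty": "cardiology", "avg_payment": 300.0},
    {"specialty": "dermatology", "avg_payment": 50.0},
]
diffs = cohort_difference(providers, "specialty", "avg_payment")

# Hypothetical feature vectors of known fraudulent providers.
fraud_points = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
new_points = synthesize_positives(fraud_points, 5, random.Random(0))
```

In this sketch the cohort feature measures how far a provider deviates from peers in the same group, and the synthetic points enlarge the positive class without duplicating existing records exactly.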
Feb-8-2022, 11:19:13 GMT