Feature Leakage, and identifying it with Exploratory data analysis and Machine Learning
On one of my projects, my team and I were tasked with building a mortgage leads generation model for a client -- a quite standard project in the banking industry. The data shared with us, on the other hand, were not safe or fit for modelling straightaway: the data had been compiled from different sources, "possibly from different time periods" too. This might seem like an exceptional situation. In reality, however, it is all too common to acquire data from the client and take for granted their fitness for modelling. In our situation, the comment from our client meant that we were potentially looking at feature leakage.
Jul-31-2020, 07:12:05 GMT
- Technology: