Fuzzy Bootstrap Matching - DataScienceCentral.com
This paper discusses techniques for merging data files where no key field exists between the files. It illustrates an approach to two issues common to most fuzzy matching techniques: 1) how to weight proxy identifier fields, and 2) how to measure the Type I and Type II errors of the merge estimation algorithm. A common requirement in analytics is to merge records from two or more large data sets (thousands, if not millions, of records) where no exact key exists to match records between them. In that case, a common solution is "fuzzy" matching, which uses proxy keys as substitute keys to match records between the data files.
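To make the idea of weighted proxy keys concrete, here is a minimal sketch in Python. It is not the paper's algorithm. The field names, the weights, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as the string similarity measure are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical proxy fields and weights (illustrative values, not from the paper).
WEIGHTS = {"name": 0.5, "city": 0.2, "birth_year": 0.3}


def field_similarity(a, b):
    """Similarity in [0, 1] between two values, compared as lowercase strings."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Weighted average of per-field similarities across the proxy fields."""
    total = sum(weights.values())
    return sum(
        w * field_similarity(rec_a[f], rec_b[f]) for f, w in weights.items()
    ) / total


def best_match(record, candidates, threshold=0.8):
    """Return (index, score) of the best candidate.

    The index is None when there are no candidates or when the best score
    falls below the (assumed) acceptance threshold.
    """
    if not candidates:
        return None, 0.0
    scores = [match_score(record, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


# Two tiny made-up files with no shared key field.
file_a = [{"name": "Jonathan Smith", "city": "Boston", "birth_year": 1980}]
file_b = [
    {"name": "Jonathon Smith", "city": "Boston", "birth_year": 1980},
    {"name": "Maria Garcia", "city": "Austin", "birth_year": 1975},
]

idx, score = best_match(file_a[0], file_b)
print(idx, round(score, 3))
```

In a sketch like this, the threshold controls the trade-off between the two error types: raising it tends to reduce false matches at the cost of more missed matches, and lowering it does the reverse.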
Feb-26-2022, 06:56:25 GMT
- Country:
- Africa > Cameroon
- Gulf of Guinea (0.05)
- North America > United States (0.30)
- Genre:
- Research Report (0.51)
- Industry:
- Health & Medicine (0.30)
- Technology: