Why is VLOOKUP (in Excel) 1,000 times slower than hash tables in Python?
The easy answer is that Excel is a resource hog. It's a tremendously powerful tool for mining small datasets, but once you push past the traditional 65,536 rows, you move into realms where Microsoft is traditionally not good (memory handling, I/O management, efficient processing).
First, a few questions:
1. Cardinality: Were the keys in both tables internally unique? In my experience Excel bogs down on Cartesian products, so you need one-to-many or one-to-one matches. I'm assuming you ran a pivot table on both tables, counting the instances of each email address, and compared the row count in the pivot table with the grand total (the two should be the same if every address is unique). I'm also assuming the answer is going to be yes across the board, since you're probably a really good Excel jockey.
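For the Python side of the comparison, the same uniqueness check and the hash-table match can be sketched roughly as below. This is only an illustration, not the asker's actual code: the function names (`duplicate_keys`, `hash_join`), the `email` column name, and the sample rows are all made up for the example.

```python
from collections import Counter


def duplicate_keys(rows, key="email"):
    """Return {key value: count} for values that occur more than once in rows."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


def hash_join(left, right, key="email"):
    """Pair each left row with the right row sharing its key, or None if absent.

    Assumes the keys in `right` are unique; with duplicates, the last row wins.
    """
    # Build the hash table once: one pass over `right`.
    index = {row[key]: row for row in right}
    # Each lookup is then constant time on average, instead of a scan per row.
    return [(row, index.get(row[key])) for row in left]


# Hypothetical sample data, for illustration only.
customers = [
    {"email": "ann@example.com", "name": "Ann"},
    {"email": "bob@example.com", "name": "Bob"},
]
plans = [
    {"email": "bob@example.com", "plan": "pro"},
]

assert not duplicate_keys(customers)  # keys are unique, safe to join
matches = hash_join(customers, plans)
```

The design point is that the dict is built once and then probed once per row, whereas an exact-match lookup that scans the range does work proportional to the table size for every single row.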
Nov-7-2016, 10:10:03 GMT