On sampling from data with duplicate records