Beyond Accuracy: Automated De-Identification of Large Real-World Clinical Text Datasets