Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics