Labels Matter More Than Models: Quantifying the Benefit of Supervised Time Series Anomaly Detection