When Does Supervised Training Pay Off? The Hidden Economics of Object Detection in the Era of Vision-Language Models