On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework