PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement