Noise Contrastive Alignment of Language Models with Explicit Rewards