Scaling Policy Compliance Assessment in Language Models with Policy Reasoning Traces