Decomposing Behavioral Phase Transitions in LLMs: Order Parameters for Emergent Misalignment
