Measuring and Controlling Instruction (In)Stability in Language Model Dialogs