COPR: Continual Human Preference Learning via Optimal Policy Regularization