Off-Policy Evaluation with Policy-Dependent Optimization Response