Monte Carlo Policy Gradient Method for Binary Optimization