address the concerns as follows

Neural Information Processing Systems 

We thank the reviewers for their constructive feedback. We also used normalized gradient and ɛ-ball projection and we'll The variable p is a "dual" variable. The algorithm uses an iterative scheme to find it. Hamiltonian is the same as the slack variable p. It can also be understood as a Fréchet Dual of the original problem.