Discrete GCBF Proximal Policy Optimization for Multi-agent Safe Optimal Control