Autoregressive Policy Optimization for Constrained Allocation Tasks