Simplex Decomposition for Portfolio Allocation Constraints in Reinforcement Learning