Online Markov Decision Processes with Terminal Law Constraints