Training Transition Policies via Distribution Matching for Complex Tasks