Active Learning of Hierarchical Policies from State-Action Trajectories