Sample-Efficient Policy Learning based on Completely Behavior Cloning