Efficient Exploration with Self-Imitation Learning via Trajectory-Conditioned Policy