Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations