School of Reward Hacks: Hacking harmless tasks generalizes to misaligned behavior in LLMs