RL-Obfuscation: Can Language Models Learn to Evade Latent-Space Monitors?