Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs