A Reinforcement Learning Algorithm in Partially Observable Environments Using Short-Term Memory