Continuous allocation tasks are a class of problems where an agent needs to distribute a limited amount of resources over a set of entities at each time step.
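As a minimal sketch of a single allocation step, assume the agent outputs one unconstrained score per entity and spends its whole budget at every step. Neither assumption is stated in the text. The hypothetical helper `allocate` uses a softmax to turn the scores into non-negative shares that sum to the budget.

```python
import math

def allocate(scores, budget):
    """Map unconstrained scores to non-negative shares that sum to `budget`.

    Illustrative helper (not from the source): a softmax over the scores,
    scaled by the budget, assuming the full budget is spent each step.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [budget * w / total for w in weights]

# One time step: split a budget of 10.0 across three entities.
allocation = allocate([0.5, 1.5, -1.0], budget=10.0)
```

Higher scores receive larger shares, and every entity receives a strictly positive amount. A task that allows unspent budget or zero allocations would need a different mapping.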
Reinforcement learning (RL) studies the problem where an agent interacts with an unknown environment to optimize cumulative rewards/losses [Sutton and Barto, 2018].
In this paper, we imitate humans' mental world knowledge model, which provides global prior knowledge before the task and maintains local dynamic knowledge during the task, and introduce a parametric World Knowledge Model (WKM) to facilitate agent