A.1 Domain Specific Language (DSL) Specifications

Table 5 shows the domain-specific language (DSL) designed for E-MAPP in the Overcooked-v2 environment.

Each convolutional layer has a kernel size of 3, except for the first one, which has a kernel size of 5. The inventory state s_inv is encoded by a three-layer MLP with a hidden size of 128 for all layers. The output goal feature f_goal is a 640-dim feature vector.

Name                 Value
learning rate        3e-4
update batch size    128

In cooperative settings, the goal input of the assistive agent is the leading agent's goal.
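The inventory encoder described above can be illustrated with a minimal, dependency-free sketch. Only the layer count (three) and the hidden size (128) come from the text. The input length INV_DIM, the ReLU activations, the weight initialization, and the assumption that the final layer also has width 128 are all hypothetical choices made for illustration.

```python
import math
import random

HIDDEN_SIZE = 128  # from the text: hidden size 128 for all layers
INV_DIM = 16       # hypothetical inventory-vector length, not from the source


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small uniform random weights."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x, layer):
    """Apply one affine layer to the vector x."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def encode_inventory(s_inv, layers):
    """Three linear layers with ReLU in between (activation is an assumption)."""
    h = s_inv
    for i, layer in enumerate(layers):
        h = linear(h, layer)
        if i < len(layers) - 1:
            h = relu(h)
    return h


rng = random.Random(0)
sizes = [INV_DIM, HIDDEN_SIZE, HIDDEN_SIZE, HIDDEN_SIZE]
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

# A one-hot inventory vector, used only as an example input.
feature = encode_inventory([1.0] + [0.0] * (INV_DIM - 1), layers)
```

Under these assumptions, the encoder maps an inventory vector of any fixed length to a 128-dim feature. How that feature is combined with the convolutional features to form the 640-dim f_goal is not specified here and is left out of the sketch.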
A Appendix
Algorithm 1 shows the execution rules of parallel programs. Terminate the program if no subsequent subroutine exists. Compute the cost of each possible allocation based on the auxiliary functions.

The common hyperparameters are listed below.

Name                         Value
learning rate                3e-4
training steps               10M
update batch size            256
number of rollout threads    8
rollout buffer size          4096 × 8
weight of value loss         0.1
weight of policy loss        1
weight of entropy loss       0.01
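As a rough illustration of how the three loss weights listed above might enter a training objective, the following sketch combines the terms as a plain weighted sum. The weights come from the hyperparameter list. The class and function names, the sign convention for the entropy term, and the example loss values are assumptions rather than details from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    value: float = 0.1     # weight of value loss
    policy: float = 1.0    # weight of policy loss
    entropy: float = 0.01  # weight of entropy loss


def total_loss(policy_loss, value_loss, entropy_loss, w=LossWeights()):
    """Weighted sum of the three loss terms.

    Assumption: entropy_loss is already signed as a loss (for example,
    the negative policy entropy), so all three terms are simply added.
    """
    return (w.policy * policy_loss
            + w.value * value_loss
            + w.entropy * entropy_loss)


# Hypothetical per-batch values, chosen only to show the arithmetic.
loss = total_loss(policy_loss=0.5, value_loss=2.0, entropy_loss=-1.0)
```

With these example inputs the value term contributes 0.2 and the entropy term contributes -0.01, which shows how small both are relative to the policy term at the listed weights.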