 Large Language Model






Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search

Neural Information Processing Systems

Given the description of an environment and a task, we use an LLM guided by the GIF-MCTS method to iteratively generate and refine a candidate CWM. The candidate's correctness is evaluated by checking if it correctly


88dddaf430b5bc38ab8228902bb61821-Supplemental-Conference.pdf

Neural Information Processing Systems

Supplementary figure 1. Ablation study. Each row represents the ablated layer and each column the module that is ablated from that layer; for example, the first panel shows ablation of the attention-key in layer 5. Different layers in the GPT2-XL model were ablated, and the consequence of ablation on curvature was measured for 2000 sentences in the UD corpus. The red bar shows the layer where ablation was applied.

Supplementary figure 3. A. Curvature values for 2000 sampled sentences in the RWKV model (RNN), for both the trained and untrained versions. B. Correlation between model-generated surprisal and curvature in the RWKV model. Diamonds: syntactic surprisal

Supplementary figure 5. Effect of different decoding strategies in GPT2-XL sequence generation and its comparison to ground truth (true), same as figure 4b in the main manuscript.