A Supplementary Material

Neural Information Processing Systems 

Table 6: Mercury-eval encompasses 256 tasks, the difficulty of which has been balanced for model evaluation.

The sandbox imposes two primary constraints on executed code: a time limit and a memory limit. The memory limit caps the amount of RAM that a process can consume. The sandbox also employs an isolated file system to provide a safe execution environment for the code. This prevents the code from calling certain functions directly from the host's system libraries, which could result in unpredictable behavior. Figure 5 shows an overview of the code execution pipeline.
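The time and memory limits described above can be illustrated with a minimal sketch. It is not the sandbox implementation used for Mercury; it only assumes a POSIX host and uses Python's standard `resource` and `subprocess` modules. The function name `run_limited`, its parameters, and the default limit values are hypothetical, and file-system isolation is not covered here.

```python
import resource
import subprocess
import sys


def run_limited(code, time_limit_s=5.0, memory_limit_bytes=512 * 1024 * 1024):
    """Run `code` in a child Python process under a time and a memory limit.

    Hypothetical helper for illustration only (POSIX hosts).
    """

    def set_memory_limit():
        # Runs in the child just before exec: lower the soft address-space limit.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            limit = memory_limit_bytes
        else:
            limit = min(memory_limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (limit, hard))

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated interpreter mode
            capture_output=True,
            text=True,
            timeout=time_limit_s,  # wall-clock limit; the child is killed on expiry
            preexec_fn=set_memory_limit,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "stdout": "", "stderr": ""}

    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}


if __name__ == "__main__":
    print(run_limited("print(1 + 1)")["status"])
    print(run_limited("while True: pass", time_limit_s=0.5)["status"])
```

In this sketch, a submission that exceeds the time limit is reported as `timeout`, while one that exceeds the memory limit typically fails inside the child process and is reported as `error`.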
