TGEA 2.0 Supplementary Materials: Appendix A

Neural Information Processing Systems

Table 2: The number of erroneous texts generated with different decoding strategies.

Figure 2: The distribution of MiSEW over the number of tokens contained in each MiSEW.

We have fine-tuned several commonly used Chinese PLMs as baselines. All models have 12 attention heads and a hidden size of 768. We train these models on 8 Tesla P100 GPUs, each with 16 GB of memory.
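
As a concrete illustration of this baseline configuration, the minimal sketch below loads a Chinese PLM with the stated 12 attention heads and hidden size of 768 via the Hugging Face transformers library. The checkpoint name bert-base-chinese, the binary classification head (erroneous vs. correct text), and the toy training step are assumptions for illustration; the supplementary text does not specify which PLMs or task heads were used.

    # Sketch of the baseline setup: a Chinese PLM with 12 attention heads
    # and hidden size 768 (the standard base configuration).
    import torch
    from transformers import BertConfig, BertForSequenceClassification, BertTokenizer

    # Hypothetical choice of checkpoint; the paper fine-tunes several Chinese PLMs.
    checkpoint = "bert-base-chinese"

    config = BertConfig.from_pretrained(
        checkpoint,
        num_attention_heads=12,  # as stated in the supplementary materials
        hidden_size=768,         # as stated in the supplementary materials
        num_labels=2,            # assumed binary label: erroneous vs. correct
    )

    tokenizer = BertTokenizer.from_pretrained(checkpoint)
    model = BertForSequenceClassification.from_pretrained(checkpoint, config=config)

    # One illustrative fine-tuning step on a toy example.
    inputs = tokenizer("这是一个例子。", return_tensors="pt")  # "This is an example."
    labels = torch.tensor([1])  # hypothetical "erroneous" label
    outputs = model(**inputs, labels=labels)
    outputs.loss.backward()  # gradients would feed an optimizer step in practice

In an actual run this step would sit inside a training loop with an optimizer (e.g., AdamW) and a DataLoader over the TGEA 2.0 training split, distributed across the 8 GPUs mentioned above.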