6f5216f8d89b086c18298e043bfe48ed-AuthorFeedback.pdf
Neural Information Processing Systems
General response: We thank all reviewers for their constructive comments. Below is our response to common questions. Q2. Broader impact (R2 & R3): On the positive side, as is detailed in the Broader Impact section, DynaBERT (i) BERT models; and (iii) is more environmentally friendly due to weight sharing.
Reviewer 1 Q1. "whether this approach can be adapted to work during the pre-training phase": Below we show We compare with separately pre-trained small models in Google BERT repository (https://github.com/ For depth, we adjust the number of layers to be L = 4, 6.
May-29-2025, 16:43:14 GMT