Appendix A Latency Driven Slimming Algorithm

Neural Information Processing Systems 

We provide the details of the proposed latency-driven fast slimming in Alg. 1. Formulations of the [...]

Our major conclusions and speed analysis can be found in Sec. 3 and Figure 1. Compared to non-overlap large-kernel patch embedding (V5 in Tab. [...]

MHSA, with its global receptive field, contributes substantially to model performance.

By comparing V1 and V2 in Tab. 3, we can observe that the GN [...]

We explore ReLU and HardSwish (V3 and V4 in Tab. 3) in addition to GeLU [...] We conclude that the activation function can be selected case by case, depending on the specific hardware and compiler. In this work, we use GeLU, which provides better performance than ReLU while executing faster than HardSwish.
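For reference, the three activations compared above can be written out from their standard definitions. The following is a minimal sketch in plain Python, not the implementation used in this work: the helper `time_activation` and the dictionary `ACTIVATIONS` are hypothetical names introduced only for illustration, and timings taken on a host interpreter say nothing about on-device latency, which depends on the target hardware and compiler.

```python
import math
import timeit


def relu(x: float) -> float:
    # ReLU(x) = max(0, x)
    return max(0.0, x)


def hardswish(x: float) -> float:
    # HardSwish(x) = x * ReLU6(x + 3) / 6, a piecewise-linear gate
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def gelu(x: float) -> float:
    # Exact GeLU(x) = x * Phi(x), with Phi the standard normal CDF
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Hypothetical registry, used only to loop over the candidates below.
ACTIVATIONS = {"ReLU": relu, "HardSwish": hardswish, "GeLU": gelu}


def time_activation(fn, inputs, repeats=100):
    # Hypothetical helper: best-of-3 wall time (seconds) for applying fn
    # to every input `repeats` times. Host-side timing only.
    return min(
        timeit.repeat(lambda: [fn(x) for x in inputs], number=repeats, repeat=3)
    )


if __name__ == "__main__":
    xs = [i / 100.0 for i in range(-500, 500)]
    for name, fn in ACTIVATIONS.items():
        elapsed = time_activation(fn, xs)
        print(f"{name}: f(1.0) = {fn(1.0):.4f}, host time = {elapsed:.4f} s")
```

Because the relative cost of these functions varies with the deployment target, a comparison of this kind would need to be repeated on the actual device and compiler before choosing among them.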
