DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
