Layerwise Importance Analysis of Feed-Forward Networks in Transformer-based Language Models
