Spark Transformer: Reactivating Sparsity in Transformer FFN and Attention
