Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
