Minitron-SSM: Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
