Large Language Model
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
In this paper, we present DSA, the first automated framework for discovering sparsity allocation schemes for layer-wise pruning in Large Language Models (LLMs). LLMs have become increasingly powerful, but their large parameter counts make them computationally expensive. Existing pruning methods for compressing LLMs primarily focus on evaluating redundancy and removing individual weights element-wise. However, these methods fail to allocate adaptive layer-wise sparsities, leading to performance degradation on challenging tasks.
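To make the idea of a layer-wise sparsity allocation concrete, here is a minimal sketch in plain Python. It is not the DSA method: the abstract does not say how DSA discovers its allocations, so the per-layer ratios below are hand-picked and hypothetical. Magnitude pruning stands in for the element-wise criterion, and the toy weights and the helper names `magnitude_prune` and `prune_layers` are assumptions made for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of a flat weight list."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Indices sorted by ascending magnitude; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def prune_layers(layers, sparsities):
    """Apply one sparsity ratio per layer (the 'allocation')."""
    if len(layers) != len(sparsities):
        raise ValueError("need one sparsity ratio per layer")
    return [magnitude_prune(w, s) for w, s in zip(layers, sparsities)]


# Toy model: two layers of four weights each (made-up values).
layers = [[0.9, -0.1, 0.4, 0.05], [0.2, -0.3, 0.25, 0.1]]

# Uniform allocation: every layer is pruned to 50% sparsity.
uniform = prune_layers(layers, [0.5, 0.5])

# Hypothetical non-uniform allocation with the same average sparsity.
adaptive = prune_layers(layers, [0.25, 0.75])
```

Both calls remove the same total number of weights, but the non-uniform allocation keeps more of the first layer and less of the second. Choosing those per-layer ratios automatically is the problem the abstract says DSA addresses.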
BAKU: An Efficient Transformer for Multi-Task Policy Learning
In this work, we present BAKU, a simple transformer architecture that enables efficient learning of multi-task robot policies. BAKU builds upon recent advancements in offline imitation learning and meticulously combines observation trunks, action chunking, multi-sensory observations, and action heads to substantially improve upon prior work.