Stochastic Online Greedy Learning with Semi-bandit Feedbacks

Neural Information Processing Systems

In this paper, we address the online learning problem when the input to the greedy algorithm is stochastic with unknown parameters that have to be learned over time.

Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models

Neural Information Processing Systems

In this paper, we present DSA, the first automated framework for discovering sparsity allocation schemes for layer-wise pruning in Large Language Models (LLMs). LLMs have become increasingly powerful, but their large parameter counts make them computationally expensive. Existing pruning methods for compressing LLMs primarily focus on evaluating redundancies and removing element-wise weights. However, these methods fail to allocate adaptive layer-wise sparsities, leading to performance degradation on challenging tasks.