Neuronal Activation States as Sample Embeddings for Data Selection in Task-Specific Instruction Tuning
Da Ma, Gonghu Shang, Zhi Chen, Libo Qin, Yijie Luo, Lei Pan, Shuai Fan, Lu Chen, Kai Yu
arXiv.org Artificial Intelligence
Task-specific instruction tuning enhances the performance of large language models (LLMs) on specialized tasks, yet efficiently selecting relevant data for this purpose remains a challenge. Inspired by neural coactivation in the human brain, we propose a novel data selection method called NAS, which leverages neuronal activation states as embeddings for samples in the feature space. Extensive experiments show that NAS outperforms classical data selection methods in terms of both effectiveness and robustness across different models, datasets, and selection ratios.
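The abstract does not spell out how activation states are computed or compared, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes (hypothetically) that a neuron's state is 1 when its activation is positive and 0 otherwise, that candidates are scored by their highest cosine similarity to any target-task example, and that the top fraction is kept. The function names `activation_states` and `select_by_activation` are illustrative, and the activation matrices stand in for values that would be read from a model's hidden layers.

```python
import numpy as np


def activation_states(hidden: np.ndarray) -> np.ndarray:
    """Binarize activations: 1.0 where a neuron is active (> 0), else 0.0.

    The zero threshold is an assumption made for this sketch.
    """
    return (hidden > 0).astype(np.float32)


def _unit_rows(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length, guarding against all-zero rows."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def select_by_activation(
    candidates: np.ndarray, targets: np.ndarray, ratio: float
) -> np.ndarray:
    """Return indices of the candidates whose activation states are closest
    to those of the target-task examples.

    candidates: (n_candidates, n_neurons) raw activations
    targets:    (n_targets, n_neurons) raw activations
    ratio:      fraction of candidates to keep (at least one is kept)
    """
    cand = _unit_rows(activation_states(candidates))
    targ = _unit_rows(activation_states(targets))
    # Cosine similarity between every candidate and every target example.
    sims = cand @ targ.T
    # Score each candidate by its best-matching target example.
    scores = sims.max(axis=1)
    k = max(1, int(round(ratio * len(candidates))))
    # Stable sort keeps the original order among equally scored candidates.
    order = np.argsort(-scores, kind="stable")
    return order[:k]
```

With real models, the rows of `candidates` and `targets` would be collected by running each sample through the LLM and recording the chosen neurons; which layers and neurons to record is left open here because the abstract does not say.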
Mar-19-2025
- Country:
  - Asia (0.93)
  - North America > United States
    - Minnesota > Hennepin County > Minneapolis (0.14)
- Genre:
  - Research Report (0.64)
  - Workflow (0.71)
- Technology: