Neuronal Activation States as Sample Embeddings for Data Selection in Task-Specific Instruction Tuning

Ma, Da; Shang, Gonghu; Chen, Zhi; Qin, Libo; Luo, Yijie; Pan, Lei; Fan, Shuai; Chen, Lu; Yu, Kai

Mar-19-2025, arXiv.org Artificial Intelligence

Task-specific instruction tuning enhances the performance of large language models (LLMs) on specialized tasks, yet efficiently selecting relevant data for this purpose remains a challenge. Inspired by neural coactivation in the human brain, we propose a novel data selection method called NAS, which leverages neuronal activation states as embeddings for samples in the feature space. Extensive experiments show that NAS outperforms classical data selection methods in terms of both effectiveness and robustness across different models, datasets, and selection ratios.
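
The abstract does not say how activation states are extracted or compared, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a toy single-layer ReLU network (the hypothetical `weights`), treats a sample's embedding as the binary on/off pattern of its hidden neurons, and scores each candidate by its best Hamming-style agreement with any target-task sample. The function names, the similarity measure, and the tie-breaking rule are all illustrative choices.

```python
def activation_state(sample, weights):
    """Binary embedding: 1 where a hidden neuron's pre-activation is > 0.

    `weights` is a hypothetical list of per-neuron weight vectors standing in
    for one layer of a real model; the paper's extraction procedure is not
    described in the abstract.
    """
    state = []
    for neuron in weights:
        pre_activation = sum(w * x for w, x in zip(neuron, sample))
        state.append(1 if pre_activation > 0 else 0)
    return state


def state_similarity(a, b):
    """Fraction of neurons whose on/off state agrees (assumed metric)."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def select_by_activation(candidates, targets, weights, ratio):
    """Return indices of the top `ratio` fraction of candidates.

    Each candidate is scored by its highest similarity to any target-task
    sample; ties are broken by original index so the result is deterministic.
    """
    target_states = [activation_state(t, weights) for t in targets]
    scored = []
    for index, candidate in enumerate(candidates):
        state = activation_state(candidate, weights)
        score = max(state_similarity(state, t) for t in target_states)
        scored.append((score, index))
    keep = max(1, int(len(candidates) * ratio))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:keep]]


# Toy usage: four hidden neurons over two-dimensional inputs.
WEIGHTS = [[1, 0], [0, 1], [-1, 0], [0, -1]]
TARGETS = [[1, 1]]
CANDIDATES = [[2, 3], [-1, -1], [1, -1], [0.5, 0.5]]
print(select_by_activation(CANDIDATES, TARGETS, WEIGHTS, ratio=0.5))  # -> [0, 3]
```

In this toy run, candidates 0 and 3 fire the same neurons as the target sample and are kept at a selection ratio of 0.5. In a real LLM the states would come from the model's own hidden layers rather than a hand-written weight matrix.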

large language model, machine learning, natural language

Country:
- Oceania > New Zealand
  - South Island > Canterbury Region > Christchurch (0.04)
- North America > United States
  - Minnesota > Hennepin County > Minneapolis (0.14)
- Asia
  - Thailand > Bangkok
    - Bangkok (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - China > Shanghai
    - Shanghai (0.04)

Genre:
- Workflow (0.71)
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)