Discovering Salient Neurons in Deep NLP Models
Nadir Durrani, Fahim Dalvi, Hassan Sajjad
arXiv.org Artificial Intelligence
While much work has been done to understand the representations learned within deep NLP models and the knowledge they capture, little attention has been paid to individual neurons. We present a technique called Linguistic Correlation Analysis to extract salient neurons in the model with respect to any extrinsic property, with the goal of understanding how such knowledge is preserved within neurons. We carry out a fine-grained analysis to answer the following questions: (i) can we identify subsets of neurons in the network that capture specific linguistic properties? (ii) how localized or distributed are these neurons across the network? (iii) how redundantly is the information preserved? (iv) how does fine-tuning pre-trained models towards downstream NLP tasks impact the learned linguistic knowledge? (v) how do architectures vary in learning different linguistic properties? Our data-driven, quantitative analysis yields the following findings: (i) small subsets of neurons can predict different linguistic tasks; (ii) neurons capturing basic lexical information (such as suffixation) are localized in the lowermost layers; (iii) those learning complex concepts (such as syntactic role) are found predominantly in middle and higher layers; (iv) salient linguistic neurons are relocated from higher to lower layers during transfer learning, as the network preserves the higher layers for task-specific information; (v) pre-trained models differ in how linguistic information is preserved within them; and (vi) concepts exhibit similar neuron distributions across different languages in multilingual transformer models. Our code is publicly available as part of the NeuroX toolkit.
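The abstract does not spell out how salient neurons are extracted, so the following is only a minimal sketch of one plausible reading: train a linear probe on neuron activations to predict the extrinsic property, then rank neurons by the magnitude of their probe weights. The elastic-net penalty, the hyperparameters, the toy data, and every function name here (`train_probe`, `rank_neurons`, `make_toy_data`) are assumptions for illustration; this is not the NeuroX API and not necessarily the authors' exact procedure.

```python
import numpy as np


def train_probe(X, y, l1=0.01, l2=0.01, lr=0.5, epochs=300):
    """Fit a binary logistic-regression probe with an elastic-net penalty.

    X: (n_tokens, n_neurons) activations; y: (n_tokens,) 0/1 property labels.
    The penalty and optimiser are assumptions, not taken from the abstract.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of the property
        # Gradient of the log loss plus L2 and (sub)gradient of L1 terms.
        grad_w = X.T @ (p - y) / n + l2 * w + l1 * np.sign(w)
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def rank_neurons(w):
    """Return neuron indices ordered from largest to smallest absolute weight."""
    return np.argsort(-np.abs(w))


def make_toy_data(n=2000, d=20, informative=(3, 7), seed=0):
    """Synthetic activations in which only two (hypothetical) neurons carry the property."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    signal = 3.0 * X[:, informative[0]] - 3.0 * X[:, informative[1]]
    y = (signal + 0.1 * rng.normal(size=n) > 0).astype(float)
    return X, y


X, y = make_toy_data()
w, b = train_probe(X, y)
ranking = rank_neurons(w)
top2 = sorted(int(i) for i in ranking[:2])
print("most salient neurons:", top2)
```

On real data, `X` would hold per-token activations extracted from the model under study and `y` the annotated property; a multi-class property would need a softmax probe rather than the binary one sketched here.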
Jan-14-2024
- Country:
- Asia
- China > Hong Kong (0.04)
- Middle East
- Iraq > Al Anbar Governorate (0.04)
- Qatar > Ad-Dawhah
- Doha (0.04)
- Republic of Türkiye > Istanbul Province
- Istanbul (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.14)
- Pakistan > Islamabad Capital Territory
- Islamabad (0.04)
- Singapore (0.04)
- Sri Lanka > North Central Province
- Anuradhapura District > Anuradhapura (0.04)
- Polonnaruwa District > Polonnaruwa (0.04)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Sweden > Vaestra Goetaland
- Gothenburg (0.04)
- Switzerland > Basel-City
- Basel (0.04)
- Spain
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Valencian Community > Valencia Province
- Valencia (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Germany
- Italy > Tuscany
- Florence (0.04)
- Hungary > Budapest
- Budapest (0.04)
- North America
- Canada
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.14)
- Nova Scotia > Halifax Regional Municipality
- Halifax (0.04)
- Ontario > Toronto (0.04)
- United States
- California
- San Diego County > San Diego (0.04)
- San Francisco County > San Francisco (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Maryland > Baltimore (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York (0.04)
- Ohio (0.04)
- Texas > Travis County
- Austin (0.14)
- Washington > King County
- Seattle (0.14)
- Oceania > Australia
- New South Wales > Sydney (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Government (0.67)
- Technology: