Ma, Yide
Deep Pulse-Coupled Neural Networks
Yi, Zexiang, Lian, Jing, Qi, Yunliang, Yu, Zhaofei, Tang, Huajin, Ma, Yide, Liu, Jizhao
Spiking Neural Networks (SNNs) capture the information processing mechanism of the brain by taking advantage of spiking neurons, such as the Leaky Integrate-and-Fire (LIF) model neuron, which incorporates temporal dynamics and transmits information via discrete and asynchronous spikes. However, the simplified biological properties of the LIF model ignore the neuronal coupling and dendritic structure of real neurons, which limits the spatio-temporal dynamics of neurons and thus reduces the expressive power of the resulting SNNs. In this work, we leverage a more biologically plausible neural model with complex dynamics, i.e., a pulse-coupled neural network (PCNN), to improve the expressiveness and recognition performance of SNNs for vision tasks. The PCNN is a type of cortical model capable of emulating the complex neuronal activities in the primary visual cortex. We construct deep pulse-coupled neural networks (DPCNNs) by replacing the commonly used LIF neurons in SNNs with PCNN neurons. The intra-channel coupling in existing PCNN models restricts interactions to neurons within the same channel. To address this limitation, we propose inter-channel coupling, which allows neurons in different feature maps to interact with each other. Experimental results show that inter-channel coupling boosts performance efficiently, requiring fewer neurons, fewer synapses, and less training time than widening the networks. For instance, compared to an LIF-based SNN with a wide VGG9, the DPCNN with VGG9 uses only 50% of the neurons, 53% of the synapses, and 73% of the training time. Furthermore, we propose receptive-field- and time-dependent batch normalization (RFTD-BN) to accelerate the convergence and improve the performance of DPCNNs.
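The abstract does not include code; the following NumPy sketch only illustrates, under our own assumptions, how a discretized PCNN-style neuron update with inter-channel coupling might look. The function name pcnn_step, the 4-neighbour coupling, and all decay and weight constants are illustrative choices, not the authors' implementation.

```python
import numpy as np

def neighbor_sum(y):
    """Sum spikes from the 4-connected spatial neighbours of each neuron.
    y has shape (C, H, W); the sum is taken per channel."""
    return (np.roll(y, 1, axis=1) + np.roll(y, -1, axis=1)
            + np.roll(y, 1, axis=2) + np.roll(y, -1, axis=2))

def pcnn_step(S, F, L, E, Y,
              beta=0.5, aF=0.1, aL=0.3, aE=0.5, VF=0.5, VL=0.5, VE=10.0):
    """One discrete PCNN update step (constants are illustrative).

    S: external stimulus (C, H, W), e.g. the convolutional input current.
    F, L, E, Y: feeding input, linking input, dynamic threshold, spikes.
    Inter-channel coupling: the linking term pools spikes over all channels,
    so neurons in different feature maps interact with each other.
    """
    spatial = neighbor_sum(Y)                       # intra-channel coupling
    cross = spatial.sum(axis=0, keepdims=True)      # inter-channel coupling
    F = np.exp(-aF) * F + VF * spatial + S          # feeding compartment
    L = np.exp(-aL) * L + VL * cross                # linking compartment
    U = F * (1.0 + beta * L)                        # modulated membrane potential
    Y = (U > E).astype(S.dtype)                     # fire where U exceeds the threshold
    E = np.exp(-aE) * E + VE * Y                    # raise the threshold after a spike
    return F, L, E, Y

# Toy usage: 4 channels of an 8x8 feature map driven for 5 time steps.
S = np.random.rand(4, 8, 8)
F, L, E, Y = (np.zeros_like(S) for _ in range(4))
for _ in range(5):
    F, L, E, Y = pcnn_step(S, F, L, E, Y)
print("spikes at last step:", int(Y.sum()))
```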
Unveiling A Core Linguistic Region in Large Language Models
Zhao, Jun, Zhang, Zhihao, Ma, Yide, Zhang, Qi, Gui, Tao, Gao, Luhui, Huang, Xuanjing
Brain localization, which describes the association between specific regions of the brain and their corresponding functions, is widely accepted in the field of cognitive science as an objective fact. Today's large language models (LLMs) possess human-level linguistic competence and can execute complex tasks requiring abstract knowledge and reasoning. To deeply understand the inherent mechanisms of intelligence emergence in LLMs, this paper conducts an analogical study using brain localization as a prototype. We have discovered a core region in LLMs that corresponds to linguistic competence, accounting for approximately 1% of the total model parameters. This core region exhibits significant dimension dependency, and perturbations to even a single parameter on specific dimensions can lead to a loss of linguistic competence. Furthermore, we observe that an improvement in linguistic competence does not necessarily come with an increase in the model's knowledge level, which might imply the existence of regions of domain knowledge that are dissociated from the linguistic region. Overall, exploring the LLMs' functional regions provides insights into the foundation of their intelligence. In the future, we will continue to investigate knowledge regions within LLMs and the interactions between them.
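The abstract describes the single-parameter perturbation finding only at a high level; the sketch below shows one hypothetical way such a perturbation probe could be organized. Everything here, including locate_core_coords, the params dict, and the score threshold, is an illustrative assumption rather than the paper's actual procedure.

```python
import copy
import numpy as np

def locate_core_coords(params, score_fn, candidates, eps=1.0, drop=0.5):
    """Illustrative probe for dimension-dependent core parameters.

    params:     dict mapping parameter names to numpy arrays (a stand-in
                for an LLM's weights).
    score_fn:   callable(params) -> linguistic-competence score, e.g. the
                negative perplexity on a held-out corpus.
    candidates: iterable of (name, flat_index) coordinates to perturb.
    Returns the coordinates whose perturbation drops the score by more
    than `drop`, i.e. candidates for the core linguistic region.
    """
    base = score_fn(params)
    core = []
    for name, idx in candidates:
        perturbed = copy.deepcopy(params)
        perturbed[name].flat[idx] += eps          # nudge a single coordinate
        if score_fn(perturbed) < base - drop:     # competence collapses
            core.append((name, idx))
    return core

# Toy usage with a fake "model": the score is high only while one special weight is intact.
params = {"layer0.weight": np.ones((4, 4))}
score = lambda p: 1.0 if abs(p["layer0.weight"].flat[5] - 1.0) < 1e-6 else 0.0
coords = [("layer0.weight", i) for i in range(16)]
print(locate_core_coords(params, score, coords))  # -> [('layer0.weight', 5)]
```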