inhibition mechanism
Neural Inhibition Improves Dynamic Routing and Mixture of Experts
Zou, Will Y., Zhang, Jennifer Y.
To be effective, efficient, and diverse, deep learning models need to dynamically choose its architecture based on signals from a population of neurons. We hypothesize dynamic routing models can be improved with neural inhibition in those neural populations. This means signals commonly shared among the various modes of data statistics can be inhibited so that the routing model can choose a specialized expert path for each data sample. Only through inhibition is the routing mechanism able to effectively select neural pathways. We believe this is an under-studied and under-verified implementation methodology for Mixture-of-Experts, dynamic routing, and transformer language models. We provide experimental evidence that the neural inhibition algorithm significantly boosts the performance of general tasks and motivates more effort to be invested in this research direction.
giMLPs: Gate with Inhibition Mechanism in MLPs
Kang, Cheng, Prokop, Jindich, Tong, Lei, Zhou, Huiyu, Hu, Yong, Novak, Daneil
This paper presents a new model architecture, gate with inhibition MLP (giMLP).The gate with inhibition on CycleMLP (gi-CycleMLP) can produce equal performance on the ImageNet classification task, and it also improves the BERT, Roberta, and DeBERTaV3 models depending on two novel techniques. The first is the gating MLP, where matrix multiplications between the MLP and the trunk Attention input in further adjust models' adaptation. The second is inhibition which inhibits or enhances the branch adjustment, and with the inhibition levels increasing, it offers models more muscular features restriction. We show that the giCycleMLP with a lower inhibition level can be competitive with the original CycleMLP in terms of ImageNet classification accuracy. In addition, we also show through a comprehensive empirical study that these techniques significantly improve the performance of fine-tuning NLU downstream tasks. As for the gate with inhibition MLPs on DeBERTa (giDeBERTa) fine-tuning, we find it can achieve appealing results on most parts of NLU tasks without any extra pretraining again. We also find that with the use of Gate With Inhibition, the activation function should have a short and smooth negative tail, with which the unimportant features or the features that hurt models can be moderately inhibited. The experiments on ImageNet and twelve language downstream tasks demonstrate the effectiveness of Gate With Inhibition, both for image classification and for enhancing the capacity of nature language fine-tuning without any extra pretraining.
Agent-based model using GPS analysis for infection spread and inhibition mechanism of SARS-CoV-2 in Tokyo
Murakami, Taishu, Sakuragi, Shunsuke, Deguchi, Hiroshi, Nakata, Masaru
Analyzing the SARS-CoV-2 pandemic outbreak based on actual data while reflecting the characteristics of the real city provides beneficial information for taking reasonable infection control measures in the future. We demonstrate agent-based modeling for Tokyo based on GPS information and official national statistics and perform a spatiotemporal analysis of the infection situation in Tokyo. As a result of the simulation during the first wave of SARS-CoV-2 in Tokyo using real GPS data, the infection occurred in the service industry, such as restaurants, in the city center, and then the infected people brought back the virus to the residential area; the infection spread in each area in Tokyo. This phenomenon clarifies that the spread of infection can be curbed by suppressing going out or strengthening infection prevention measures in service facilities. It was shown that pandemic measures in Tokyo could be achieved not only by strong control, such as the lockdown of cities, but also by thorough infection prevention measures in service facilities, which explains the curb phenomena in real Tokyo.
Learning by Active Forgetting for Neural Networks
Peng, Jian, Sun, Xian, Deng, Min, Tao, Chao, Tang, Bo, Li, Wenbo, Wu, Guohua, QingZhu, null, Liu, Yu, Lin, Tao, Li, Haifeng
Remembering and forgetting mechanisms are two sides of the same coin in a human learning-memory system. Inspired by human brain memory mechanisms, modern machine learning systems have been working to endow machine with lifelong learning capability through better remembering while pushing the forgetting as the antagonist to overcome. Nevertheless, this idea might only see the half picture. Up until very recently, increasing researchers argue that a brain is born to forget, i.e., forgetting is a natural and active process for abstract, rich, and flexible representations. This paper presents a learning model by active forgetting mechanism with artificial neural networks. The active forgetting mechanism (AFM) is introduced to a neural network via a "plug-and-play" forgetting layer (P\&PF), consisting of groups of inhibitory neurons with Internal Regulation Strategy (IRS) to adjust the extinction rate of themselves via lateral inhibition mechanism and External Regulation Strategy (ERS) to adjust the extinction rate of excitatory neurons via inhibition mechanism. Experimental studies have shown that the P\&PF offers surprising benefits: self-adaptive structure, strong generalization, long-term learning and memory, and robustness to data and parameter perturbation. This work sheds light on the importance of forgetting in the learning process and offers new perspectives to understand the underlying mechanisms of neural networks.