
Collaborating Authors

 Banerjee, Shilpak


SMU: smooth activation function for deep networks using smoothing maximum technique

arXiv.org Artificial Intelligence

Deep learning researchers have a keen interest in proposing novel activation functions that can boost network performance, since a good choice of activation function can significantly improve results. A handcrafted activation is the most common choice in neural network models, and ReLU is the most popular in the deep learning community due to its simplicity, though it has some serious drawbacks. In this paper, we propose a novel activation function based on a smooth approximation of known activation functions such as Leaky ReLU, which we call the Smooth Maximum Unit (SMU). Replacing ReLU with SMU, we obtained a 6.22% improvement on the CIFAR100 dataset with the ShuffleNet V2 model.
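
The smoothing-maximum idea behind SMU can be illustrated with the standard identity max(a, b) = (a + b + |a - b|)/2 together with a smooth surrogate for the absolute value, such as |z| ≈ z·erf(μz). The following Python sketch applies that construction to Leaky ReLU = max(x, αx); the parameter values and function names are illustrative choices, not the paper's exact formulation.

from math import erf

def leaky_relu(x, alpha=0.25):
    # Reference (non-smooth) Leaky ReLU: max(x, alpha * x).
    return max(x, alpha * x)

def smu(x, alpha=0.25, mu=25.0):
    # Smooth approximation of Leaky ReLU via the smoothing-maximum trick:
    # max(a, b) = (a + b + |a - b|) / 2 with |z| ~= z * erf(mu * z);
    # larger mu gives a sharper approximation.
    z = (1.0 - alpha) * x
    return ((1.0 + alpha) * x + z * erf(mu * z)) / 2.0

if __name__ == "__main__":
    for x in (-2.0, -0.1, 0.0, 0.1, 2.0):
        print(x, leaky_relu(x), round(smu(x), 6))

As mu grows, smu(x) converges pointwise to leaky_relu(x) while remaining smooth at the origin.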


SAU: Smooth activation function using convolution with approximate identities

arXiv.org Artificial Intelligence

Well-known activation functions like ReLU or Leaky ReLU are non-differentiable at the origin. Over the years, many smooth approximations of ReLU have been proposed using various smoothing techniques. We propose new smooth approximations of a non-differentiable activation function obtained by convolving it with an approximate identity. In particular, we present a smooth approximation of Leaky ReLU, which we call the Smooth Activation Unit (SAU), and show that it outperforms several well-known activation functions across a variety of datasets and models. Replacing ReLU with SAU, we get a 5.12% improvement with the ShuffleNet V2 (2.0x) model on the CIFAR100 dataset.
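
Convolution with an approximate identity can be made concrete with a Gaussian mollifier: as the kernel width σ shrinks, the convolution of Leaky ReLU with the Gaussian converges to Leaky ReLU while staying smooth. The sketch below assumes a Gaussian kernel (the paper's exact kernel and parameterization may differ); it checks a numerical convolution against the closed form αx + (1 - α)(x·Φ(x/σ) + σ·φ(x/σ)), where φ and Φ are the standard normal density and distribution function.

from math import erf, exp, pi, sqrt

def leaky_relu(u, alpha=0.25):
    return u if u > 0 else alpha * u

def std_normal_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def smoothed_closed_form(x, alpha=0.25, sigma=0.5):
    # Gaussian smoothing of Leaky ReLU in closed form.
    z = x / sigma
    return alpha * x + (1.0 - alpha) * (x * std_normal_cdf(z) + sigma * std_normal_pdf(z))

def smoothed_numeric(x, alpha=0.25, sigma=0.5, half_width=8.0, n=20001):
    # Riemann-sum convolution of Leaky ReLU with the Gaussian kernel G_sigma.
    lo = -half_width * sigma
    dt = 2.0 * half_width * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        t = lo + i * dt
        total += leaky_relu(x - t, alpha) * std_normal_pdf(t / sigma) / sigma * dt
    return total

if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, round(smoothed_closed_form(x), 5), round(smoothed_numeric(x), 5))

Taking sigma toward zero recovers Leaky ReLU; keeping it fixed yields a smooth, GELU-like unit.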


ErfAct and PSerf: Non-monotonic smooth trainable Activation Functions

arXiv.org Artificial Intelligence

An activation function is a crucial component of a neural network that introduces non-linearity into the network, and network performance depends heavily on the choice of activation function. We propose two novel non-monotonic smooth trainable activation functions, called ErfAct and PSerf. Experiments suggest that the proposed functions improve network performance significantly compared to widely used activations like ReLU, Swish, and Mish. Replacing ReLU with ErfAct and PSerf, we see 5.21% and 5.04% improvements in top-1 accuracy with the PreactResNet-34 network on the CIFAR100 dataset, 2.58% and 2.76% improvements in top-1 accuracy with PreactResNet-34 on the CIFAR10 dataset, and 1.0% and 1.0% improvements in mean average precision (mAP) with the SSD300 model on the Pascal VOC dataset.
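
For readers unfamiliar with trainable activations: the shape parameters of the function are registered as learnable parameters and updated by backpropagation together with the network weights. The PyTorch sketch below shows that pattern with a smooth, non-monotonic erf-based form; the functional form and the parameter names a and b are illustrative assumptions, not the paper's definitions of ErfAct or PSerf.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TrainableErfActivation(nn.Module):
    # Generic smooth, non-monotonic activation with trainable shape parameters.
    # Illustrative form: f(x) = x * erf(a * softplus(b * x)).
    def __init__(self, a: float = 0.75, b: float = 0.75):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(a))
        self.b = nn.Parameter(torch.tensor(b))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.erf(self.a * F.softplus(self.b * x))

if __name__ == "__main__":
    act = TrainableErfActivation()
    x = torch.linspace(-3.0, 3.0, 7, requires_grad=True)
    act(x).sum().backward()  # gradients reach both the input and the shape parameters
    print(act.a.grad, act.b.grad)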


Orthogonal-Padé Activation Functions: Trainable activation functions for smooth and faster convergence in deep networks

arXiv.org Artificial Intelligence

Deep networks are constructed with multiple hidden layers and neurons, and non-linearity is introduced into the network via the activation function in each neuron. ReLU [1], proposed by Nair and Hinton, is the favourite activation in the deep learning community due to its simplicity. However, ReLU suffers from a drawback known as dying ReLU, in which up to 50% of neurons can become inactive because of the vanishing gradient problem, i.e. there are numerous neurons that have no impact on network performance. To overcome this problem, Leaky ReLU [2], Parametric ReLU [3], ELU [4], and Softplus [5] were later proposed, and they have improved network performance, though finding the best activation function is still an open problem for researchers. Recently, Swish [6] was found by a group of researchers from Google Brain using an automated search technique, and it has shown some improvement in accuracy over ReLU.
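
As background for the Padé approach named in the title: a Padé-style activation is a rational function whose numerator and denominator coefficients are trained, and the orthogonal variants replace the monomial basis with orthogonal polynomials such as Chebyshev or Hermite. The PyTorch sketch below is a minimal monomial-basis version in that spirit; the initialization and the absolute-value safeguard in the denominator are illustrative choices rather than the paper's exact formulation.

import torch
import torch.nn as nn

class PadeActivation(nn.Module):
    # Rational (Pade-style) activation with trainable coefficients:
    # f(x) = (a0 + a1*x + ... + am*x^m) / (1 + |b1*x + ... + bn*x^n|).
    def __init__(self, m: int = 5, n: int = 4):
        super().__init__()
        self.numer = nn.Parameter(0.1 * torch.randn(m + 1))  # a0 .. am
        self.denom = nn.Parameter(0.1 * torch.randn(n))      # b1 .. bn

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p_basis = torch.stack([x ** k for k in range(len(self.numer))], dim=-1)
        q_basis = torch.stack([x ** (k + 1) for k in range(len(self.denom))], dim=-1)
        p = (p_basis * self.numer).sum(dim=-1)
        q = 1.0 + (q_basis * self.denom).sum(dim=-1).abs()
        return p / q

if __name__ == "__main__":
    act = PadeActivation()
    print(act(torch.linspace(-2.0, 2.0, 5)))

Swapping the powers of x for an orthogonal polynomial basis changes only how p_basis and q_basis are built.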


TanhSoft -- a family of activation functions combining Tanh and Softplus

arXiv.org Artificial Intelligence

Artificial neural networks (ANNs) have occupied the center stage in the realm of deep learning in the recent past. ANNs are made up of several hidden layers, and each hidden layer consists of several neurons. At each neuron, an affine linear map is composed with a nonlinear function known as the activation function. During the training of an ANN, the linear map is optimized, while the activation function is usually fixed at the beginning along with the architecture of the ANN. There has been increasing interest in developing a methodical understanding of activation functions, in particular with regard to the construction of novel activation functions and the identification of mathematical properties that lead to better learning [1]. An activation function is considered good if it can speed up learning and lead to better convergence, which in turn produces more accurate results. At the early stage of deep learning research, researchers used shallow networks (fewer hidden layers), and tanh or sigmoid were used as activation functions. As research progressed and deeper networks (with many hidden layers) came into fashion for challenging tasks, the Rectified Linear Unit (ReLU) ([2], [3], [4]) emerged as the most popular activation function. Despite its simplicity, deep neural networks with ReLU have learned many complex and highly nonlinear functions with high accuracy.
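
To make the combination named in the title concrete, one simple way to blend the two ingredients is to gate a softplus term with a tanh factor; the particular form and parameters below are an illustrative assumption, not the exact TanhSoft family defined in the paper.

from math import exp, log, tanh

def softplus(x):
    # Numerically safe softplus: log(1 + exp(x)).
    return x + log(1.0 + exp(-x)) if x > 0 else log(1.0 + exp(x))

def tanhsoft_like(x, alpha=1.0, beta=1.0):
    # Hypothetical tanh-softplus blend: tanh(alpha * x) * softplus(beta * x).
    # For large positive x it grows roughly linearly (like softplus);
    # for large negative x it decays smoothly to zero.
    return tanh(alpha * x) * softplus(beta * x)

if __name__ == "__main__":
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(x, round(tanhsoft_like(x), 5))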