Goto

Collaborating Authors

 Bhattacharyya, Mayukh


Deciphering Environmental Air Pollution with Large Scale City Data

arXiv.org Artificial Intelligence

Out of the numerous hazards posing a threat to sustainable environmental conditions in the 21st century, only a few have a graver impact than air pollution. Its importance in determining the health and living standards in urban settings is only expected to increase with time. Various factors ranging from emissions from traffic and power plants, household emissions, natural causes are known to be primary causal agents or influencers behind rising air pollution levels. However, the lack of large scale data involving the major factors has hindered the research on the causes and relations governing the variability of the different air pollutants. Through this work, we introduce a large scale city-wise dataset for exploring the relationships among these agents over a long period of time. We analyze and explore the dataset to bring out inferences which we can derive by modeling the data. Also, we provide a set of benchmarks for the problem of estimating or forecasting pollutant levels with a set of diverse models and methodologies. Through our paper, we seek to provide a ground base for further research into this domain that will demand critical attention of ours in the near future.


SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function

arXiv.org Artificial Intelligence

Activation functions play a pivotal role in determining the training dynamics and neural network performance. The widely adopted activation function ReLU despite being simple and effective has few disadvantages including the Dying ReLU problem. In order to tackle such problems, we propose a novel activation function called Serf which is self-regularized and nonmonotonic in nature. Like Mish, Serf also belongs to the Swish family of functions. Based on several experiments on computer vision (image classification and object detection) and natural language processing (machine translation, sentiment classification and multimodal entailment) tasks with different state-of-the-art architectures, it is observed that Serf vastly outperforms ReLU (baseline) and other activation functions including both Swish and Mish, with a markedly bigger margin on deeper architectures. Ablation studies further demonstrate that Serf based architectures perform better than those of Swish and Mish in varying scenarios, validating the effectiveness and compatibility of Serf with varying depth, complexity, optimizers, learning rates, batch sizes, initializers and dropout rates. Finally, we investigate the mathematical relation between Swish and Serf, thereby showing the impact of preconditioner function ingrained in the first derivative of Serf which provides a regularization effect making gradients smoother and optimization faster.