Kumar, Sandeep
Best practices for machine learning in antibody discovery and development
Wossnig, Leonard, Furtmann, Norbert, Buchanan, Andrew, Kumar, Sandeep, Greiff, Victor
Over the past 40 years, the discovery and development of therapeutic antibodies to treat disease has become common practice. However, as therapeutic antibody constructs become more sophisticated (e.g., multi-specifics), conventional approaches to optimisation are increasingly inefficient. Machine learning (ML) promises to open up an in silico route to antibody discovery and to help accelerate the development of drug products using fewer experiments and hence at lower cost. Over the past few years, we have observed rapid developments in the field of ML-guided antibody discovery and development (D&D). However, many of the results are difficult to compare or hard for other experts in the field to assess for utility, owing to the high diversity of the datasets, evaluation techniques, and metrics used across industry and academia. This limitation of the literature curtails the broad adoption of ML across the industry and slows overall progress in the field, highlighting the need for standards and guidelines that may help improve the reproducibility of ML models across different research groups. To address these challenges, we set out in this perspective to critically review current practices, explain common pitfalls, and clearly define a set of method development and evaluation guidelines that can be applied to different types of ML-based techniques for therapeutic antibody D&D. Specifically, in an end-to-end analysis, we address the challenges associated with all aspects of the ML process and recommend a set of best practices for each stage.
When Reviewers Lock Horn: Finding Disagreement in Scientific Peer Reviews
Kumar, Sandeep, Ghosal, Tirthankar, Ekbal, Asif
To date, the efficacy of the scientific publishing enterprise fundamentally rests on the strength of the peer review process. The journal editor or the conference chair primarily relies on the expert reviewers' assessments, identifies points of agreement and disagreement, and tries to reach a consensus in order to make a fair and informed decision on whether to accept or reject a paper. However, with the escalating number of submissions requiring review, especially in top-tier Artificial Intelligence (AI) conferences, the editor/chair, among many other duties, invests significant and sometimes stressful effort in mitigating reviewer disagreements. In this work, we introduce a novel task of automatically identifying contradictions among reviewers on a given article. To this end, we introduce ContraSciView, a comprehensive review-pair contradiction dataset on around 8.5k papers (with around 28k review pairs containing nearly 50k review pair comments) from the open review-based ICLR and NeurIPS conferences. We further propose a baseline model that detects contradictory statements from the review pairs. To the best of our knowledge, this is the first attempt to identify disagreements among peer reviewers automatically. We make our dataset and code public for further investigations.
Free Lunch for Privacy Preserving Distributed Graph Learning
Agrawal, Nimesh, Malik, Nikita, Kumar, Sandeep
Learning on graphs is becoming prevalent in a wide range of applications including social networks, robotics, communication, and medicine. The datasets involved belong to individual entities and often contain critical private information. The use of such data for graph learning applications is hampered by users' growing privacy concerns about data sharing. Existing privacy-preserving methods pre-process the data to extract user-side features, and only these features are used for subsequent learning. Unfortunately, these methods are vulnerable to adversarial attacks that infer private attributes. We present a novel privacy-respecting framework for distributed graph learning and graph-based machine learning. In order to perform graph learning and other downstream tasks on the server side, this framework aims to learn features as well as distances without requiring the actual features, while preserving the original structural properties of the raw data. The proposed framework is generic and highly adaptable. We demonstrate its utility in the Euclidean space, but it can be applied with any existing method of distance approximation and graph learning for the relevant spaces. Through extensive experimentation on both synthetic and real datasets, we demonstrate the efficacy of the framework by comparing the results obtained without data sharing against those obtained with data sharing, which serve as the benchmark. This is, to our knowledge, the first privacy-preserving distributed graph learning framework.
Forecasting formation of a Tropical Cyclone Using Reanalysis Data
Kumar, Sandeep, Biswas, Koushik, Pandey, Ashish Kumar
Tropical cyclone (TC) formation is one of the most complex natural phenomena and is governed by various atmospheric, oceanographic, and geographic factors that vary with time and space. Despite several years of research, accurately predicting tropical cyclone formation remains a challenging task. While the existing numerical models have inherent limitations, the machine learning models fail to capture the spatial and temporal dimensions of the causal factors behind TC formation. In this study, we propose a deep learning model that can forecast the formation of a tropical cyclone with a lead time of up to 60 hours with high accuracy. The model uses the high-resolution reanalysis data ERA5 (ECMWF reanalysis 5th generation) and the best track data IBTrACS (International Best Track Archive for Climate Stewardship) to forecast tropical cyclone formation in six ocean basins of the world. For a 60-hour lead time, the model achieves an accuracy in the range of 86.9% to 92.9% across the six ocean basins. The model takes about 5 to 15 minutes to train, depending on the ocean basin and the amount of data used, and can predict within seconds, which makes it suitable for real-life usage.
SMU: smooth activation function for deep networks using smoothing maximum technique
Biswas, Koushik, Kumar, Sandeep, Banerjee, Shilpak, Pandey, Ashish Kumar
Deep learning researchers have a keen interest in proposing new activation functions that can boost network performance. A good choice of activation function can significantly improve network performance. Handcrafted activations are the most common choice in neural network models. ReLU is the most common choice in the deep learning community due to its simplicity, though it has some serious drawbacks. In this paper, we propose a novel activation function based on an approximation of known activation functions like Leaky ReLU, and we call this function Smooth Maximum Unit (SMU). Replacing ReLU with SMU, we obtain a 6.22% improvement on the CIFAR100 dataset with the ShuffleNet V2 model.
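The abstract does not spell out the formula, so the following is only a minimal sketch of one standard way to smooth a maximum, not necessarily the exact parameterisation used in the paper. It starts from the identity max(a, b) = ((a + b) + |a - b|) / 2 with a = x and b = alpha * x (Leaky ReLU), and replaces |z| by the smooth surrogate z * erf(mu * z). The function names and the default values alpha = 0.25 and mu = 1.0 are illustrative assumptions.

```python
import math


def leaky_relu(x, alpha=0.25):
    """Reference Leaky ReLU, i.e. max(x, alpha * x) for 0 <= alpha < 1."""
    return max(x, alpha * x)


def smooth_max_unit(x, alpha=0.25, mu=1.0):
    """Smooth surrogate for max(x, alpha * x).

    Uses max(a, b) = ((a + b) + |a - b|) / 2 and approximates
    |z| by z * erf(mu * z). alpha and mu are illustrative defaults
    (assumptions), not values taken from the paper.
    """
    z = (1.0 - alpha) * x  # a - b with a = x, b = alpha * x
    return 0.5 * ((1.0 + alpha) * x + z * math.erf(mu * z))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, leaky_relu(x), smooth_max_unit(x), smooth_max_unit(x, mu=1e6))
```

As mu grows, erf(mu * z) approaches sign(z), so the surrogate approaches Leaky ReLU while remaining differentiable at the origin for any finite mu.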
SAU: Smooth activation function using convolution with approximate identities
Biswas, Koushik, Kumar, Sandeep, Banerjee, Shilpak, Pandey, Ashish Kumar
Well-known activation functions like ReLU or Leaky ReLU are non-differentiable at the origin. Over the years, many smooth approximations of ReLU have been proposed using various smoothing techniques. We propose new smooth approximations of a non-differentiable activation function by convolving it with approximate identities. In particular, we present smooth approximations of Leaky ReLU and show that they outperform several well-known activation functions on various datasets and models. We call this function Smooth Activation Unit (SAU). Replacing ReLU with SAU, we get a 5.12% improvement with the ShuffleNet V2 (2.0x) model on the CIFAR100 dataset.
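To make the convolution idea concrete, here is a small sketch that assumes a Gaussian kernel as the approximate identity; the abstract does not say which family is used, so the kernel choice, the function names, and the defaults alpha = 0.25 and sigma = 0.5 are all assumptions. For a Gaussian, the convolution with Leaky ReLU has a closed form, which the code compares against a direct numerical integration.

```python
import math


def leaky_relu(x, alpha=0.25):
    return x if x >= 0 else alpha * x


def gaussian_pdf(t, sigma):
    """Density of N(0, sigma^2); tends to a Dirac delta as sigma -> 0."""
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def smoothed_leaky_relu(x, alpha=0.25, sigma=0.5):
    """Closed form of (LeakyReLU * Gaussian_sigma)(x).

    Uses LeakyReLU(x) = alpha * x + (1 - alpha) * ReLU(x) and
    (ReLU * Gaussian_sigma)(x) = x * Phi(x / sigma) + sigma^2 * pdf(x).
    """
    cdf = 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    relu_part = x * cdf + sigma * sigma * gaussian_pdf(x, sigma)
    return alpha * x + (1.0 - alpha) * relu_part


def numeric_convolution(x, alpha=0.25, sigma=0.5, half_width=8.0, steps=4000):
    """Midpoint-rule estimate of the same convolution integral."""
    lo = -half_width * sigma
    h = (2.0 * half_width * sigma) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * h
        total += leaky_relu(x - t, alpha) * gaussian_pdf(t, sigma)
    return total * h


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0):
        print(x, smoothed_leaky_relu(x), numeric_convolution(x))
```

Shrinking sigma makes the smoothed function hug Leaky ReLU more tightly, which is the sense in which the kernel family acts as an approximate identity.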
ErfAct and PSerf: Non-monotonic smooth trainable Activation Functions
Biswas, Koushik, Kumar, Sandeep, Banerjee, Shilpak, Pandey, Ashish Kumar
An activation function is a crucial component of a neural network that introduces non-linearity in the network. The state-of-the-art performance of a neural network depends on a good choice of activation function. We propose two novel non-monotonic smooth trainable activation functions, called ErfAct and PSerf. Experiments suggest that the proposed functions improve the network performance significantly compared to widely used activations like ReLU, Swish, and Mish. Replacing ReLU with ErfAct and PSerf, we obtain improvements of 5.21% and 5.04% in top-1 accuracy with the PreactResNet-34 network on the CIFAR100 dataset, 2.58% and 2.76% in top-1 accuracy with the PreactResNet-34 network on the CIFAR10 dataset, and 1.0% and 1.0% in mean average precision (mAP) with the SSD300 model on the Pascal VOC dataset.
Intensity Prediction of Tropical Cyclones using Long Short-Term Memory Network
Biswas, Koushik, Kumar, Sandeep, Pandey, Ashish Kumar
Weather forecasting is a difficult problem to solve due to the complex interplay between various causal factors. Accurate tropical cyclone intensity prediction is one such problem, and it is of great importance because of its vast social and economic impact. Cyclones are among the most devastating natural phenomena and frequently occur in tropical regions. Being in a tropical region, the Indian coast is frequently affected by tropical cyclones [1] that originate in the Arabian Sea (AS) and the Bay of Bengal (BOB), which are parts of the North Indian Ocean (NIO). With the increasing frequency of cyclones in the NIO [2], it becomes more crucial to develop a model that can forecast the intensity of a cyclone over a longer period of time after observing the cyclone for only a short period of time. Various statistical and numerical methods have been developed to predict the intensity of cyclones [3-7], but all these methods lack effectiveness in terms of accuracy and computation time.
Prediction of Landfall Intensity, Location, and Time of a Tropical Cyclone
Kumar, Sandeep, Biswas, Koushik, Pandey, Ashish Kumar
A TC is characterised by a warm core and a low pressure system with a large vortex in the atmosphere. TCs bring strong winds, heavy precipitation and high tides to coastal areas and have resulted in huge economic and human loss. Over the years, many destructive TCs have originated in the North Indian Ocean (NIO), consisting of the Bay of Bengal and the Arabian Sea. In 2008, Nargis, one of the most disastrous TCs in recent times, originated in the Bay of Bengal and resulted in 13,800 casualties in Myanmar alone and caused US$15.4 billion economic loss (Fritz et al. 2009). In 2018, Fani cyclone caused 89 casualties in India and Bangladesh, and US$9.1 billion economic loss (Kumar, Lal, and Kumar 2020). [...] and availability of huge data, new models using Artificial Neural Networks (ANNs) have been increasingly used to forecast track and intensity of cyclones (Leroux et al. 2018; Alemany et al. 2018; Giffard-Roisin et al. 2020; Moradi Kordmahalleh, Gorji Sefidmazgi, and Homaifar 2016). The most important prediction about a TC is its arrival at land, known as landfall of a cyclone. The accurate prediction of the location and time of the landfall, and of the intensity of the cyclone at the landfall, will hugely help authorities to take preventive measures and reduce material and human loss. In this work, we attempt to predict intensity, location, and time of the landfall of a TC at any instance of time during the [...]
Predicting Landfall's Location and Time of a Tropical Cyclone Using Reanalysis Data
Kumar, Sandeep, Biswas, Koushik, Pandey, Ashish Kumar
Landfall of a tropical cyclone is the event when it moves over land after crossing the coast of the ocean. It is important to know the characteristics of the landfall, in terms of location and time, well in advance so that preventive measures can be taken in time. In this article, we develop a deep learning model based on the combination of a Convolutional Neural Network and a Long Short-Term Memory network to predict the landfall's location and time of a tropical cyclone in six ocean basins of the world with high accuracy. We have used high-resolution spatial reanalysis data, ERA5, maintained by the European Centre for Medium-Range Weather Forecasts (ECMWF). The model takes any 9 hours, 15 hours, or 21 hours of data during the progress of a tropical cyclone and predicts its landfall's location in terms of latitude and longitude and its landfall's time in hours. For 21 hours of data, we achieve a mean absolute error for landfall's location prediction in the range of 66.18 to 158.92 kilometers and for landfall's time prediction in the range of 4.71 to 8.20 hours across all six ocean basins. The model can be trained in just 30 to 45 minutes (depending on the ocean basin) and can predict the landfall's location and time in a few seconds, which makes it suitable for real-time prediction.
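Because the model predicts latitude and longitude while the location error is reported in kilometers, some conversion between the two is needed. The abstract does not state how this is done; the sketch below assumes the great-circle (haversine) distance on a spherical Earth of radius 6371 km, averaged over cyclones. The function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical-Earth assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2.0) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_location_error_km(predicted, actual):
    """Mean distance error over paired (lat, lon) tuples."""
    distances = [
        haversine_km(p[0], p[1], a[0], a[1]) for p, a in zip(predicted, actual)
    ]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Hypothetical predicted vs. observed landfall points (lat, lon in degrees).
    predicted = [(19.5, 85.0), (15.0, 80.5)]
    actual = [(20.0, 86.0), (15.0, 80.0)]
    print(round(mean_location_error_km(predicted, actual), 2), "km")
```

One degree of longitude at the equator corresponds to roughly 111 km under this assumption, which gives a sense of scale for the reported error range.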