Gupta, Shubh
Framework for Co-distillation Driven Federated Learning to Address Class Imbalance in Healthcare
Racha, Suraj, Gupta, Shubh, Firdowse, Humaira, Solanki, Aastik, Ramakrishnan, Ganesh, Jadhav, Kshitij S.
Federated Learning (FL) is a pioneering approach in distributed machine learning, enabling collaborative model training across multiple clients while retaining data privacy. However, the inherent heterogeneity due to imbalanced resource representations across multiple clients poses significant challenges, often introducing bias towards the majority class. This issue is particularly prevalent in healthcare settings, where hospitals acting as clients share medical images. To address class imbalance and reduce bias, we propose a co-distillation driven framework in a federated healthcare setting. Unlike traditional federated setups with a designated server client, our framework promotes knowledge sharing among clients to collectively improve learning outcomes. Our experiments demonstrate that in a federated healthcare setting, co-distillation outperforms other federated methods in handling class imbalance. Additionally, we demonstrate that our framework has the least standard deviation with increasing imbalance while outperforming other baselines, signifying the robustness of our framework for FL in healthcare.
Neural Elevation Models for Terrain Mapping and Path Planning
Dai, Adam, Gupta, Shubh, Gao, Grace
This work introduces Neural Elevations Models (NEMos), which adapt Neural Radiance Fields to a 2.5D continuous and differentiable terrain model. In contrast to traditional terrain representations such as digital elevation models, NEMos can be readily generated from imagery, a low-cost data source, and provide a lightweight representation of terrain through an implicit continuous and differentiable height field. We propose a novel method for jointly training a height field and radiance field within a NeRF framework, leveraging quantile regression. Additionally, we introduce a path planning algorithm that performs gradient-based optimization of a continuous cost function for minimizing distance, slope changes, and control effort, enabled by differentiability of the height field. We perform experiments on simulated and real-world terrain imagery, demonstrating NEMos ability to generate high-quality reconstructions and produce smoother paths compared to discrete path planning methods. Future work will explore the incorporation of features and semantics into the height field, creating a generalized terrain model.
BOLT: An Automated Deep Learning Framework for Training and Deploying Large-Scale Search and Recommendation Models on Commodity CPU Hardware
Meisburger, Nicholas, Lakshman, Vihan, Geordie, Benito, Engels, Joshua, Ramos, David Torres, Pranav, Pratik, Coleman, Benjamin, Meisburger, Benjamin, Gupta, Shubh, Adunukota, Yashwanth, Medini, Tharun, Shrivastava, Anshumali
Efficient large-scale neural network training and inference on commodity CPU hardware is of immense practical significance in democratizing deep learning (DL) capabilities. Presently, the process of training massive models consisting of hundreds of millions to billions of parameters requires the extensive use of specialized hardware accelerators, such as GPUs, which are only accessible to a limited number of institutions with considerable financial resources. Moreover, there is often an alarming carbon footprint associated with training and deploying these models. In this paper, we take a step towards addressing these challenges by introducing BOLT, a sparse deep learning library for training large-scale search and recommendation models on standard CPU hardware. BOLT provides a flexible, high-level API for constructing models that will be familiar to users of existing popular DL frameworks. By automatically tuning specialized hyperparameters, BOLT also abstracts away the algorithmic details of sparse network training. We evaluate BOLT on a number of information retrieval tasks including product recommendations, text classification, graph neural networks, and personalization. We find that our proposed system achieves competitive performance with state-of-the-art techniques at a fraction of the cost and energy consumption and an order-of-magnitude faster inference time. BOLT has also been successfully deployed by multiple businesses to address critical problems, and we highlight one customer case study in the field of e-commerce.
A Deep Reinforcement Learning Approach to Rare Event Estimation
Corso, Anthony, Kim, Kyu-Young, Gupta, Shubh, Gao, Grace, Kochenderfer, Mykel J.
An important step in the design of autonomous systems is to evaluate the probability that a failure will occur. In safety-critical domains, the failure probability is extremely small so that the evaluation of a policy through Monte Carlo sampling is inefficient. Adaptive importance sampling approaches have been developed for rare event estimation but do not scale well to sequential systems with long horizons. In this work, we develop two adaptive importance sampling algorithms that can efficiently estimate the probability of rare events for sequential decision making systems. The basis for these algorithms is the minimization of the Kullback-Leibler divergence between a state-dependent proposal distribution and a target distribution over trajectories, but the resulting algorithms resemble policy gradient and value-based reinforcement learning. We apply multiple importance sampling to reduce the variance of our estimate and to address the issue of multi-modality in the optimal proposal distribution. We demonstrate our approach on a control task with both continuous and discrete actions spaces and show accuracy improvements over several baselines.