Krishnaswamy, Pavitra
Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications
Nambiar, Milashini, Ghosh, Supriyo, Ong, Priscilla, Chan, Yu En, Bee, Yong Mong, Krishnaswamy, Pavitra
There is increasing interest in data-driven approaches for recommending optimal treatment strategies in many chronic disease management and critical care applications. Reinforcement learning methods are well-suited to this sequential decision-making problem, but must be trained and evaluated exclusively on retrospective medical record datasets, as direct online exploration is unsafe and infeasible. Despite this requirement, the vast majority of treatment optimization studies use off-policy RL methods (e.g., Double Deep Q Networks (DDQN) and their variants) that are known to perform poorly in purely offline settings. Recent advances in offline RL, such as Conservative Q-Learning (CQL), offer a suitable alternative. However, challenges remain in adapting these approaches to real-world applications where suboptimal examples dominate the retrospective dataset and strict safety constraints must be satisfied. In this work, we introduce a practical and theoretically grounded transition sampling approach to address action imbalance during offline RL training. We perform extensive experiments on two real-world tasks for diabetes and sepsis treatment optimization to compare the performance of the proposed approach against prominent off-policy and offline RL baselines (DDQN and CQL). Across a range of principled and clinically relevant metrics, we show that our proposed approach enables substantial improvements in expected health outcomes and in adherence to relevant practice and safety guidelines.
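The abstract does not specify the sampling scheme, but a common way to counter action imbalance in a logged dataset is to sample transitions with probability inversely proportional to the frequency of their actions. The sketch below illustrates that idea; the buffer layout, smoothing constant, and function names are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def balanced_sampling_probs(actions, smoothing=1.0):
    """Per-transition sampling probabilities inversely proportional to the
    frequency of each transition's action.

    actions: 1-D integer array, the action taken in each logged transition.
    smoothing: additive constant to avoid extreme weights for very rare actions.
    """
    counts = np.bincount(actions)                   # frequency of each action
    weights = 1.0 / (counts[actions] + smoothing)   # rarer actions get larger weight
    return weights / weights.sum()                  # normalize to a distribution

def sample_minibatch(buffer, probs, batch_size=256, rng=None):
    """Draw a minibatch of transitions using the balanced distribution."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(probs), size=batch_size, replace=True, p=probs)
    return [buffer[i] for i in idx]

# Example: a logged dataset dominated by the 'no-change' action 0.
actions = np.array([0] * 900 + [1] * 80 + [2] * 20)
probs = balanced_sampling_probs(actions)
# The expected share of rare action-2 transitions per minibatch rises from 2% to ~1/3.
```

Minibatches drawn this way can feed a standard offline RL learner (e.g., a CQL update) so that rare but clinically important actions are not drowned out during training.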
Semi-supervised classification of radiology images with NoTeacher: A Teacher that is not Mean
Unnikrishnan, Balagopal, Nguyen, Cuong, Balaram, Shafa, Li, Chao, Foo, Chuan Sheng, Krishnaswamy, Pavitra
Deep learning models achieve strong performance for radiology image classification, but their practical application is bottlenecked by the need for large labeled training datasets. Semi-supervised learning (SSL) approaches leverage small labeled datasets alongside larger unlabeled datasets, offering potential for reducing labeling cost. In this work, we introduce NoTeacher, a novel consistency-based SSL framework that incorporates probabilistic graphical models. Unlike Mean Teacher, which maintains a teacher network updated via a temporal ensemble, NoTeacher employs two independent networks, thereby eliminating the need for a teacher network. We demonstrate how NoTeacher can be customized to handle a range of challenges in radiology image classification. Specifically, we describe adaptations for scenarios with 2D and 3D inputs, uni- and multi-label classification, and class distribution mismatch between labeled and unlabeled portions of the training data. In realistic empirical evaluations on three public benchmark datasets spanning the workhorse modalities of radiology (X-ray, CT, MRI), we show that NoTeacher achieves over 90-95% of the fully supervised AUROC with less than 5-15% of the labeling budget. Further, NoTeacher outperforms established SSL methods with minimal hyperparameter tuning, making it a principled and practical option for semi-supervised learning in radiology applications.
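As an illustration of the two-network setup described above, the sketch below pairs a supervised loss on labeled data with a consistency penalty that ties the two networks' predictions on unlabeled data. The plain MSE consistency term is an assumption made for clarity; the actual NoTeacher objective couples the networks through a probabilistic graphical model.

```python
import torch
import torch.nn.functional as F

def two_network_consistency_loss(net1, net2, x_labeled, y, x_unlabeled,
                                 consistency_weight=1.0):
    """Illustrative loss for two independently trained networks.

    Each network receives a supervised loss on labeled data, plus a
    consistency penalty that encourages agreement on unlabeled data.
    (NoTeacher derives its consistency term from a probabilistic
    graphical model rather than the plain MSE used here.)
    """
    supervised = (F.cross_entropy(net1(x_labeled), y)
                  + F.cross_entropy(net2(x_labeled), y))

    p1 = torch.softmax(net1(x_unlabeled), dim=1)
    p2 = torch.softmax(net2(x_unlabeled), dim=1)
    consistency = F.mse_loss(p1, p2)  # push the two networks to agree

    return supervised + consistency_weight * consistency
```

Because neither network is an exponential moving average of the other, both can be optimized directly, which is the sense in which the framework has "no teacher."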
Joint Learning of Word and Label Embeddings for Sequence Labelling in Spoken Language Understanding
Wu, Jiewen, D'Haro, Luis Fernando, Chen, Nancy F., Krishnaswamy, Pavitra, Banchs, Rafael E.
We propose an architecture to jointly learn word and label embeddings for slot filling in spoken language understanding. The proposed approach encodes labels using a combination of word embeddings and straightforward word-label associations from the training data. Compared to state-of-the-art methods, our approach does not require label embeddings as part of the input and therefore lends itself to a wide range of model architectures. In addition, our architecture computes contextual distances between words and labels to avoid adding contextual windows, thus reducing the memory footprint. We validate the approach on established spoken dialogue datasets and show that it can achieve state-of-the-art performance with far fewer trainable parameters.

Index Terms: slot filling, recurrent neural network, distributional semantics, sequence labelling

1. INTRODUCTION

In spoken language understanding (SLU), an essential step is to associate each word in an utterance with one semantic class label. These annotated utterances can then serve as a basis for higher-level SLU tasks, such as topic identification and dialogue response generation. This process of semantic label tagging in SLU, dubbed slot filling, labels utterance sequences with tags under a specific scheme. As an example, the BIO scheme prefixes tags with one of the characters {B, I, O} to indicate the continuity of a tag: Begin, Inside, or Outside; e.g., B-price indicates that this position is the beginning of the tag price. Researchers have also developed deep learning architectures for slot filling, e.g., [1, 2, 3].
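To make the joint word-label embedding idea concrete, the sketch below builds each label's embedding as a co-occurrence-weighted average of word embeddings and scores every word-label pair by cosine similarity. The construction and all names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def label_embeddings(word_vecs, word_label_counts):
    """Build label embeddings from word embeddings and word-label associations.

    word_vecs: (V, d) word embedding matrix.
    word_label_counts: (V, L) word-label co-occurrence counts from training data.
    Returns an (L, d) matrix: each label vector is a count-weighted average
    of the embeddings of words that appear under that label.
    """
    weights = word_label_counts / (word_label_counts.sum(axis=0, keepdims=True) + 1e-8)
    return weights.T @ word_vecs

def word_label_scores(word_vecs, label_vecs):
    """Cosine similarity between every word and every label embedding."""
    w = word_vecs / (np.linalg.norm(word_vecs, axis=1, keepdims=True) + 1e-8)
    l = label_vecs / (np.linalg.norm(label_vecs, axis=1, keepdims=True) + 1e-8)
    return w @ l.T  # (V, L) scores usable as tagging logits

# Toy example: 4 words with 3-dim embeddings, 2 labels (e.g., B-price and O).
rng = np.random.default_rng(0)
word_vecs = rng.normal(size=(4, 3))
counts = np.array([[5, 0], [0, 7], [2, 1], [0, 4]], dtype=float)
scores = word_label_scores(word_vecs, label_embeddings(word_vecs, counts))
```

Because the similarity scores are computed from shared embeddings rather than fed in as extra inputs, the same scoring layer can sit on top of a variety of sequence models.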
Online Deep Learning: Growing RBM on the fly
Ramasamy, Savitha, Rajaraman, Kanagasabai, Krishnaswamy, Pavitra, Chandrasekhar, Vijay
We propose a novel online learning algorithm for Restricted Boltzmann Machines (RBM), namely, the Online Generative Discriminative Restricted Boltzmann Machine (OGD-RBM), that provides the ability to build and adapt the network architecture of the RBM according to the statistics of streaming data. The OGD-RBM is trained in two phases: (1) an online generative phase for unsupervised feature representation at the hidden layer, and (2) a discriminative phase for classification. The online generative training begins with zero neurons in the hidden layer, then adds and updates neurons to adapt to the statistics of the streaming data in a single-pass unsupervised manner, resulting in a feature representation best suited to the data. The discriminative phase is based on stochastic gradient descent and associates the represented features with the class labels. We demonstrate the OGD-RBM on a set of multi-category and binary classification problems for datasets with varying degrees of class imbalance. We first apply the OGD-RBM algorithm to the multi-class MNIST dataset to characterize the network evolution. We demonstrate that the online generative phase converges to a stable, concise network architecture, wherein individual neurons are inherently discriminative to the class labels despite unsupervised training. We then benchmark OGD-RBM performance against other machine learning, neural network, and ClassRBM techniques for credit scoring applications using three public non-stationary two-class credit datasets with varying degrees of class imbalance. We report that the OGD-RBM improves accuracy by 2.5-3% over batch learning techniques while requiring 24-70% fewer neurons and fewer training samples. This online generative training approach can be extended greedily to multiple layers for training Deep Belief Networks in non-stationary data mining applications without the need for a priori fixed architectures.
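A minimal sketch of the growing-hidden-layer idea follows, assuming a simple growth rule: a hidden neuron is added whenever a streaming sample's reconstruction error exceeds a threshold, followed by a single contrastive-divergence-style update. The threshold criterion and update details are illustrative assumptions, not the paper's exact OGD-RBM equations.

```python
import numpy as np

class GrowingRBM:
    """Sketch of an RBM whose hidden layer grows online from zero neurons."""

    def __init__(self, n_visible, grow_threshold=0.25, lr=0.05, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.W = np.zeros((n_visible, 0))  # weights: one column per hidden unit
        self.b = np.zeros(n_visible)       # visible biases
        self.c = np.zeros(0)               # hidden biases
        self.grow_threshold = grow_threshold
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def _reconstruct(self, v):
        h = self._sigmoid(v @ self.W + self.c)          # hidden activations
        return self._sigmoid(h @ self.W.T + self.b), h  # mean-field reconstruction

    def partial_fit(self, v):
        """Single-pass update on one streaming sample v (1-D array in [0, 1])."""
        v_rec, h = self._reconstruct(v)
        err = np.mean((v - v_rec) ** 2)
        if err > self.grow_threshold:  # sample poorly represented: grow the layer
            new_col = 0.01 * self.rng.normal(size=(len(v), 1))
            self.W = np.hstack([self.W, new_col])
            self.c = np.append(self.c, 0.0)
            v_rec, h = self._reconstruct(v)
        # One CD-1-style contrastive divergence step on this sample.
        h_rec = self._sigmoid(v_rec @ self.W + self.c)
        self.W += self.lr * (np.outer(v, h) - np.outer(v_rec, h_rec))
        self.b += self.lr * (v - v_rec)
        self.c += self.lr * (h - h_rec)
        return err
```

In this sketch the network stays small as long as incoming samples are well reconstructed, and expands only when the data statistics shift, which mirrors the single-pass, architecture-adapting behavior the abstract describes.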