Collaborating Authors

 Gupta, Shashi Kant


PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models

arXiv.org Artificial Intelligence

Clinical trial matching is the task of identifying trials for which patients may be eligible. Typically, this task is labor-intensive and requires detailed verification of patient electronic health records (EHRs) against the stringent inclusion and exclusion criteria of clinical trials. This process is manual, time-intensive, and challenging to scale up, resulting in many patients missing out on potential therapeutic options. Recent advancements in Large Language Models (LLMs) have made automating patient-trial matching possible, as shown in multiple concurrent research studies. However, the current approaches are confined to constrained, often synthetic datasets that do not adequately mirror the complexities encountered in real-world medical data. In this study, we present the first end-to-end, large-scale empirical evaluation of clinical trial matching using real-world EHRs. Our study showcases the capability of LLMs to accurately match patients with appropriate clinical trials. We perform experiments with proprietary LLMs, including GPT-4 and GPT-3.5, as well as our custom fine-tuned model called OncoLLM, and show that OncoLLM, despite its significantly smaller size, not only outperforms GPT-3.5 but also matches the performance of qualified medical doctors. All experiments were carried out on real-world EHRs that include clinical notes and available clinical trials from a single cancer center in the United States.


Onco-Retriever: Generative Classifier for Retrieval of EHR Records in Oncology

arXiv.org Artificial Intelligence

Retrieving information from EHR systems is essential for answering specific questions about patient journeys and improving the delivery of clinical care. Despite this fact, most EHR systems still rely on keyword-based searches. With the advent of generative large language models (LLMs), retrieving information can lead to better search and summarization capabilities. Such retrievers can also feed retrieval-augmented generation (RAG) pipelines to answer any query. However, the task of retrieving information from real-world clinical data contained within EHR systems in order to solve several downstream use cases is challenging due to the difficulty in creating query-document support pairs. We provide a blueprint for creating such datasets in an affordable manner using large language models. Our method results in a retriever that is 30-50 F1 points better than proprietary counterparts such as Ada and Mistral for oncology data elements. We further compare our model, called Onco-Retriever, against a fine-tuned PubMedBERT model as well. We conduct an extensive manual evaluation on real-world EHR data along with latency analysis of the different models and provide a path forward for healthcare organizations to build domain-specific retrievers.
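
For readers unfamiliar with the metric, the sketch below shows how precision, recall, and F1 are conventionally computed for a single retrieval query. It is an illustration only and is not taken from the paper: the function name `precision_recall_f1` and the note identifiers are hypothetical, and the paper's own evaluation protocol may differ. On the usual 0-100 scale, a gap of 30-50 F1 points corresponds to 0.30-0.50 on the 0-1 scale used here.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query.

    `retrieved` is what the retriever returned; `relevant` is the
    ground-truth set. Both are collections of hypothetical note IDs.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 4 notes retrieved, 3 notes truly relevant, 2 in common.
p, r, f1 = precision_recall_f1(["n1", "n2", "n3", "n4"], ["n1", "n2", "n5"])
```

Because F1 is a harmonic mean, a retriever cannot score well by returning everything (high recall, low precision) or by returning almost nothing (high precision, low recall).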


Investigating Emotion-Color Association in Deep Neural Networks

arXiv.org Artificial Intelligence

It has been found that representations learned by Deep Neural Networks (DNNs) correlate very well with neural responses measured in primates' brains and with psychological representations exhibited in human similarity judgments. On the other hand, past studies have shown that particular colors can be associated with specific emotion arousal in humans. Do deep neural networks also learn this behavior? In this study, we investigate whether DNNs can learn implicit associations in stimuli, in particular an emotion-color association between image stimuli. Our study was conducted in two parts. First, we collected human responses on a forced-choice decision task in which subjects were asked to select a color for a specified emotion-inducing image. Next, we modeled this decision task on neural networks using the similarity between deep representations (extracted using DNNs trained on object classification tasks) of the images and of the images of colors used in the task. We found that our model showed a fuzzy linear relationship between the two decision probabilities. This results in two interesting findings: (1) the representations learned by deep neural networks can indeed show an emotion-color association, and (2) the emotion-color association is not just random but involves some cognitive phenomena. Finally, we also show that this method can help us in the emotion classification task, specifically when there are very few examples to train the model. This analysis can be relevant to psychologists studying emotion-color associations and artificial intelligence researchers modeling emotional intelligence in machines or studying representations learned by deep neural networks.
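
One way to picture the second part of the study is the minimal sketch below, which turns similarities between an image's feature vector and each color's feature vector into choice probabilities. The abstract states only that similarity between deep representations was used; the cosine similarity, the softmax step, the `temperature` parameter, and the tiny three-dimensional vectors are all assumptions made here for illustration, not the authors' actual pipeline.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def choice_probabilities(image_vec, color_vecs, temperature=1.0):
    """Map image-to-color similarities to a probability per color.

    Assumption: a softmax over cosine similarities stands in for the
    model's forced-choice decision; the paper may use a different rule.
    """
    sims = {name: cosine(image_vec, vec) for name, vec in color_vecs.items()}
    top = max(sims.values())  # subtract the max for numerical stability
    exps = {name: math.exp((s - top) / temperature) for name, s in sims.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


# Toy vectors standing in for DNN features (hypothetical values).
image_features = [0.9, 0.1, 0.0]
color_features = {"red": [1.0, 0.0, 0.0], "green": [0.0, 1.0, 0.0], "blue": [0.0, 0.0, 1.0]}
probs = choice_probabilities(image_features, color_features)
```

The resulting per-color probabilities are the kind of quantity that could then be compared against the proportion of human subjects choosing each color for the same image.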