Katariya, Namit
Adding more data does not always help: A study in medical conversation summarization with PEGASUS
Nair, Varun, Katariya, Namit, Amatriain, Xavier, Valmianski, Ilya, Kannan, Anitha
Medical conversation summarization is integral to capturing information gathered during interactions between patients and physicians. Summarized conversations are used to facilitate patient hand-offs between physicians and as part of providing care in the future. Summaries, however, can be time-consuming to produce and require domain expertise. Modern pre-trained NLP models such as PEGASUS have emerged as capable alternatives to human summarization, reaching state-of-the-art performance on many summarization benchmarks. However, many downstream tasks still require at least moderately sized datasets to achieve satisfactory performance. In this work we (1) explore the effect of dataset size on transfer learning for medical conversation summarization using PEGASUS and (2) evaluate various iterative labeling strategies in the low-data regime, following their success in the classification setting. We find that model performance saturates as the dataset size increases and that the various active-learning strategies evaluated all show performance equivalent to simply increasing the dataset size. We also find that naive iterative pseudo-labeling performs on par with, or slightly worse than, training without pseudo-labeling. Our work sheds light on the successes and challenges of translating low-data-regime techniques from classification to medical conversation summarization and helps guide future work in this space. Relevant code is available at https://github.com/curai/curai-research/tree/main/medical-summarization-ML4H-2021
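For intuition, the following is a minimal sketch of the naive iterative pseudo-labeling loop referred to in the abstract: fine-tune PEGASUS on the small labeled set, generate summaries for unlabeled dialogues, fold those pseudo-labels back into the training data, and repeat. The example dialogues, hyperparameters, and bare-bones training loop are illustrative assumptions, not the paper's actual setup.

```python
import torch
from torch.optim import AdamW
from transformers import PegasusTokenizerFast, PegasusForConditionalGeneration

MODEL_NAME = "google/pegasus-large"
tokenizer = PegasusTokenizerFast.from_pretrained(MODEL_NAME)
model = PegasusForConditionalGeneration.from_pretrained(MODEL_NAME)

# Tiny placeholder datasets; real (dialogue, summary) pairs would come from clinical data.
labeled = [("Patient: I've had a dry cough for three days. Doctor: Any fever?",
            "Three-day history of dry cough; fever being assessed.")]
unlabeled = ["Patient: My left knee hurts when I climb stairs."]

def fine_tune(model, pairs, epochs=1, lr=3e-5):
    """Plain seq2seq fine-tuning loop (unbatched, for brevity)."""
    opt = AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for dialogue, summary in pairs:
            enc = tokenizer(dialogue, truncation=True, return_tensors="pt")
            labels = tokenizer(text_target=summary, truncation=True,
                               return_tensors="pt").input_ids
            loss = model(**enc, labels=labels).loss
            loss.backward()
            opt.step()
            opt.zero_grad()
    return model

@torch.no_grad()
def pseudo_label(model, dialogues):
    """Generate summaries for unlabeled dialogues to use as pseudo-labels."""
    model.eval()
    enc = tokenizer(dialogues, truncation=True, padding=True, return_tensors="pt")
    out = model.generate(**enc, num_beams=4, max_new_tokens=128)
    return tokenizer.batch_decode(out, skip_special_tokens=True)

for _ in range(2):  # a couple of naive pseudo-labeling rounds
    model = fine_tune(model, labeled)
    labeled = labeled + list(zip(unlabeled, pseudo_label(model, unlabeled)))
```

In practice the data would be batched and a validation set held out; the abstract's finding is that variants of this loop perform about the same as, or slightly worse than, training on the labeled data alone.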
MEDCOD: A Medically-Accurate, Emotive, Diverse, and Controllable Dialog System
Compton, Rhys, Valmianski, Ilya, Deng, Li, Huang, Costa, Katariya, Namit, Amatriain, Xavier, Kannan, Anitha
We present MEDCOD, a Medically-Accurate, Emotive, Diverse, and Controllable Dialog system with a unique approach to the natural language generator module. MEDCOD has been developed and evaluated specifically for the history-taking task. It combines the advantages of a traditional modular approach for incorporating (medical) domain knowledge with modern deep learning techniques that generate flexible, human-like natural language expressions. Two key aspects of MEDCOD's natural language output are described in detail. First, the generated sentences are emotive and empathetic, similar to how a doctor would communicate with the patient. Second, the generated sentence structures and phrasings are varied and diverse while maintaining medical consistency with the desired medical concept (provided by the dialogue manager module of MEDCOD). Experimental results demonstrate the effectiveness of our approach in creating a human-like medical dialogue system. Relevant code is available at https://github.com/curai/curai-research/tree/main/MEDCOD
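To make the architecture concrete, here is a hypothetical sketch of how a controllable natural language generator of this kind could be wired: the dialogue manager supplies a canonical question (the medical concept), an emotion tag is prepended as a control signal, and sampling produces varied phrasings. The model name (t5-small), the control-tag format, and the assumption that the model has been fine-tuned on paired rephrasing data are placeholders, not MEDCOD's actual implementation.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "t5-small" is a stand-in; in practice the model would be fine-tuned on
# (control tag + canonical question) -> (emotive paraphrase) pairs.
tok = AutoTokenizer.from_pretrained("t5-small")
nlg = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def verbalize(canonical_question, emotion="empathetic", n=3):
    """Produce several varied, emotive phrasings of a dialogue-manager question."""
    prompt = f"emotion: {emotion} | rephrase: {canonical_question}"
    ids = tok(prompt, return_tensors="pt").input_ids
    outs = nlg.generate(ids, do_sample=True, top_p=0.9, temperature=0.9,
                        num_return_sequences=n, max_new_tokens=40)
    return [tok.decode(o, skip_special_tokens=True) for o in outs]

print(verbalize("Do you have a fever?"))
```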
Medically Aware GPT-3 as a Data Generator for Medical Dialogue Summarization
Chintagunta, Bharath, Katariya, Namit, Amatriain, Xavier, Kannan, Anitha
In medical dialogue summarization, summaries must be coherent and must capture all the medically relevant information in the dialogue. However, learning effective summarization models requires large amounts of labeled data, which is especially hard to obtain. We present an algorithm to create synthetic training data with an explicit focus on capturing medically relevant information. We utilize GPT-3 as the backbone of our algorithm and scale 210 human-labeled examples to yield results comparable to using 6400 human-labeled examples (~30x) by leveraging low-shot learning and an ensemble method. In detailed experiments, we show that this approach produces high-quality training data that can further be combined with human-labeled data to obtain summaries that are strongly preferred over those produced by models trained on human data alone, in terms of both medical accuracy and coherency.
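Below is a hedged sketch of the general idea: prompt a large language model with a few (dialogue, summary) exemplars to synthesize labels for unlabeled snippets, then keep only pairs that pass a medical-grounding check. The prompt wording, the stand-in model name, and the toy concept filter are assumptions; the paper's actual GPT-3 pipeline and ensembling differ.

```python
from openai import OpenAI

client = OpenAI()

FEW_SHOT = """Dialogue: Doctor: How long has the headache lasted? Patient: About two days.
Summary: Patient reports a headache lasting about two days.

Dialogue: Doctor: Any nausea with it? Patient: Yes, especially in the morning.
Summary: Patient reports nausea accompanying the headache, worse in the morning.
"""

def synthesize_summary(dialogue_snippet):
    """Few-shot prompt an LLM to produce a synthetic summary for an unlabeled snippet."""
    prompt = f"{FEW_SHOT}\nDialogue: {dialogue_snippet}\nSummary:"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # stand-in; the paper used GPT-3 few-shot completions
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return resp.choices[0].message.content.strip()

def medically_grounded(dialogue, summary, concepts=("headache", "nausea", "fever")):
    """Toy 'medically aware' check: keep a synthetic pair only if medical terms in the
    generated summary also appear in the source dialogue (the paper uses a real
    concept extractor rather than a fixed term list)."""
    return all(c in dialogue.lower() for c in concepts if c in summary.lower())
```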
Dr. Summarize: Global Summarization of Medical Dialogue by Exploiting Local Structures
Joshi, Anirudh, Katariya, Namit, Amatriain, Xavier, Kannan, Anitha
Understanding a medical conversation between a patient and a physician poses a unique natural language understanding challenge, since it combines elements of standard open-ended conversation with very domain-specific elements that require expertise and medical knowledge. Summarization of medical conversations is a particularly important aspect of medical conversation understanding because it addresses a very real need in medical practice: capturing the most important aspects of a medical encounter so that they can be used for medical decision making and subsequent follow-ups. In this paper we present a novel approach to medical conversation summarization that leverages the unique and independent local structures created when gathering a patient's medical history. Our approach is a variation of the pointer-generator network in which we introduce a penalty on the generator distribution, and we explicitly model negations. The model also captures important properties of medical conversations, such as medical knowledge coming from standardized medical ontologies, better than when those concepts are introduced explicitly. Through evaluation by doctors, we show that our approach is preferred over the baseline pointer-generator model on twice as many summaries and captures most or all of the information in 80% of the conversations, making it a realistic alternative to costly manual summarization by medical experts.
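As a rough illustration of the modeling idea (not the paper's exact formulation): a pointer-generator network mixes a vocabulary distribution with a copy distribution derived from attention over the source, and penalizing the generator side of that mixture pushes the model toward copying source tokens. The penalty form and value below are assumptions for illustration.

```python
import torch

def final_distribution(p_vocab, attention, src_token_ids, p_gen, gen_penalty=0.5):
    """
    p_vocab:       (batch, vocab_size) softmax over the generation vocabulary
    attention:     (batch, src_len)    attention weights over source tokens
    src_token_ids: (batch, src_len)    vocabulary ids of the source tokens
    p_gen:         (batch, 1)          probability of generating vs. copying
    """
    p_gen = p_gen * gen_penalty                          # penalize the generator path
    copy_dist = torch.zeros_like(p_vocab)
    copy_dist.scatter_add_(1, src_token_ids, attention)  # project attention onto vocab ids
    return p_gen * p_vocab + (1.0 - p_gen) * copy_dist
```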
Domain-Relevant Embeddings for Medical Question Similarity
McCreery, Clara, Katariya, Namit, Kannan, Anitha, Chablani, Manish, Amatriain, Xavier
The rate at which medical questions are asked online significantly exceeds the capacity of qualified people to answer them, leaving many questions unanswered or inadequately answered. Many of these questions are not unique, so a system that reliably matched unanswered questions to semantically similar answered ones (or marked them as priority when no similar answered question exists) would use doctor time more efficiently, reduce the number of unanswered questions, and lower the cost of providing online care. While many research efforts have focused on the problem of general question similarity, these approaches do not generalize well to the medical domain, where medical expertise is often required to determine semantic similarity. In this paper, we show how a semi-supervised approach of pre-training a neural network on medical question-answer pairs is a particularly useful intermediate task for the ultimate goal of determining medical question similarity. While other pre-training tasks yield an accuracy below 78.7% on this task, our model achieves an accuracy of 82.6% with the same number of training examples, an accuracy of 80.0% with a much smaller training set, and an accuracy of 84.5% when the full corpus of medical question-answer data is used.
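The two-stage recipe can be sketched as follows: adapt a pretrained encoder on medical question-answer pairs as an intermediate task, then fine-tune the same encoder on question-question similarity. The encoder choice, data snippets, and label conventions below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

BASE = "bert-base-uncased"   # stand-in encoder; the paper compares several pre-training tasks
tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=2)

# Stage 1 (intermediate task): does this answer respond to this medical question?
qa = tok("What causes migraines?",
         "Migraines are commonly triggered by stress or sleep loss.",
         truncation=True, return_tensors="pt")
qa_loss = model(**qa, labels=torch.tensor([1])).loss   # optimize over many such QA pairs

# Stage 2: fine-tune the now domain-adapted encoder on question-question similarity.
sim = tok("Can stress cause headaches?", "Do headaches come from anxiety?",
          truncation=True, return_tensors="pt")
sim_loss = model(**sim, labels=torch.tensor([1])).loss  # optimize over labeled question pairs
```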
Open Set Medical Diagnosis
Prabhu, Viraj, Kannan, Anitha, Tso, Geoffrey J., Katariya, Namit, Chablani, Manish, Sontag, David, Amatriain, Xavier
Machine-learned diagnosis models have shown promise as medical aides but are trained under a closed-set assumption, i.e., that models will only encounter conditions on which they have been trained. However, it is practically infeasible to obtain sufficient training data for every human condition, and once deployed such models will invariably face previously unseen conditions. We frame machine-learned diagnosis as an open-set learning problem and study how state-of-the-art approaches compare. Further, we extend our study to a setting where training data is distributed across several healthcare sites that do not allow data pooling, and we experiment with different strategies for building open-set diagnostic ensembles. Across both settings, we observe consistent gains from explicitly modeling unseen conditions, but find that the optimal training strategy varies across settings.
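One of the simplest open-set strategies this line of work considers can be sketched as a confidence-based rejection rule: predict a known condition only when the classifier is sufficiently confident, and otherwise flag the case as a potentially unseen condition. The threshold value and tensor shapes below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def open_set_predict(logits, threshold=0.7):
    """logits: (batch, num_known_conditions). Returns -1 for 'unseen/unknown' cases."""
    probs = F.softmax(logits, dim=-1)
    conf, pred = probs.max(dim=-1)
    pred[conf < threshold] = -1       # reject low-confidence cases as unseen conditions
    return pred
```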