AITopics | Naik, Dhruv

Plotting

Naik, Dhruv

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Rare Disease Differential Diagnosis with Large Language Models at Scale: From Abdominal Actinomycosis to Wilson's Disease

Schumacher, Elliot, Naik, Dhruv, Kannan, Anitha

arXiv.org Artificial IntelligenceFeb-20-2025

Large language models (LLMs) have demonstrated impressive capabilities in disease diagnosis. However, their effectiveness in identifying rarer diseases, which are inherently more challenging to diagnose, remains an open question. Rare disease performance is critical with the increasing use of LLMs in healthcare settings. This is especially true if a primary care physician needs to make a rarer prognosis from only a patient conversation so that they can take the appropriate next step. To that end, several clinical decision support systems are designed to support providers in rare disease identification. Yet their utility is limited due to their lack of knowledge of common disorders and difficulty of use. In this paper, we propose RareScale to combine the knowledge LLMs with expert systems. We use jointly use an expert system and LLM to simulate rare disease chats. This data is used to train a rare disease candidate predictor model. Candidates from this smaller model are then used as additional inputs to black-box LLM to make the final differential diagnosis. Thus, RareScale allows for a balance between rare and common diagnoses. We present results on over 575 rare diseases, beginning with Abdominal Actinomycosis and ending with Wilson's Disease. Our approach significantly improves the baseline performance of black-box LLMs by over 17% in Top-5 accuracy. We also find that our candidate generation performance is high (e.g. 88.8% on gpt-4o generated chats).

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2502.15069

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Extrinsically-Focused Evaluation of Omissions in Medical Summarization

Schumacher, Elliot, Rosenthal, Daniel, Naik, Dhruv, Nair, Varun, Price, Luladay, Tso, Geoffrey, Kannan, Anitha

arXiv.org Artificial IntelligenceNov-11-2024

Large language models (LLMs) have shown promise in safety-critical applications such as healthcare, yet the ability to quantify performance has lagged. An example of this challenge is in evaluating a summary of the patient's medical record. A resulting summary can enable the provider to get a high-level overview of the patient's health status quickly. Yet, a summary that omits important facts about the patient's record can produce a misleading picture. This can lead to negative consequences on medical decision-making. We propose MED-OMIT as a metric to explore this challenge. We focus on using provider-patient history conversations to generate a subjective (a summary of the patient's history) as a case study. We begin by discretizing facts from the dialogue and identifying which are omitted from the subjective. To determine which facts are clinically relevant, we measure the importance of each fact to a simulated differential diagnosis. We compare MED-OMIT's performance to that of clinical experts and find broad agreement We use MED-OMIT to evaluate LLM performance on subjective generation and find some LLMs (gpt-4 and llama-3.1-405b) work well with little effort, while others (e.g. Llama 2) perform worse.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2311.08303

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Diagnostic Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback