AITopics | McInerney, Denis Jered

Open (Clinical) LLMs are Sensitive to Instruction Phrasings

Arroyo, Alberto Mario Ceballos, Munnangi, Monica, Sun, Jiuding, Zhang, Karen Y. C., McInerney, Denis Jered, Wallace, Byron C., Amir, Silvio

arXiv.org Artificial Intelligence, Jul-12-2024

Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This issue is especially concerning in healthcare, as clinicians are unlikely to be experienced prompt engineers and the potential consequences of inaccurate outputs are heightened in this domain. This raises a practical question: How robust are instruction-tuned LLMs to natural variations in the instructions provided for clinical NLP tasks? We collect prompts from medical doctors across a range of tasks and quantify the sensitivity of seven LLMs -- some general, others specialized -- to natural (i.e., non-adversarial) instruction phrasings. We find that performance varies substantially across all models, and that -- perhaps surprisingly -- domain-specific models explicitly trained on clinical data are especially brittle, compared to their general domain counterparts. Further, arbitrary phrasing differences can affect fairness, e.g., valid but distinct instructions for mortality prediction yield a range both in overall performance, and in terms of differences between demographic groups.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2407.09429

Country:

Africa (0.93)
North America > Canada (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.71)
Health & Medicine > Health Care Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Towards Reducing Diagnostic Errors with Interpretable Risk Prediction

McInerney, Denis Jered, Dickinson, William, Flynn, Lucy, Young, Andrea, Young, Geoffrey, van de Meent, Jan-Willem, Wallace, Byron C.

arXiv.org Artificial Intelligence, Feb-15-2024

Many diagnostic errors occur because clinicians cannot easily access relevant information in patient Electronic Health Records (EHRs). In this work we propose a method to use LLMs to identify pieces of evidence in patient EHR data that indicate increased or decreased risk of specific diagnoses; our ultimate aim is to increase access to evidence and reduce diagnostic errors. In particular, we propose a Neural Additive Model to make predictions backed by evidence with individualized risk estimates at time-points where clinicians are still uncertain, aiming to specifically mitigate delays in diagnosis and errors stemming from an incomplete differential. To train such a model, it is necessary to infer temporally fine-grained retrospective labels of eventual "true" diagnoses. We do so with LLMs, to ensure that the input text is from before a confident diagnosis can be made. We use an LLM to retrieve an initial pool of evidence, but then refine this set of evidence according to correlations learned by the model. We conduct an in-depth evaluation of the usefulness of our approach by simulating how it might be used by a clinician to decide between a pre-defined list of differential diagnoses.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2402.10109

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Leveraging Generative AI for Clinical Evidence Summarization Needs to Ensure Trustworthiness

Zhang, Gongbo, Jin, Qiao, McInerney, Denis Jered, Chen, Yong, Wang, Fei, Cole, Curtis L., Yang, Qian, Wang, Yanshan, Malin, Bradley A., Peleg, Mor, Wallace, Byron C., Lu, Zhiyong, Weng, Chunhua, Peng, Yifan

arXiv.org Artificial Intelligence, Jan-26-2024

Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence. The rapid growth of medical evidence, which can be obtained from various sources, poses a challenge in collecting, appraising, and synthesizing the evidential information. Recent advancements in generative AI, exemplified by large language models, hold promise in facilitating this arduous task. However, developing accountable, fair, and inclusive models remains a complicated undertaking. In this perspective, we discuss the trustworthiness of generative AI in the context of automated summarization of medical evidence.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2311.11211

Country:

North America > United States > New York (0.14)
Asia > Middle East > Israel (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.69)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.93)

CHiLL: Zero-shot Custom Interpretable Feature Extraction from Clinical Notes with Large Language Models

McInerney, Denis Jered, Young, Geoffrey, van de Meent, Jan-Willem, Wallace, Byron C.

arXiv.org Artificial Intelligence, Oct-19-2023

We propose CHiLL (Crafting High-Level Latents), an approach for natural-language specification of features for linear models. CHiLL prompts LLMs with expert-crafted queries to generate interpretable features from health records. The resulting noisy labels are then used to train a simple linear classifier. Generating features based on queries to an LLM can empower physicians to use their domain expertise to craft features that are clinically meaningful for a downstream task of interest, without having to manually extract these from raw EHR. We are motivated by a real-world risk prediction task, but as a reproducible proxy, we use MIMIC-III and MIMIC-CXR data and standard predictive tasks (e.g., 30-day readmission) to evaluate this approach. We find that linear models using automatically extracted features are comparably performant to models using reference features, and provide greater interpretability than linear models using "Bag-of-Words" features. We verify that learned feature weights align well with clinical expectations.

large language model, machine learning, opacity, (21 more...)

arXiv.org Artificial Intelligence

2302.12343

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Therapeutic Area > Nephrology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
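
The CHiLL abstract above describes a two-step pipeline: query an LLM for expert-defined features, then fit a simple linear classifier on the resulting noisy labels. The sketch below only illustrates that shape and rests on loud assumptions: `ask_llm` is a hypothetical keyword-matching stand-in for a real zero-shot LLM call, the queries and notes are invented toy data, and the hand-rolled logistic regression is not the authors' implementation.

```python
import math

# Hypothetical expert-written queries; each one becomes an interpretable feature.
QUERIES = {
    "has_heart_failure": "heart failure",
    "on_dialysis": "dialysis",
}

def ask_llm(note, keyword):
    """Stand-in for a zero-shot LLM query (assumption: keyword match, not a model)."""
    return 1.0 if keyword in note.lower() else 0.0

def extract_features(note):
    """Return one 0/1 feature per query, in the order of QUERIES."""
    return [ask_llm(note, keyword) for keyword in QUERIES.values()]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            z = bias + sum(w * f for w, f in zip(weights, features))
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def predict_proba(note, weights, bias):
    features = extract_features(note)
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))

# Invented toy notes with a binary outcome label (e.g. readmitted or not).
NOTES = [
    ("Patient with heart failure, admitted for dyspnea.", 1),
    ("Routine follow-up, no acute issues.", 0),
    ("Heart failure exacerbation; on dialysis.", 1),
    ("Ankle sprain, discharged home.", 0),
]

X = [extract_features(text) for text, _ in NOTES]
y = [label for _, label in NOTES]
weights, bias = train_logistic(X, y)

# Each learned weight maps back to a named, human-readable query.
feature_weights = dict(zip(QUERIES, weights))
```

Because every weight in `feature_weights` is tied to a named query, a reader can check whether its sign matches clinical expectation, which is the kind of inspection the abstract reports.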

Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges

Ahsan, Hiba, McInerney, Denis Jered, Kim, Jisoo, Potter, Christopher, Young, Geoffrey, Amir, Silvio, Wallace, Byron C.

arXiv.org Artificial Intelligence, Sep-8-2023

Unstructured Electronic Health Record (EHR) data often contains critical information complementary to imaging data that would inform radiologists' diagnoses. However, time constraints and the large volume of notes frequently associated with individual patients render manual perusal of such data to identify relevant evidence infeasible in practice. Modern Large Language Models (LLMs) provide a flexible means of interacting with unstructured EHR data, and may provide a mechanism to efficiently retrieve and summarize unstructured evidence relevant to a given query. In this work, we propose and evaluate an LLM (Flan-T5 XXL) for this purpose. Specifically, in a zero-shot setting we task the LLM to infer whether a patient has or is at risk of a particular condition; if so, we prompt the model to summarize the supporting evidence. Enlisting radiologists for manual evaluation, we find that this LLM-based approach provides outputs consistently preferred to a standard information retrieval baseline, but we also highlight the key outstanding challenge: LLMs are prone to hallucinating evidence. However, we provide results indicating that model confidence in outputs might indicate when LLMs are hallucinating, potentially providing a means to address this.

artificial intelligence, large language model, possibility and challenge, (4 more...)

arXiv.org Artificial Intelligence

2309.0455

Genre: Research Report (0.40)

Industry: Health & Medicine > Health Care Technology > Medical Record (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Automatically Summarizing Evidence from Clinical Trials: A Prototype Highlighting Current Challenges

Ramprasad, Sanjana, McInerney, Denis Jered, Marshall, Iain J., Wallace, Byron C.

arXiv.org Artificial Intelligence, Mar-7-2023

We present TrialsSummarizer, a system that aims to automatically summarize evidence presented in the set of randomized controlled trials most relevant to a given query. Building on prior work, the system retrieves trial publications matching a query specifying a combination of condition, intervention(s), and outcome(s), and ranks these according to sample size and estimated study quality. The top-k such studies are passed through a neural multi-document summarization system, yielding a synopsis of these trials. We consider two architectures: a standard sequence-to-sequence model based on BART and a multi-headed architecture intended to provide greater transparency to end-users. Both models produce fluent and relevant summaries of evidence retrieved for queries, but their tendency to introduce unsupported statements renders them inappropriate for use in this domain at present. The proposed architecture may help users verify outputs by allowing them to trace generated tokens back to inputs.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2303.05392

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
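
The TrialsSummarizer abstract above says retrieved trials are ranked by sample size and estimated study quality, and the top-k are passed to a summarizer. The sketch below shows one way such a ranking step could look. The `Trial` fields, the quality score in [0, 1], and the choice to sort by quality first with sample size as a tie-breaker are all assumptions for illustration; the abstract does not state how the two criteria are combined.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    title: str
    sample_size: int
    quality: float  # hypothetical estimated study quality in [0, 1]

def top_k_trials(trials, k=3):
    """Rank trials and keep the k best.

    Assumption: higher quality wins, and larger sample size breaks ties.
    """
    ranked = sorted(
        trials,
        key=lambda trial: (trial.quality, trial.sample_size),
        reverse=True,
    )
    return ranked[:k]

# Invented example records; not real trials.
TRIALS = [
    Trial("Trial A", sample_size=120, quality=0.9),
    Trial("Trial B", sample_size=2000, quality=0.6),
    Trial("Trial C", sample_size=450, quality=0.9),
    Trial("Trial D", sample_size=80, quality=0.3),
]

# These would be the inputs handed to a multi-document summarizer.
selected = top_k_trials(TRIALS, k=2)
```

A weighted score combining both criteria would be an equally plausible reading of the abstract; the lexicographic sort is used here only because it is easy to inspect.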

That's the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data

McInerney, Denis Jered, Young, Geoffrey, van de Meent, Jan-Willem, Wallace, Byron C.

arXiv.org Artificial Intelligence, Oct-22-2022

Pretraining multimodal models on Electronic Health Records (EHRs) provides a means of learning representations that can transfer to downstream tasks with minimal supervision. Recent multimodal models induce soft local alignments between image regions and sentences. This is of particular interest in the medical domain, where alignments might highlight regions in an image relevant to specific phenomena described in free-text. While past work has suggested that attention "heatmaps" can be interpreted in this manner, there has been little evaluation of such alignments. We compare alignments from a state-of-the-art multimodal (image and text) model for EHR with human annotations that link image regions to sentences. Our main finding is that the text has an often weak or unintuitive influence on attention; alignments do not consistently reflect basic anatomical information. Moreover, synthetic modifications -- such as substituting "left" for "right" -- do not substantially influence highlights. Simple techniques such as allowing the model to opt out of attending to the image and few-shot finetuning show promise in terms of their ability to improve alignments with very little or no supervision. We make our code and checkpoints open-source.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2210.06565

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
