
Collaborating Authors: Hyland, Stephanie


An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation

arXiv.org Artificial Intelligence

Radiological services are experiencing unprecedented demand, leading to increased interest in automating radiology report generation. Existing Vision-Language Models (VLMs) suffer from hallucinations, lack interpretability, and require expensive fine-tuning. We introduce SAE-Rad, which uses sparse autoencoders (SAEs) to decompose latent representations from a pre-trained vision transformer into human-interpretable features. Our hybrid architecture combines state-of-the-art SAE advancements, achieving accurate latent reconstructions while maintaining sparsity. Using an off-the-shelf language model, we distil ground-truth reports into radiological descriptions for each SAE feature, which we then compile into a full report for each image, eliminating the need for fine-tuning large models for this task. To the best of our knowledge, SAE-Rad represents the first instance of using mechanistic interpretability techniques explicitly for a downstream multi-modal reasoning task. On the MIMIC-CXR dataset, SAE-Rad achieves competitive radiology-specific metrics compared to state-of-the-art models while using significantly fewer computational resources for training. Qualitative analysis reveals that SAE-Rad learns meaningful visual concepts and generates reports aligning closely with expert interpretations. Our results suggest that SAEs can enhance multimodal reasoning in healthcare, providing a more interpretable alternative to existing VLMs.
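As a rough illustration of the core mechanism, the sketch below implements a generic sparse autoencoder over frozen image-encoder embeddings in PyTorch. It is not the SAE-Rad implementation; the embedding dimension, feature count, and L1 weight are illustrative assumptions.

```python
# Minimal sparse autoencoder sketch (not SAE-Rad): decompose pre-computed
# vision-transformer embeddings into an overcomplete set of sparse features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 768, n_features: int = 8192, l1_coeff: float = 1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)
        self.l1_coeff = l1_coeff

    def forward(self, x: torch.Tensor):
        # x: (batch, d_model) frozen image-encoder embeddings
        z = F.relu(self.encoder(x))              # sparse, non-negative feature activations
        x_hat = self.decoder(z)                  # reconstruction of the original embedding
        recon_loss = F.mse_loss(x_hat, x)
        sparsity_loss = self.l1_coeff * z.abs().mean()
        return x_hat, z, recon_loss + sparsity_loss

# Usage: the active entries of z index candidate interpretable features,
# which can then be mapped to textual descriptions downstream.
sae = SparseAutoencoder()
embeddings = torch.randn(4, 768)                 # placeholder for ViT embeddings
_, feats, loss = sae(embeddings)
loss.backward()
```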


Exploring the Boundaries of GPT-4 in Radiology

arXiv.org Artificial Intelligence

The recent success of general-domain large language models (LLMs) has significantly changed the natural language processing paradigm towards a unified foundation model across domains and applications. In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. Exploring various prompting strategies, we evaluated GPT-4 on a diverse range of common radiology tasks and found that it either outperforms or is on par with current SOTA radiology models. With zero-shot prompting, GPT-4 already obtains substantial gains ($\approx$ 10% absolute improvement) over radiology models in temporal sentence similarity classification (accuracy) and natural language inference ($F_1$). For tasks that require learning dataset-specific style or schema (e.g. findings summarisation), GPT-4 improves with example-based prompting and matches supervised SOTA. Our extensive error analysis with a board-certified radiologist shows that GPT-4 has a sufficient level of radiology knowledge, with only occasional errors in complex contexts that require nuanced domain knowledge. For findings summarisation, GPT-4 outputs are found to be overall comparable with existing manually-written impressions.
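To make the distinction between zero-shot and example-based prompting concrete, here is a minimal sketch of prompt construction for findings summarisation. The templates and the `build_prompt` helper are hypothetical illustrations, not the prompts or evaluation harness used in the paper.

```python
# Hedged sketch: zero-shot vs example-based prompt construction for
# radiology findings summarisation. Wording is illustrative only.
ZERO_SHOT_TEMPLATE = (
    "You are a radiology assistant. Summarise the key findings below into a "
    "concise impression.\n\nFINDINGS:\n{findings}\n\nIMPRESSION:"
)

FEW_SHOT_TEMPLATE = (
    "You are a radiology assistant. Summarise the findings into an impression, "
    "following the style of the examples.\n\n{examples}FINDINGS:\n{findings}\n\nIMPRESSION:"
)

def build_prompt(findings: str, examples: list[tuple[str, str]] | None = None) -> str:
    """Return a zero-shot prompt, or an example-based prompt if (findings, impression) pairs are given."""
    if not examples:
        return ZERO_SHOT_TEMPLATE.format(findings=findings)
    shots = "".join(f"FINDINGS:\n{f}\n\nIMPRESSION:\n{i}\n\n" for f, i in examples)
    return FEW_SHOT_TEMPLATE.format(examples=shots, findings=findings)

# Example-based prompting nudges the model toward dataset-specific style.
prompt = build_prompt(
    "Mild cardiomegaly. No focal consolidation, pleural effusion, or pneumothorax.",
    examples=[("Lungs are clear. Heart size normal.", "No acute cardiopulmonary process.")],
)
print(prompt)
```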


Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing

arXiv.org Artificial Intelligence

Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities. Prior work in biomedical VLP has mostly relied on the alignment of single image and report pairs, even though clinical notes commonly refer to prior images. This not only introduces poor alignment between the modalities but also misses the opportunity to exploit the rich self-supervision available in the temporal content of the data. In this work, we explicitly account for prior images and reports, when available, during both training and fine-tuning. Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model. It is designed to be robust to challenges such as pose variations and missing input images across time. The resulting model excels on downstream tasks in both single- and multi-image setups, achieving state-of-the-art performance on (I) progression classification, (II) phrase grounding, and (III) report generation, whilst offering consistent improvements on disease classification and sentence-similarity tasks. We release a novel multi-modal temporal benchmark dataset, MS-CXR-T, to quantify the quality of vision-language representations in terms of temporal semantics. Our experimental results show the advantages of incorporating prior images and reports to make the most of the data.
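A conceptual sketch of a multi-image encoder in this spirit is shown below: a small CNN stands in for the image backbone, a transformer fuses tokens from the current and (optional) prior image, and a learned time embedding marks which study each token comes from. This is not BioViL-T itself; all layer sizes are illustrative assumptions.

```python
# Conceptual multi-image encoder sketch (not BioViL-T): CNN features per image,
# transformer fusion across current and prior studies, prior image optional.
import torch
import torch.nn as nn

class MultiImageEncoder(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        # Small CNN standing in for a ResNet-style backbone.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(64, d_model, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        self.time_embed = nn.Embedding(2, d_model)  # 0 = current study, 1 = prior study
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)

    def _tokens(self, image: torch.Tensor, time_idx: int) -> torch.Tensor:
        feats = self.cnn(image)                      # (B, d_model, 8, 8)
        tokens = feats.flatten(2).transpose(1, 2)    # (B, 64, d_model)
        return tokens + self.time_embed.weight[time_idx]

    def forward(self, current: torch.Tensor, prior: torch.Tensor | None = None) -> torch.Tensor:
        tokens = self._tokens(current, 0)
        if prior is not None:                        # gracefully handle a missing prior image
            tokens = torch.cat([tokens, self._tokens(prior, 1)], dim=1)
        return self.transformer(tokens).mean(dim=1)  # pooled multi-image representation

encoder = MultiImageEncoder()
rep = encoder(torch.randn(2, 1, 224, 224), prior=torch.randn(2, 1, 224, 224))
```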


Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing

arXiv.org Artificial Intelligence

Multi-modal data abounds in biomedicine, such as radiology images and reports. Interpreting this data at scale is essential for improving clinical care and accelerating clinical research. Biomedical text, with its complex semantics, poses additional challenges in vision-language modelling compared to the general domain, and previous work has used insufficiently adapted models that lack domain-specific language understanding. In this paper, we show that principled textual semantic modelling can substantially improve contrastive learning in self-supervised vision-language processing. We release a language model that achieves state-of-the-art results in radiology natural language inference through its improved vocabulary and a novel language pretraining objective leveraging semantics and discourse characteristics in radiology reports. Further, we propose a self-supervised joint vision-language approach with a focus on better text modelling. It establishes new state-of-the-art results on a wide range of publicly available benchmarks, in part by leveraging our new domain-specific language model. We release a new dataset with locally-aligned phrase grounding annotations by radiologists to facilitate the study of complex semantic modelling in biomedical vision-language processing. A broad evaluation, including on this new dataset, shows that our contrastive learning approach, aided by textual-semantic modelling, outperforms prior methods in segmentation tasks, despite only using a global-alignment objective.
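For reference, a global-alignment contrastive objective of the kind discussed here can be sketched as a symmetric InfoNCE loss over paired image and report embeddings. The snippet below is a generic CLIP-style formulation, not the paper's exact objective; the temperature value is an illustrative assumption.

```python
# Generic CLIP-style symmetric contrastive loss over paired image/report embeddings.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss: each image should match its own report and vice versa."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature       # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
```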


Temporal Pointwise Convolutional Networks for Length of Stay Prediction in the Intensive Care Unit

arXiv.org Artificial Intelligence

The pressure of ever-increasing patient demand and budget restrictions makes hospital bed management a daily challenge for clinical staff. Most critical is the efficient allocation of resource-heavy Intensive Care Unit (ICU) beds to the patients who need life support. Central to solving this problem is knowing how long the current set of ICU patients is likely to stay in the unit. In this work, we propose a new deep learning model, based on the combination of temporal convolution and pointwise (1x1) convolution, to solve the length of stay prediction task on the eICU critical care dataset. The model - which we refer to as Temporal Pointwise Convolution (TPC) - is specifically designed to mitigate common challenges with Electronic Health Records, such as skewness, irregular sampling and missing data. In doing so, we achieve significant performance benefits of 18-51% (metric dependent) over the commonly used Long Short-Term Memory (LSTM) network and the multi-head self-attention network known as the Transformer.


Predicting Length of Stay in the Intensive Care Unit with Temporal Pointwise Convolutional Networks

arXiv.org Machine Learning

The pressure of ever-increasing patient demand and budget restrictions makes hospital bed management a daily challenge for clinical staff. Most critical is the efficient allocation of resource-heavy Intensive Care Unit (ICU) beds to the patients who need life support. Central to solving this problem is knowing how long the current set of ICU patients is likely to stay in the unit. In this work, we propose a new deep learning model, based on the combination of temporal convolution and pointwise (1x1) convolution, to solve the length of stay prediction task on the eICU critical care dataset. The model - which we refer to as Temporal Pointwise Convolution (TPC) - is specifically designed to mitigate common challenges with Electronic Health Records, such as skewness, irregular sampling and missing data. In doing so, we achieve significant performance benefits of 18-51% (metric dependent) over the commonly used Long Short-Term Memory (LSTM) network and the multi-head self-attention network known as the Transformer.
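As a rough sketch of how temporal and pointwise (1x1) convolutions can be combined for the length-of-stay task described in the two entries above, the block below applies a depthwise temporal convolution over the time axis followed by a pointwise convolution that mixes features at each time step. It is one plausible arrangement, not the authors' TPC architecture; channel counts, kernel size, and dilation are illustrative assumptions.

```python
# Illustrative temporal + pointwise convolution block (not the authors' TPC code).
import torch
import torch.nn as nn

class TemporalPointwiseBlock(nn.Module):
    def __init__(self, channels: int = 32, kernel_size: int = 3, dilation: int = 1):
        super().__init__()
        # Depthwise temporal convolution: each feature gets its own filter over time.
        self.temporal = nn.Conv1d(
            channels, channels, kernel_size,
            padding=(kernel_size - 1) * dilation // 2,
            dilation=dilation, groups=channels,
        )
        # Pointwise (1x1) convolution: mixes information across features at one time step.
        self.pointwise = nn.Conv1d(channels, channels, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, features, time) EHR time series, imputed/masked upstream
        return self.act(self.pointwise(self.temporal(x)))

block = TemporalPointwiseBlock()
out = block(torch.randn(4, 32, 48))   # e.g. 48 hourly time steps per patient
```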