

Pic2Diagnosis: A Method for Diagnosis of Cardiovascular Diseases from the Printed ECG Pictures

Büyüksolak, Oğuzhan, Öksüz, İlkay

arXiv.org Artificial Intelligence

The electrocardiogram (ECG) is a vital tool for diagnosing heart diseases. However, many diagnostic approaches rely on outdated datasets and traditional stepwise algorithms with limited accuracy. This study presents a method for direct cardiovascular disease (CVD) diagnosis from ECG images, eliminating the need for digitization. The proposed approach utilizes a two-step curriculum learning framework, beginning with the pre-training of a classification model on segmentation masks, followed by fine-tuning on grayscale, inverted ECG images. Robustness is further enhanced through an ensemble of three models with averaged outputs, achieving an AUC of 0.9534 and an F1 score of 0.7801 on the BHF ECG Challenge dataset, outperforming individual models. By effectively handling real-world artifacts and simplifying the diagnostic process, this method offers a reliable solution for automated CVD diagnosis, particularly in resource-limited settings where printed or scanned ECG images are commonly used. Such an automated procedure enables rapid and accurate diagnosis, which is critical for timely intervention in CVD cases that often demand urgent care.
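The ensemble step described above (three models with averaged outputs) amounts to soft voting over per-class probabilities. A minimal sketch in plain Python, with hypothetical probability vectors standing in for the three fine-tuned classifiers:

```python
def ensemble_average(model_outputs):
    """Average per-class probabilities from several models (soft voting)."""
    n_models = len(model_outputs)
    n_classes = len(model_outputs[0])
    return [
        sum(m[c] for m in model_outputs) / n_models
        for c in range(n_classes)
    ]

# Hypothetical per-class probabilities from three classifiers
# (the paper's actual models and class set are not reproduced here)
preds = [
    [0.70, 0.30],
    [0.60, 0.40],
    [0.80, 0.20],
]
avg = ensemble_average(preds)
predicted_class = max(range(len(avg)), key=lambda c: avg[c])
```

Averaging before the argmax lets a confident minority model pull the decision, which is why ensembles of this kind tend to be more robust to per-model failure cases than majority voting on hard labels.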


Beyond MedQA: Towards Real-world Clinical Decision Making in the Era of LLMs

Xiao, Yunpeng, Yang, Carl, Mai, Mark, Hu, Xiao, Shu, Kai

arXiv.org Artificial Intelligence

Large language models (LLMs) show promise for clinical use, and they are often evaluated on datasets such as MedQA. However, many such datasets rely on simplified question answering (Q&A) that underrepresents real-world clinical decision-making. Motivated by this, we propose a unifying paradigm that characterizes clinical decision-making tasks along two dimensions: Clinical Backgrounds and Clinical Questions. As the background and questions approach the real clinical environment, the difficulty increases. We summarize the settings of existing datasets and benchmarks along these two dimensions. We then review methods to address clinical decision-making, including training-time and test-time techniques, and summarize when they help. Next, we extend evaluation beyond accuracy to include efficiency and explainability. Finally, we highlight open challenges. Our paradigm clarifies assumptions, standardizes comparisons, and guides the development of clinically meaningful LLMs.


A Deep Learning Framework for Real-Time Image Processing in Medical Diagnostics: Enhancing Accuracy and Speed in Clinical Applications

Filvantorkaman, Melika, Torkaman, Maral Filvan

arXiv.org Artificial Intelligence

Medical imaging plays a vital role in modern diagnostics; however, interpreting high-resolution radiological data remains time-consuming and susceptible to variability among clinicians. Traditional image processing techniques often lack the precision, robustness, and speed required for real-time clinical use. To overcome these limitations, this paper introduces a deep learning framework for real-time medical image analysis designed to enhance diagnostic accuracy and computational efficiency across multiple imaging modalities, including X-ray, CT, and MRI. The proposed system integrates advanced neural network architectures such as U-Net, EfficientNet, and Transformer-based models with real-time optimization strategies including model pruning, quantization, and GPU acceleration. The framework enables flexible deployment on edge devices, local servers, and cloud infrastructures, ensuring seamless interoperability with clinical systems such as PACS and EHR. Experimental evaluations on public benchmark datasets demonstrate state-of-the-art performance, achieving classification accuracies above 92%, segmentation Dice scores exceeding 91%, and inference times below 80 milliseconds. Furthermore, visual explanation tools such as Grad-CAM and segmentation overlays enhance transparency and clinical interpretability. These results indicate that the proposed framework can substantially accelerate diagnostic workflows, reduce clinician workload, and support trustworthy AI integration in time-critical healthcare environments.
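One of the real-time optimization strategies named above, quantization, can be illustrated with a minimal symmetric int8 post-training scheme in plain Python (the framework's actual quantization pipeline is not specified in the abstract; this sketch only shows the basic scale-and-round idea on a hypothetical weight list):

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to [-128, 127] via a shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02]        # hypothetical layer weights
q, s = quantize_int8(w)        # int8 codes plus one float scale
w_hat = dequantize(q, s)       # approximate reconstruction
```

Storing one byte per weight plus a single scale per tensor is what shrinks models roughly 4x versus float32 and, on hardware with int8 kernels, is a common route to the sub-100 ms inference times the paper reports.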


Adaptive Reasoning and Acting in Medical Language Agents

Dutta, Abhishek, Hsiao, Yen-Che

arXiv.org Artificial Intelligence

This paper presents an innovative large language model (LLM) agent framework for enhancing diagnostic accuracy in simulated clinical environments using the AgentClinic benchmark. The proposed automatic correction enables doctor agents to iteratively refine their reasoning and actions following incorrect diagnoses, fostering improved decision-making over time. Experiments show that adaptive LLM-based doctor agents achieve correct diagnoses through dynamic interactions with simulated patients. The evaluations highlight the capacity of autonomous agents to adapt and improve in complex medical scenarios. Future enhancements will focus on refining the algorithm and expanding its applicability across a wider range of tasks and different large language models.


AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments

Schmidgall, Samuel, Ziaei, Rojin, Harris, Carl, Reis, Eduardo, Jopling, Jeffrey, Moor, Michael

arXiv.org Artificial Intelligence

Diagnosing and managing a patient is a complex, sequential decision-making process that requires physicians to obtain information -- such as which tests to perform -- and to act upon it. Recent advances in artificial intelligence (AI) and large language models (LLMs) promise to profoundly impact clinical care. However, current evaluation schemes over-rely on static medical question-answering benchmarks, falling short on the interactive decision-making required in real-life clinical work. Here, we present AgentClinic: a multimodal benchmark to evaluate LLMs in their ability to operate as agents in simulated clinical environments. In our benchmark, the doctor agent must uncover the patient's diagnosis through dialogue and active data collection. We present two open medical agent benchmarks: a multimodal image and dialogue environment, AgentClinic-NEJM, and a dialogue-only environment, AgentClinic-MedQA. We embed cognitive and implicit biases in both patient and doctor agents to emulate realistic interactions between biased agents. We find that introducing bias leads to large reductions in the diagnostic accuracy of the doctor agents, as well as reduced compliance, confidence, and follow-up consultation willingness in patient agents. Evaluating a suite of state-of-the-art LLMs, we find that several models that excel in benchmarks like MedQA perform poorly in AgentClinic-MedQA. We find that the LLM used in the patient agent is an important factor for performance in the AgentClinic benchmark. We show that both too few and too many interactions reduce diagnostic accuracy in doctor agents. The code and data for this work are publicly available at https://AgentClinic.github.io.


Artificial intelligence projects in healthcare: 10 practical tips for success in a clinical environment

#artificialintelligence

There is much discussion concerning ‘digital transformation’ in healthcare and the potential of artificial intelligence (AI) in healthcare systems. Yet it remains rare to find AI solutions deployed in routine healthcare settings. This is in part due to the numerous challenges inherent in delivering an AI project in a clinical environment. In this article, several UK healthcare professionals and academics reflect on the challenges they have faced in building AI solutions using routinely collected healthcare data. These personal reflections are summarised as 10 practical tips. In our experience, these are essential considerations for an AI healthcare project to succeed. They are organised into four phases: conceptualisation, data management, AI application and clinical deployment. There is a focus on conceptualisation, reflecting our view that initial set-up is vital to success. We hope that our personal experiences will provide useful insights to others looking to improve patient care through optimal data use. No data are available to share.


FedDICE: A ransomware spread detection in a distributed integrated clinical environment using federated learning and SDN based mitigation

Thapa, Chandra, Karmakar, Kallol Krishna, Celdran, Alberto Huertas, Camtepe, Seyit, Varadharajan, Vijay, Nepal, Surya

arXiv.org Artificial Intelligence

An integrated clinical environment (ICE) enables the connection and coordination of the internet of medical things around the care of patients in hospitals. However, ransomware attacks and their spread on hospital infrastructures, including ICE, are rising. Often the adversaries target multiple hospitals with the same ransomware attacks. These attacks are detected by using machine learning algorithms. The challenge is devising anti-ransomware learning mechanisms and services under the following conditions: (1) provide immunity to other hospitals if one of them suffers an attack, (2) hospitals are usually distributed over geographical locations, and (3) direct data sharing is avoided due to privacy concerns. In this regard, this paper presents a federated distributed integrated clinical environment, aka FedDICE. FedDICE integrates federated learning (FL), which is privacy-preserving learning, into an SDN-oriented security architecture to enable collaborative learning, detection, and mitigation of ransomware attacks. We demonstrate the importance of FedDICE in a collaborative environment with up to four hospitals and four popular ransomware families, namely WannaCry, Petya, BadRabbit, and PowerGhost. Our results show that in both IID and non-IID data setups, FedDICE achieves the performance of the centralized baseline, which requires direct data sharing for detection. However, as a trade-off for data privacy, FedDICE incurs overhead in anti-ransomware model training, e.g., 28x for the logistic regression model. Besides, FedDICE utilizes SDN's dynamic network programmability feature to remove infected devices from the ICE.
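The federated learning step underlying a setup like FedDICE is typically federated averaging (FedAvg): each hospital trains locally, then a coordinator combines parameters weighted by local dataset size, so raw data never leaves a site. A minimal sketch in plain Python, with hypothetical logistic-regression coefficients from three hospitals (the paper's actual models and aggregation details are not reproduced here):

```python
def fed_avg(client_weights, client_sizes):
    """Combine client model parameters by a dataset-size-weighted average."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Hypothetical local coefficients from three hospitals
clients = [[0.2, -0.1], [0.4, 0.0], [0.3, 0.1]]
sizes = [100, 200, 100]  # local training-set sizes
global_w = fed_avg(clients, sizes)
```

Only the parameter vectors and counts cross hospital boundaries, which is what addresses condition (3) above; the training overhead the paper reports comes from iterating this local-train/aggregate cycle over many communication rounds.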


AI outperforms humans in creating cancer treatments, but do doctors trust it?

#artificialintelligence

The impact of deploying Artificial Intelligence (AI) for radiation cancer therapy in a real-world clinical setting has been tested by Princess Margaret researchers in a unique study involving physicians and their patients. A team of researchers directly compared physician evaluations of radiation treatments generated by an AI machine learning (ML) algorithm to conventional radiation treatments generated by humans. They found that in the majority of the 100 patients studied, treatments generated using ML were deemed clinically acceptable by physicians. Overall, 89% of ML-generated treatments were considered clinically acceptable, and 72% were selected over conventional human-generated treatments in head-to-head comparisons. Moreover, the ML radiation treatment process was 60% faster than the conventional human-driven process, reducing the overall time from 118 hours to 47 hours.


Clinical management of sepsis can be improved by artificial intelligence: yes

#artificialintelligence

The management of sepsis is a highly complex, multifaceted challenge that remains the realm of highly skilled and trained human experts. But as medical applications of artificial intelligence continue to pour in, it is becoming obvious that some of these decisions could soon be left to machines that could be dubbed "intelligent", improving clinical practice and patient outcomes [1]. Indeed, most of the tasks involved in the clinical management of sepsis (early recognition, selection of antibiotic therapy, haemodynamic optimisation, etc.) could be individually performed or optimised by dedicated algorithms. Most of what we call "artificial intelligence" is in fact machine learning--a set of computer tools intended to generate new knowledge from data [1]. Machine learning includes three categories of techniques: supervised (which uses labelled data to build a prediction model, for example for prognostication), unsupervised (which discovers patterns in data and generates clusters of subjects that share common characteristics) and reinforcement learning (where a sequential decision process is modelled and optimised). Below, I have selected a few significant applications that I consider the most likely to land in the clinical environment in the near future, either because of their robustness or their potential.