 positive predictive value


High-performance automated abstract screening with large language model ensembles

Sanghera, Rohan, Thirunavukarasu, Arun James, Khoury, Marc El, O'Logbon, Jessica, Chen, Yuqing, Watt, Archie, Mahmood, Mustafa, Butt, Hamid, Nishimura, George, Soltan, Andrew

arXiv.org Artificial Intelligence

Large language models (LLMs) excel in tasks requiring processing and interpretation of input text. Abstract screening is a labour-intensive component of systematic review involving repetitive application of inclusion and exclusion criteria on a large volume of studies identified by a literature search. Here, LLMs (GPT-3.5 Turbo, GPT-4 Turbo, GPT-4o, Llama 3 70B, Gemini 1.5 Pro, and Claude Sonnet 3.5) were trialled on systematic reviews in a full issue of the Cochrane Library to evaluate their accuracy in zero-shot binary classification for abstract screening. Trials over a subset of 800 records identified optimal prompting strategies and demonstrated superior performance of LLMs to human researchers in terms of sensitivity (LLM-max = 1.000, human-max = 0.775), precision (LLM-max = 0.927, human-max = 0.911), and balanced accuracy (LLM-max = 0.904, human-max = 0.865). The best performing LLM-prompt combinations were trialled across every replicated search result (n = 119,691), and exhibited consistent sensitivity (range 0.756-1.000) but diminished precision (range 0.004-0.096). 66 LLM-human and LLM-LLM ensembles exhibited perfect sensitivity with a maximal precision of 0.458, with less observed performance drop in larger trials. Significant variation in performance was observed between reviews, highlighting the importance of domain-specific validation before deployment. LLMs may reduce the human labour cost of systematic review with maintained or improved accuracy and sensitivity. Systematic review is the foundation of evidence synthesis across academic disciplines, including evidence-based medicine, and LLMs may increase the efficiency and quality of this mode of research.
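The screening metrics quoted in this abstract (sensitivity, precision, balanced accuracy) can all be derived from a confusion matrix. The sketch below is illustrative only: the function names and the counts are hypothetical and are not taken from the study. The second function applies Bayes' rule to show a general property of precision (positive predictive value): it falls as the proportion of truly relevant records falls, even when sensitivity and specificity stay fixed. The abstract does not state that this is the cause of the lower precision in the full-scale trial, so treat it as one possible contributing factor.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, precision (PPV), specificity and balanced accuracy
    from confusion-matrix counts (hypothetical helper, not from the paper)."""
    sensitivity = tp / (tp + fn)   # relevant abstracts correctly included
    precision = tp / (tp + fp)     # included abstracts that are truly relevant
    specificity = tn / (tn + fp)   # irrelevant abstracts correctly excluded
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Precision (PPV) implied by Bayes' rule for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Made-up counts for a screening run of 1,000 records.
metrics = screening_metrics(tp=90, fp=10, fn=10, tn=890)
print(metrics)

# Same classifier quality, but only 1% of records are truly relevant.
print(ppv_from_prevalence(sensitivity=0.9, specificity=0.9, prevalence=0.01))
```

With sensitivity and specificity both at 0.9, a prevalence of 1% yields a precision below 0.1, which is why precision is usually reported alongside the prevalence of the evaluation set.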


Validation of a new, minimally-invasive, software smartphone device to predict sleep apnea and its severity: transversal study

Frija, Justine, Millet, Juliette, Bequignon, Emilie, Covali, Ala, Cathelain, Guillaume, Houenou, Josselin, Benzaquen, Helene, Geoffroy, Pierre Alexis, Bacry, Emmanuel, Grajoszex, Mathieu, Ortho, Marie-Pia d

arXiv.org Artificial Intelligence

Obstructive sleep apnea (OSA) is frequent and responsible for cardiovascular complications and excessive daytime sleepiness. It is underdiagnosed because the gold standard for diagnosis, polysomnography (PSG), is difficult to access. Alternative methods using smartphone sensors could help increase diagnosis. The objective is to assess the performance of Apneal, an application that records sound using a smartphone's microphone and movements using the smartphone's accelerometer and gyroscope, in estimating patients' apnea-hypopnea index (AHI). In this article, we perform a monocentric proof-of-concept study in adult patients during in-hospital polysomnography, with a first manual scoring step and then automatic detection of respiratory events from the recorded signals using a sequential deep-learning model that was released internally at Apneal at the end of 2022 (version 0.1 of Apneal automatic scoring of respiratory events). Forty-six patients (34% women, mean BMI 28.7 kg/m²) were included. For AHI greater than 15, the sensitivity of manual scoring was 0.91 and the positive predictive value (PPV) was 0.89. For AHI greater than 30, sensitivity was 0.85 and PPV was 0.94. We obtained an AUC-ROC of 0.85 and an AUC-PR of 0.94 for the identification of AHI greater than 15, and an AUC-ROC of 0.95 and an AUC-PR of 0.93 for AHI greater than 30. Promising results are obtained for the automatic annotation of events. This article shows that manual scoring of smartphone-based signals is possible and accurate compared to PSG-based scoring. The automatic scoring method based on a deep learning model provides promising results. A larger multicentric validation study involving subjects with different SAHS severity is required to confirm these results.


A Priori Determination of the Pretest Probability

Balayla, Jacques

arXiv.org Artificial Intelligence

In this manuscript, we present various proposed methods to estimate the prevalence of disease, a critical prerequisite for the adequate interpretation of screening tests. To address the limitations of these approaches, which revolve primarily around their a posteriori nature, we introduce a novel method to estimate the pretest probability of disease, a priori, utilizing the Logit function from the logistic regression model. This approach is a modification of McGee's heuristic, originally designed for estimating the posttest probability of disease. In a patient presenting with $n_\theta$ signs or symptoms, the minimal bound of the pretest probability, $\phi$, can be approximated by: $\phi \approx \frac{1}{5}\ln\left[\displaystyle\prod_{\theta=1}^{i}\kappa_\theta\right]$ where $\ln$ is the natural logarithm, and $\kappa_\theta$ is the likelihood ratio associated with the sign or symptom in question.
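As a rough numerical illustration of the approximation above, the following sketch evaluates one fifth of the natural logarithm of the product of the likelihood ratios. The function name and the example likelihood ratios are hypothetical, and the code assumes the product runs over all of the patient's signs or symptoms. The abstract gives no guidance on values that fall outside the interval [0, 1], so the result is returned unclipped.

```python
import math


def pretest_probability_lower_bound(likelihood_ratios):
    """Approximate the minimal bound of the pretest probability as
    (1/5) * ln(product of likelihood ratios).

    Hypothetical helper: the name and input checks are assumptions,
    not part of the manuscript.
    """
    if not likelihood_ratios:
        raise ValueError("at least one likelihood ratio is required")
    if any(lr <= 0 for lr in likelihood_ratios):
        raise ValueError("likelihood ratios must be positive")
    # ln of a product equals the sum of the logs, which avoids overflow
    # when many ratios are multiplied together.
    log_product = sum(math.log(lr) for lr in likelihood_ratios)
    return log_product / 5


# Made-up likelihood ratios for two findings.
print(pretest_probability_lower_bound([2.0, 3.0]))
```

For the two made-up ratios the value is ln(6)/5, roughly 0.36. Ratios below 1 contribute negative terms, so the approximation is only meaningful when the combined evidence favours disease.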


Study: AI Improves Cancer Detection Rate for Digital Mammography and Digital Breast Tomosynthesis

#artificialintelligence

The use of adjunctive artificial intelligence (AI) doubled the positive predictive value (PPV) of digital mammography (DM) exams overall and led to greater than 90 percent accuracy for DM and digital breast tomosynthesis (DBT) in detecting breast cancer in women with elevated risk, according to research findings presented recently at the European Congress of Radiology (ECR) conference in Vienna, Austria. For the study, researchers compared the use of adjunctive AI (Transpara version 1.7.0, ScreenPoint Medical) in 11,988 women (between the ages of 50 and 74) who had DM or DBT screening exams versus 16,555 women screened with DM or DBT the year before without AI support. In the AI group, 5,049 women had DM screening with the Hologic Selenia device and 6,949 women had DBT screening with the Hologic Selenia Dimensions device, according to the study. For the non-AI cohort, 7,229 women had DM screening and 9,326 women had DBT.


Using natural language processing and structured medical data to phenotype patients hospitalized due to COVID-19

Chang, Feier, Krishnan, Jay, Hurst, Jillian H, Yarrington, Michael E, Anderson, Deverick J, O'Brien, Emily C, Goldstein, Benjamin A

arXiv.org Artificial Intelligence

To identify patients who were hospitalized because of COVID-19 as opposed to those who were admitted for other indications, we compared the performance of different computable phenotype definitions for COVID-19 hospitalizations that use different types of data from the electronic health record (EHR), including structured EHR data elements, provider notes, or a combination of both data types. We conducted a retrospective data analysis with chart review-based validation. Participants were 586 hospitalized individuals who tested positive for SARS-CoV-2 during January 2022. We used natural language processing to incorporate data from provider notes, and LASSO regression and Random Forests to fit classification algorithms that incorporated structured EHR data elements, provider notes, or a combination of structured data and provider notes. Based on chart review, 38% of the 586 patients were determined to be hospitalized for reasons other than COVID-19 despite having tested positive for SARS-CoV-2. A classification algorithm that used provider notes had significantly better discrimination than one that used structured EHR data elements (AUROC: 0.894 vs 0.841, p < 0.001), and performed similarly to a model that combined provider notes with structured data elements (AUROC: 0.894 vs 0.893). Assessments of hospital outcome metrics differed significantly based on whether the population included all hospitalized patients who tested positive for SARS-CoV-2 versus those who were determined to have been hospitalized due to COVID-19. This work demonstrates the utility of natural language processing approaches to derive information related to patient hospitalizations in cases where there may be multiple conditions that could serve as the primary indication for hospitalization.


New AI Can Automatically Detect a Serious Heart Condition

#artificialintelligence

With a 73 percent positive predictive value, the AI technique accurately identified 80 percent of the instances of plaque erosion. Researchers have created a brand-new artificial intelligence (AI) technique that uses optical coherence tomography (OCT) images to automatically detect plaque erosion in the arteries of the heart. Monitoring arterial plaque is crucial because, if it disintegrates, it may obstruct blood flow to the heart, triggering a heart attack or other dangerous problems. "If cholesterol plaque lining arteries starts to erode it can lead to a sudden reduction in blood flow to the heart known as acute coronary syndrome, which requires urgent treatment," said research team leader Zhao Wang from the University of Electronic Science and Technology of China. "Our new method could help improve the clinical diagnosis of plaque erosion and be used to develop new treatments for patients with heart disease."


@Radiology_AI

#artificialintelligence

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. To develop an artificial intelligence (AI)-based model to detect mitral regurgitation (MR) from chest radiographs. This retrospective study included echocardiography and associated chest radiographs consecutively collected at our institution between July 2016 and May 2019.


Development of Machine Learning Models to Predict Probabilities and Types of Stroke at Prehospital Stage: the Japan Urgent Stroke Triage Score Using Machine Learning (JUST-ML) - PubMed

#artificialintelligence

In conjunction with recent advancements in machine learning (ML), such technologies have been applied in various fields owing to their high predictive performance. We sought to develop a prehospital stroke scale with ML. We conducted a multi-center retrospective and prospective cohort study. The training cohort comprised eight centers in Japan from June 2015 to March 2018, and the test cohort comprised 13 centers from April 2019 to March 2020. We used three different ML algorithms (logistic regression, random forests, XGBoost) to develop models.


AI-Enabled ECG Helps Identify Heart Failure

#artificialintelligence

The article, "AI-Enabled ECG Improves Ability to Identify Heart Failure in Emergency Departments," was originally published on Practical Cardiology. An artificial intelligence (AI)-enabled electrocardiogram (ECG) could help clinicians in emergency departments more accurately identify heart failure. Findings from the study indicate the AI-enhanced ECG could improve identification of left ventricular systolic dysfunction in patients presenting to emergency departments with acute dyspnea. "AI-enhanced ECGs are quicker and outperform current standard-of-care tests. Our results suggest that high-risk cardiac patients can be identified quicker in the emergency department and provides an opportunity to link them early to appropriate cardiovascular care," said lead investigator Demilade Adedinsewo, MD, MPH, chief fellow in the division of cardiovascular medicine at Mayo Clinic in Jacksonville, Florida, in a statement.


High pooled performance of convolutional neural networks in computer-aided diagnosis of GI ulcers and/or hemorrhage on wireless capsule endoscopy images: a systematic review and meta-analysis

#artificialintelligence

Diagnosis of gastrointestinal (GI) ulcers and/or hemorrhage by wireless capsule endoscopy (WCE) is limited by the physician-dependent, tedious, time-consuming process of image and/or video classification. Computer-aided diagnosis (CAD) using machine learning based on convolutional neural networks (CNNs) may help reduce this burden. Our aim was to conduct a meta-analysis and appraise the reported data.