AITopics | Matos, João

Collaborating Authors

Matos, João

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Classification of Keratitis from Eye Corneal Photographs using Deep Learning

Beirão, Maria Miguel, Matos, João, Gonçalves, Tiago, Kase, Camila, Nakayama, Luis Filipe, de Freitas, Denise, Cardoso, Jaime S.

arXiv.org Artificial IntelligenceNov-13-2024

Keratitis is an inflammatory corneal condition responsible for 10% of visual impairment in low- and middle-income countries (LMICs), with bacteria, fungi, or amoeba as the most common infection etiologies. While an accurate and timely diagnosis is crucial for the selected treatment and the patients' sight outcomes, due to the high cost and limited availability of laboratory diagnostics in LMICs, diagnosis is often made by clinical observation alone, despite its lower accuracy. In this study, we investigate and compare different deep learning approaches to diagnose the source of infection: 1) three separate binary models for infection type predictions; 2) a multitask model with a shared backbone and three parallel classification layers (Multitask V1); and, 3) a multitask model with a shared backbone and a multi-head classification layer (Multitask V2). We used a private Brazilian cornea dataset to conduct the empirical evaluation. We achieved the best results with Multitask V2, with an area under the receiver operating characteristic curve (AUROC) confidence intervals of 0.7413-0.7740 (bacteria), 0.8395-0.8725 (fungi), and 0.9448-0.9616 (amoeba). A statistical analysis of the impact of patient features on models' performance revealed that sex significantly affects amoeba infection prediction, and age seems to affect fungi and bacteria predictions.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.08935

Country:

Europe (0.69)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation

Matos, João, Chen, Shan, Placino, Siena, Li, Yingya, Pardo, Juan Carlos Climent, Idan, Daphna, Tohyama, Takeshi, Restrepo, David, Nakayama, Luis F., Pascual-Leone, Jose M. M., Savova, Guergana, Aerts, Hugo, Celi, Leo A., Wong, A. Ian, Bitterman, Danielle S., Gallifant, Jack

arXiv.org Artificial IntelligenceOct-16-2024

Multimodal/vision language models (VLMs) are increasingly being deployed in healthcare settings worldwide, necessitating robust benchmarks to ensure their safety, efficacy, and fairness. Multiple-choice question and answer (QA) datasets derived from national medical examinations have long served as valuable evaluation tools, but existing datasets are largely text-only and available in a limited subset of languages and countries. To address these challenges, we present WorldMedQA-V, an updated multilingual, multimodal benchmarking dataset designed to evaluate VLMs in healthcare. WorldMedQA-V includes 568 labeled multiple-choice QAs paired with 568 medical images from four countries (Brazil, Israel, Japan, and Spain), covering original languages and validated English translations by native clinicians, respectively. Baseline performance for common open- and closed-source models are provided in the local language and English translations, and with and without images provided to the model. The WorldMedQA-V benchmark aims to better match AI systems to the diverse healthcare environments in which they are deployed, fostering more equitable, effective, and representative applications.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.12722

Country:

Europe > Spain (0.37)
Asia > Japan (0.37)
South America > Brazil (0.36)
Asia > Middle East > Israel (0.27)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

EHRmonize: A Framework for Medical Concept Abstraction from Electronic Health Records using Large Language Models

Matos, João, Gallifant, Jack, Pei, Jian, Wong, A. Ian

arXiv.org Artificial IntelligenceJun-28-2024

Electronic health records (EHRs) contain vast amounts of complex data, but harmonizing and processing this information remains a challenging and costly task requiring significant clinical expertise. While large language models (LLMs) have shown promise in various healthcare applications, their potential for abstracting medical concepts from EHRs remains largely unexplored. We introduce EHRmonize, a framework leveraging LLMs to abstract medical concepts from EHR data. Our study uses medication data from two real-world EHR databases to evaluate five LLMs on two free-text extraction and six binary classification tasks across various prompting strategies. GPT-4o's with 10-shot prompting achieved the highest performance in all tasks, accompanied by Claude-3.5-Sonnet in a subset of tasks. GPT-4o achieved an accuracy of 97% in identifying generic route names, 82% for generic drug names, and 100% in performing binary classification of antibiotics. While EHRmonize significantly enhances efficiency, reducing annotation time by an estimated 60%, we emphasize that clinician oversight remains essential. Our framework, available as a Python package, offers a promising tool to assist clinicians in EHR data abstraction, potentially accelerating healthcare research and improving data harmonization processes.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2407.00242

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback