dermatology
- Oceania > Australia (0.04)
- South America > Colombia (0.04)
- Oceania > New Zealand (0.04)
- (3 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
DermETAS-SNA LLM: A Dermatology Focused Evolutionary Transformer Architecture Search with StackNet Augmented LLM Assistant
Oruganty, Nitya Phani Santosh, Murali, Keerthi Vemula, Ngan, Chun-Kit, Pinho, Paulo Bandeira
Our work introduces the DermETAS-SNA LLM Assistant that integrates Dermatology-focused Evolutionary Transformer Architecture Search with StackNet Augmented LLM. The assistant dynamically learns skin-disease classifiers and provides medically informed descriptions to facilitate clinician-patient interpretation. Contributions include: (1) Developed an ETAS framework on the SKINCON dataset to optimize a Vision Transformer (ViT) tailored for dermatological feature representation and then fine-tuned binary classifiers for each of the 23 skin disease categories in the DermNet dataset to enhance classification performance; (2) Designed a StackNet architecture that integrates multiple fine-tuned binary ViT classifiers to enhance predictive robustness and mitigate class imbalance issues; (3) Implemented a RAG pipeline, termed Diagnostic Explanation and Retrieval Model for Dermatology, which harnesses the capabilities of the Google Gemini 2.5 Pro LLM architecture to generate personalized, contextually informed diagnostic descriptions and explanations for patients, leveraging a repository of verified dermatological materials; (4) Performed extensive experimental evaluations on 23 skin disease categories to demonstrate performance increase, achieving an overall F1-score of 56.30% that surpasses SkinGPT-4 (48.51%) by a considerable margin, representing a performance increase of 16.06%; (5) Conducted a domain-expert evaluation, with eight licensed medical doctors, of the clinical responses generated by our AI assistant for seven dermatological conditions. Our results show a 92% agreement rate with the assessments provided by our AI assistant (6) Created a proof-of-concept prototype that fully integrates our DermETAS-SNA LLM into our AI assistant to demonstrate its practical feasibility for real-world clinical and educational applications.
- North America > United States > Massachusetts (0.04)
- North America > United States > New Jersey (0.04)
- Asia > Indonesia > Bali (0.04)
DIsoN: Decentralized Isolation Networks for Out-of-Distribution Detection in Medical Imaging
Wagner, Felix, Saha, Pramit, Anthony, Harry, Noble, J. Alison, Kamnitsas, Konstantinos
Safe deployment of machine learning (ML) models in safety-critical domains such as medical imaging requires detecting inputs with characteristics not seen during training, known as out-of-distribution (OOD) detection, to prevent unreliable predictions. Effective OOD detection after deployment could benefit from access to the training data, enabling direct comparison between test samples and the training data distribution to identify differences. State-of-the-art OOD detection methods, however, either discard the training data after deployment or assume that test samples and training data are centrally stored together, an assumption that rarely holds in real-world settings. This is because shipping the training data with the deployed model is usually impossible due to the size of training databases, as well as proprietary or privacy constraints. We introduce the Isolation Network, an OOD detection framework that quantifies the difficulty of separating a target test sample from the training data by solving a binary classification task. We then propose Decentralized Isolation Networks (DIsoN), which enables the comparison of training and test data when data-sharing is impossible, by exchanging only model parameters between the remote computational nodes of training and deployment. We further extend DIsoN with class-conditioning, comparing a target sample solely with training data of its predicted class. We evaluate DIsoN on four medical imaging datasets (dermatology, chest X-ray, breast ultrasound, histopathology) across 12 OOD detection tasks. DIsoN performs favorably against existing methods while respecting data-privacy. This decentralized OOD detection framework opens the way for a new type of service that ML developers could provide along with their models: providing remote, secure utilization of their training data for OOD detection services. Code: https://github.com/FelixWag/DIsoN
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Oceania > Australia (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- South America > Colombia (0.04)
- (4 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
MedGemma Technical Report
Sellergren, Andrew, Kazemzadeh, Sahar, Jaroensri, Tiam, Kiraly, Atilla, Traverse, Madeleine, Kohlberger, Timo, Xu, Shawn, Jamil, Fayaz, Hughes, Cían, Lau, Charles, Chen, Justin, Mahvar, Fereshteh, Yatziv, Liron, Chen, Tiffany, Sterling, Bram, Baby, Stefanie Anna, Baby, Susanna Maria, Lai, Jeremy, Schmidgall, Samuel, Yang, Lu, Chen, Kejia, Bjornsson, Per, Reddy, Shashir, Brush, Ryan, Philbrick, Kenneth, Asiedu, Mercy, Mezerreg, Ines, Hu, Howard, Yang, Howard, Tiwari, Richa, Jansen, Sunny, Singh, Preeti, Liu, Yun, Azizi, Shekoofeh, Kamath, Aishwarya, Ferret, Johan, Pathak, Shreya, Vieillard, Nino, Merhej, Ramona, Perrin, Sarah, Matejovicova, Tatiana, Ramé, Alexandre, Riviere, Morgane, Rouillard, Louis, Mesnard, Thomas, Cideron, Geoffrey, Grill, Jean-bastien, Ramos, Sabela, Yvinec, Edouard, Casbon, Michelle, Buchatskaya, Elena, Alayrac, Jean-Baptiste, Lepikhin, Dmitry, Feinberg, Vlad, Borgeaud, Sebastian, Andreev, Alek, Hardin, Cassidy, Dadashi, Robert, Hussenot, Léonard, Joulin, Armand, Bachem, Olivier, Matias, Yossi, Chou, Katherine, Hassidim, Avinatan, Goel, Kavi, Farabet, Clement, Barral, Joelle, Warkentin, Tris, Shlens, Jonathon, Fleet, David, Cotruta, Victor, Sanseviero, Omar, Martins, Gus, Kirk, Phoebe, Rao, Anand, Shetty, Shravya, Steiner, David F., Kirmizibayrak, Can, Pilgrim, Rory, Golden, Daniel, Yang, Lin
Artificial intelligence (AI) has significant potential in healthcare applications, but its training and deployment faces challenges due to healthcare's diverse data, complex tasks, and the need to preserve privacy. Foundation models that perform well on medical tasks and require less task-specific tuning data are critical to accelerate the development of healthcare AI applications. We introduce MedGemma, a collection of medical vision-language foundation models based on Gemma 3 4B and 27B. MedGemma demonstrates advanced medical understanding and reasoning on images and text, significantly exceeding the performance of similar-sized generative models and approaching the performance of task-specific models, while maintaining the general capabilities of the Gemma 3 base models. For out-of-distribution tasks, MedGemma achieves 2.6-10% improvement on medical multimodal question answering, 15.5-18.1% improvement on chest X-ray finding classification, and 10.8% improvement on agentic evaluations compared to the base models. Fine-tuning MedGemma further improves performance in subdomains, reducing errors in electronic health record information retrieval by 50% and reaching comparable performance to existing specialized state-of-the-art methods for pneumothorax classification and histopathology patch classification. We additionally introduce MedSigLIP, a medically-tuned vision encoder derived from SigLIP. MedSigLIP powers the visual understanding capabilities of MedGemma and as an encoder achieves comparable or better performance than specialized medical image encoders. Taken together, the MedGemma collection provides a strong foundation of medical image and text capabilities, with potential to significantly accelerate medical research and development of downstream applications. The MedGemma collection, including tutorials and model weights, can be found at https://goo.gle/medgemma.
- North America > United States > Hawaii (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Divergent Realities: A Comparative Analysis of Human Expert vs. Artificial Intelligence Based Generation and Evaluation of Treatment Plans in Dermatology
Sengupta, Dipayan, Panda, Saumya
Background: Evaluating AI-generated treatment plans is a key challenge as AI expands beyond diagnostics, especially with new reasoning models. This study compares plans from human experts and two AI models (a generalist and a reasoner), assessed by both human peers and a superior AI judge. Methods: Ten dermatologists, a generalist AI (GPT-4o), and a reasoning AI (o3) generated treatment plans for five complex dermatology cases. The anonymized, normalized plans were scored in two phases: 1) by the ten human experts, and 2) by a superior AI judge (Gemini 2.5 Pro) using an identical rubric. Results: A profound 'evaluator effect' was observed. Human experts scored peer-generated plans significantly higher than AI plans (mean 7.62 vs. 7.16; p=0.0313), ranking GPT-4o 6th (mean 7.38) and the reasoning model, o3, 11th (mean 6.97). Conversely, the AI judge produced a complete inversion, scoring AI plans significantly higher than human plans (mean 7.75 vs. 6.79; p=0.0313). It ranked o3 1st (mean 8.20) and GPT-4o 2nd, placing all human experts lower. Conclusions: The perceived quality of a clinical plan is fundamentally dependent on the evaluator's nature. An advanced reasoning AI, ranked poorly by human experts, was judged as superior by a sophisticated AI, revealing a deep gap between experience-based clinical heuristics and data-driven algorithmic logic. This paradox presents a critical challenge for AI integration, suggesting the future requires synergistic, explainable human-AI systems that bridge this reasoning gap to augment clinical care.
- Asia > India > West Bengal > Kolkata (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.87)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Bayesian generative models can flag performance loss, bias, and out-of-distribution image content
López-Pérez, Miguel, Miani, Marco, Naranjo, Valery, Hauberg, Søren, Feragen, Aasa
Generative models are popular for medical imaging tasks such as anomaly detection, feature extraction, data visualization, or image generation. Since they are parameterized by deep learning models, they are often sensitive to distribution shifts and unreliable when applied to out-of-distribution data, creating a risk of, e.g. underrepresentation bias. This behavior can be flagged using uncertainty quantification methods for generative models, but their availability remains limited. We propose SLUG: A new UQ method for VAEs that combines recent advances in Laplace approximations with stochastic trace estimators to scale gracefully with image dimensionality. We show that our UQ score -- unlike the VAE's encoder variances -- correlates strongly with reconstruction error and racial underrepresentation bias for dermatological images. We also show how pixel-wise uncertainty can detect out-of-distribution image content such as ink, rulers, and patches, which is known to induce learning shortcuts in predictive models.
- Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
- Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
- Africa > Sub-Saharan Africa (0.04)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Dermatology (0.72)
DermaSynth: Rich Synthetic Image-Text Pairs Using Open Access Dermatology Datasets
Yilmaz, Abdurrahim, Yuceyalcin, Furkan, Gokyayla, Ece, Choi, Donghee, Demircali, Ozan Erdem Ali Anil, Varol, Rahmetullah, Kirabali, Ufuk Gorkem, Gencoglan, Gulsum, Posma, Joram M., Temelkuran, Burak
A major barrier to developing vision large language models (LLMs) in dermatology is the lack of large image--text pairs dataset. We introduce DermaSynth, a dataset comprising of 92,020 synthetic image--text pairs curated from 45,205 images (13,568 clinical and 35,561 dermatoscopic) for dermatology-related clinical tasks. Leveraging state-of-the-art LLMs, using Gemini 2.0, we used clinically related prompts and self-instruct method to generate diverse and rich synthetic texts. Metadata of the datasets were incorporated into the input prompts by targeting to reduce potential hallucinations. The resulting dataset builds upon open access dermatological image repositories (DERM12345, BCN20000, PAD-UFES-20, SCIN, and HIBA) that have permissive CC-BY-4.0 licenses. We also fine-tuned a preliminary Llama-3.2-11B-Vision-Instruct model, DermatoLlama 1.0, on 5,000 samples. We anticipate this dataset to support and accelerate AI research in dermatology. Data and code underlying this work are accessible at https://github.com/abdurrahimyilmaz/DermaSynth.
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- South America > Argentina (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
VisTabNet: Adapting Vision Transformers for Tabular Data
Wydmański, Witold, Movsum-zada, Ulvi, Tabor, Jacek, Śmieja, Marek
Although deep learning models have had great success in natural language processing and computer vision, we do not observe comparable improvements in the case of tabular data, which is still the most common data type used in biological, industrial and financial applications. In particular, it is challenging to transfer large-scale pre-trained models to downstream tasks defined on small tabular datasets. To address this, we propose VisTabNet -- a cross-modal transfer learning method, which allows for adapting Vision Transformer (ViT) with pre-trained weights to process tabular data. By projecting tabular inputs to patch embeddings acceptable by ViT, we can directly apply a pre-trained Transformer Encoder to tabular inputs. This approach eliminates the conceptual cost of designing a suitable architecture for processing tabular data, while reducing the computational cost of training the model from scratch. Experimental results on multiple small tabular datasets (less than 1k samples) demonstrate VisTabNet's superiority, outperforming both traditional ensemble methods and recent deep learning models. The proposed method goes beyond conventional transfer learning practice and shows that pre-trained image models can be transferred to solve tabular problems, extending the boundaries of transfer learning.
- North America > United States > Wisconsin (0.04)
- Europe > Poland > Lesser Poland Province > Kraków (0.04)
- Health & Medicine (0.69)
- Information Technology > Software (0.34)
Towards Scalable Foundation Models for Digital Dermatology
Gröger, Fabian, Gottfrois, Philippe, Amruthalingam, Ludovic, Gonzalez-Jimenez, Alvaro, Lionetti, Simone, Soenksen-Martinez, Luis R., Navarini, Alexander A., Pouly, Marc
The growing demand for accurate and equitable AI models in digital dermatology faces a significant challenge: the lack of diverse, high-quality labeled data. In this work, we investigate the potential of domain-specific foundation models for dermatology in addressing this challenge. We utilize self-supervised learning (SSL) techniques to pre-train models on a dataset of over 240,000 dermatological images from public and private collections. Our study considers several SSL methods and compares the resulting foundation models against domain-agnostic models like those pre-trained on ImageNet and state-of-the-art models such as MONET across 12 downstream tasks. Unlike previous research, we emphasize the development of smaller models that are more suitable for resource-limited clinical settings, facilitating easier adaptation to a broad range of use cases. Results show that models pre-trained in this work not only outperform general-purpose models but also approach the performance of models 50 times larger on clinically relevant diagnostic tasks. To promote further research in this direction, we publicly release both the training code and the foundation models, which can benefit clinicians in dermatological applications.
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Portugal (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)