AITopics | malignant neoplasm

malignant neoplasm

Forecasting Oncology Demand Trends with Boosting-Based Bayesian Conjugate Models

Neto, Ademir Batista dos Santos, Ferreira, Tiago Alessandro Espinola, Firmino, Paulo Renato Alves

arXiv.org Machine Learning, May-8-2026

Accurate trend forecasting in healthcare time series is essential for planning and resource allocation. This paper proposes a Bayesian framework for predicting oncology demand trends, modeling weekly appointments as a Poisson process with a Gamma prior on the demand rate. To enhance adaptability and capture persistent directional patterns, we incorporate a residual-based boosting mechanism grounded in a Gamma-Log-Normal conjugate structure. This boosting approach allows the model to track both short- and long-term trend shifts while maintaining the analytical tractability of conjugate Bayesian updating. The methodology was evaluated on real oncology service data from Cariri, Ceará, Brazil, and compared against established baselines, including linear regression, ARIMA, naive forecasting, LSTM neural networks, and XGBoost. Results showed that the proposed model outperforms competing methods in trend detection accuracy, with gains in the percentage of correctly predicted directions of 38.25% over the second-best approach in some cases.
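
The Poisson-Gamma part of the abstract above can be illustrated with a minimal sketch of the standard conjugate update. It covers only the base model, not the paper's residual-based boosting step. The class name `GammaPosterior`, the prior values, and the weekly counts are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class GammaPosterior:
    """Gamma(shape, rate) belief about a weekly Poisson demand rate."""
    shape: float
    rate: float

    def update(self, weekly_counts):
        # Conjugate update: add the total count to the shape and the
        # number of observed weeks to the rate.
        counts = list(weekly_counts)
        return GammaPosterior(self.shape + sum(counts), self.rate + len(counts))

    @property
    def mean(self):
        # Posterior mean of the demand rate (expected appointments per week).
        return self.shape / self.rate


# Hypothetical prior and observations, not taken from the paper.
prior = GammaPosterior(shape=2.0, rate=1.0)
posterior = prior.update([12, 15, 9, 14])
print(posterior.mean)
```

Because the update is closed-form, each new week of data only adds to two running totals, which is the analytical tractability the abstract refers to.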

artificial intelligence, forecasting, machine learning, (17 more...)

arXiv.org Machine Learning

2605.0527

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.27)
South America > Brazil > Ceará (0.25)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

MIMIC-SR-ICD11: A Dataset for Narrative-Based Diagnosis

Wu, Yuexin, Wang, Shiqi, Rus, Vasile

arXiv.org Artificial Intelligence, Nov-10-2025

Disease diagnosis is a central pillar of modern healthcare, enabling early detection and timely intervention for acute conditions while guiding lifestyle adjustments and medication regimens to prevent or slow chronic disease. Self-reports preserve clinically salient signals that templated electronic health record (EHR) documentation often attenuates or omits, especially subtle but consequential details. To operationalize this shift, we introduce MIMIC-SR-ICD11, a large English diagnostic dataset built from EHR discharge notes and natively aligned to WHO ICD-11 terminology. We further present LL-Rank, a likelihood-based re-ranking framework that computes a length-normalized joint likelihood of each label given the clinical report context and subtracts the corresponding report-free prior likelihood for that label. Across seven model backbones, LL-Rank consistently outperforms a strong generation-plus-mapping baseline (GenMap). Ablation experiments show that LL-Rank's gains primarily stem from its PMI-based scoring, which isolates semantic compatibility from label frequency bias.
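
The scoring rule described for LL-Rank can be sketched as follows, assuming per-token log-probabilities for each candidate label are already available from some language model. The function names, the choice to length-normalize the prior term as well, and the example numbers are assumptions for illustration, not the authors' implementation.

```python
def length_normalized(logprobs):
    """Average per-token log-probability of a label's tokens."""
    return sum(logprobs) / len(logprobs)


def pmi_style_score(cond_logprobs, prior_logprobs):
    # Log-likelihood of the label given the report, minus the
    # report-free (prior) log-likelihood of the same label.
    return length_normalized(cond_logprobs) - length_normalized(prior_logprobs)


def rerank(candidates):
    """candidates maps label -> (log-probs with report, log-probs without)."""
    scores = {label: pmi_style_score(c, p) for label, (c, p) in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical token log-probabilities for two candidate labels.
candidates = {
    "common label": ([-0.5, -0.5], [-0.6, -0.6]),
    "rare label": ([-1.0, -1.0, -1.0], [-3.0, -3.0, -3.0]),
}
ranking = rerank(candidates)
print(ranking)
```

In this toy case the rare label ranks first even though its raw likelihood is lower, because the report raises its likelihood far more than it does for the frequent label. That is the frequency-bias correction the abstract attributes to PMI-based scoring.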

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.05485

Country:

North America > United States (0.46)
Asia > Middle East > UAE (0.28)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Nephrology (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Revealing Interconnections between Diseases: from Statistical Methods to Large Language Models

Ermilova, Alina, Kornilov, Dmitrii, Samoilova, Sofia, Laptenkova, Ekaterina, Kolesnikova, Anastasia, Podplutova, Ekaterina, Sofya, Senotrusova, Sharaev, Maksim G.

arXiv.org Artificial Intelligence, Oct-13-2025

Identifying disease interconnections through manual analysis of large-scale clinical data is labor-intensive, subjective, and prone to expert disagreement. While machine learning (ML) shows promise, three critical challenges remain: (1) selecting optimal methods from the vast ML landscape, (2) determining whether real-world clinical data (e.g., electronic health records, EHRs) or structured disease descriptions yield more reliable insights, (3) the lack of "ground truth," as some disease interconnections remain unexplored in medicine. Large language models (LLMs) demonstrate broad utility, yet they often lack specialized medical knowledge. Our framework integrates the following: (i) a statistical co-occurrence analysis and a masked language modeling (MLM) approach using real clinical data; (ii) domain-specific BERT variants (Med-BERT and BioClinicalBERT); (iii) a general-purpose BERT and document retrieval; and (iv) four LLMs (Mistral, DeepSeek, Qwen, and YandexGPT). Our graph-based comparison of the obtained interconnection matrices shows that the LLM-based approach produces interconnections with the lowest diversity of ICD code connections to different diseases compared to other methods, including text-based and domain-based approaches. This suggests an important implication: LLMs have limited potential for discovering new interconnections. In the absence of ground truth databases for medical interconnections between ICD codes, our results constitute a valuable medical disease ontology that can serve as a foundational resource for future clinical research and artificial intelligence applications in healthcare. Electronic health records (EHRs) provide a valuable resource for studying disease progression and relationships between diagnoses. Machine learning (ML) can help discover hidden patterns in medical data, but many existing models are hard to interpret. In particular, it is not always clear whether large language models (LLMs) make predictions based on meaningful medical knowledge or simply rely on textual similarities between diagnosis descriptions (Cui et al., 2025). This is especially critical in healthcare, where model decisions must align with established medical knowledge and pathophysiological mechanisms. We also analyze and compare the obtained results and summarize them into a medical disease ontology.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.04888

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Therapeutic Area > Endocrinology (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Collaborative Fairness in Federated Learning Under Imbalanced Covariate Shift

Yu, Tianrun, Wang, Jiaqi, Wang, Haoyu, Lin, Mingquan, Liu, Han, Yee, Nelson S., Ma, Fenglong

arXiv.org Artificial Intelligence, Jul-14-2025

Collaborative fairness is a crucial challenge in federated learning. However, existing approaches often overlook a practical yet complex form of heterogeneity: imbalanced covariate shift. We provide a theoretical analysis of this setting, which motivates the design of FedAKD (Federated Asynchronous Knowledge Distillation), a simple yet effective approach that balances accurate prediction with collaborative fairness. FedAKD consists of client and server updates. In the client update, we introduce a novel asynchronous knowledge distillation strategy based on our preliminary analysis, which reveals that while correctly predicted samples exhibit similar feature distributions across clients, incorrectly predicted samples show significant variability. This suggests that imbalanced covariate shift primarily arises from misclassified samples. Leveraging this insight, our approach first applies traditional knowledge distillation to update client models while keeping the global model fixed. Next, we select correctly predicted high-confidence samples and update the global model using these samples while keeping client models fixed. The server update simply aggregates all client models. We further provide a theoretical proof of FedAKD's convergence. Experimental results on public datasets (FashionMNIST and CIFAR10) and a real-world Electronic Health Records (EHR) dataset demonstrate that FedAKD significantly improves collaborative fairness, enhances predictive accuracy, and fosters client participation even under highly heterogeneous data distributions.

artificial intelligence, dataset, machine learning, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3711896.3737161

2507.08617

Country:

North America > United States > Minnesota (0.28)
North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Bridging Data Gaps of Rare Conditions in ICU: A Multi-Disease Adaptation Approach for Clinical Prediction

Zhu, Mingcheng, Liu, Yu, Luo, Zhiyao, Zhu, Tingting

arXiv.org Artificial Intelligence, Jul-10-2025

Artificial Intelligence has revolutionised critical care for common conditions. Yet, rare conditions in the intensive care unit (ICU), including recognised rare diseases and low-prevalence conditions in the ICU, remain underserved due to data scarcity and intra-condition heterogeneity. To bridge such gaps, we developed KnowRare, a domain adaptation-based deep learning framework for predicting clinical outcomes for rare conditions in the ICU. KnowRare mitigates data scarcity by initially learning condition-agnostic representations from diverse electronic health records through self-supervised pre-training. It addresses intra-condition heterogeneity by selectively adapting knowledge from clinically similar conditions with a developed condition knowledge graph. Evaluated on two ICU datasets across five clinical prediction tasks (90-day mortality, 30-day readmission, ICU mortality, remaining length of stay, and phenotyping), KnowRare consistently outperformed existing state-of-the-art models. Additionally, KnowRare demonstrated superior predictive performance compared to established ICU scoring systems, including APACHE IV and IV-a. Case studies further demonstrated KnowRare's flexibility in adapting its parameters to accommodate dataset-specific and task-specific characteristics, its generalisation to common conditions under limited data scenarios, and its rationality in selecting source conditions. These findings highlight KnowRare's potential as a robust and practical solution for supporting clinical decision-making and improving care for rare conditions in the ICU.

artificial intelligence, machine learning, malignant neoplasm, (15 more...)

arXiv.org Artificial Intelligence

2507.06432

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zero-shot Medical Event Prediction Using a Generative Pre-trained Transformer on Electronic Health Records

Redekop, Ekaterina, Wang, Zichen, Kulkarni, Rushikesh, Pleasure, Mara, Chin, Aaron, Hassanzadeh, Hamid Reza, Hill, Brian L., Emami, Melika, Speier, William, Arnold, Corey W.

arXiv.org Artificial Intelligence, Mar-7-2025

Longitudinal data in electronic health records (EHRs) represent an individual's clinical history through a sequence of codified concepts, including diagnoses, procedures, medications, and laboratory tests. Foundational models, such as generative pre-trained transformers (GPT), can leverage this data to predict future events. While fine-tuning of these models enhances task-specific performance, it is costly, complex, and unsustainable for every target. We show that a foundation model trained on EHRs can perform predictive tasks in a zero-shot manner, eliminating the need for fine-tuning. This study presents the first comprehensive analysis of zero-shot forecasting with GPT-based foundational models in EHRs, introducing a novel pipeline that formulates medical concept prediction as a generative modeling task. Unlike supervised approaches requiring extensive labeled data, our method enables the model to forecast the next medical event purely from pretraining knowledge. We evaluate performance across multiple time horizons and clinical categories, demonstrating the model's ability to capture latent temporal dependencies and complex patient trajectories without task supervision. Model performance for predicting the next medical concept was evaluated using precision and recall metrics, achieving an average top-1 precision of 0.614 and recall of 0.524. For 12 major diagnostic conditions, the model demonstrated strong zero-shot performance, achieving high true positive rates while maintaining low false positives. We demonstrate the power of a foundational EHR GPT model in capturing diverse phenotypes and enabling robust, zero-shot forecasting of clinical outcomes. This capability enhances the versatility of predictive healthcare models and reduces the need for task-specific training, enabling more scalable applications in clinical settings.

diagnosis, disorder, prediction window, (14 more...)

arXiv.org Artificial Intelligence

2503.05893

Country: North America > United States > California > Los Angeles County > Los Angeles (0.31)

Genre: Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Advancing Pancreatic Cancer Prediction with a Next Visit Token Prediction Head on top of Med-BERT

He, Jianping, Rasmy, Laila, Zhi, Degui, Tao, Cui

arXiv.org Artificial Intelligence, Jan-3-2025

Background: Recently, numerous foundation models pretrained on extensive data have demonstrated efficacy in disease prediction using Electronic Health Records (EHRs). However, there remain some unanswered questions on how to best utilize such models, especially with very small fine-tuning cohorts. Methods: We utilized Med-BERT, an EHR-specific foundation model, and reformulated the disease binary prediction task into a token prediction task and a next visit mask token prediction task to align with Med-BERT's pretraining task format in order to improve the accuracy of pancreatic cancer (PaCa) prediction in both few-shot and fully supervised settings. Results: The reformulation of the task into a token prediction task, referred to as Med-BERT-Sum, demonstrates slightly superior performance in both few-shot scenarios and larger data samples. Furthermore, reformulating the prediction task as a Next Visit Mask Token Prediction task (Med-BERT-Mask) significantly outperforms the conventional Binary Classification (BC) prediction task (Med-BERT-BC) by 3% to 7% in few-shot scenarios with data sizes ranging from 10 to 500 samples. These findings highlight that aligning the downstream task with Med-BERT's pretraining objectives substantially enhances the model's predictive capabilities, thereby improving its effectiveness in predicting both rare and common diseases. Conclusion: Reformatting disease prediction tasks to align with the pretraining of foundation models enhances prediction accuracy, leading to earlier detection and timely intervention. This approach improves treatment effectiveness, survival rates, and overall patient outcomes for PaCa and potentially other cancers.

machine learning, med-bert, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.02044

Country: North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology > Pancreatic Cancer (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MedG-KRP: Medical Graph Knowledge Representation Probing

Rosenbaum, Gabriel R., Jiang, Lavender Yao, Sheth, Ivaxi, Stryker, Jaden, Alyakin, Anton, Alber, Daniel Alexander, Goff, Nicolas K., Kwon, Young Joon Fred, Markert, John, Nasir-Moin, Mustafa, Niehues, Jan Moritz, Sangwon, Karl L., Yang, Eunice, Oermann, Eric Karl

arXiv.org Artificial Intelligence, Dec-16-2024

Large language models (LLMs) have recently emerged as powerful tools, finding many medical applications. LLMs' ability to coalesce vast amounts of information from many sources to generate a response (a process similar to that of a human expert) has led many to see potential in deploying LLMs for clinical use. However, medicine is a setting where accurate reasoning is paramount. Many researchers are questioning the effectiveness of multiple choice question answering (MCQA) benchmarks, frequently used to test LLMs. Researchers and clinicians alike must have complete confidence in LLMs' abilities for them to be deployed in a medical setting. To address this need for understanding, we introduce a knowledge graph (KG)-based method to evaluate the biomedical reasoning abilities of LLMs. Essentially, we map how LLMs link medical concepts in order to better understand how they reason. We test GPT-4, Llama3-70b, and PalmyraMed-70b, a specialized medical model. We enlist a panel of medical students to review a total of 60 LLM-generated graphs and compare these graphs to BIOS, a large biomedical KG. We observe GPT-4 to perform best in our human review but worst in our ground truth comparison; vice-versa with PalmyraMed, the medical model. Our work provides a means of visualizing the medical reasoning pathways of LLMs so they can be implemented in clinical settings safely and effectively.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2412.10982

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Alabama > Jefferson County > Birmingham (0.14)
North America > United States > New York > New York County > New York City (0.05)
(5 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.97)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.95)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Explainable machine learning for neoplasms diagnosis via electrocardiograms: an externally validated study

Alcaraz, Juan Miguel Lopez, Haverkamp, Wilhelm, Strodthoff, Nils

arXiv.org Artificial Intelligence, Dec-10-2024

Background: Neoplasms remain a leading cause of mortality worldwide, with timely diagnosis being crucial for improving patient outcomes. Current diagnostic methods are often invasive, costly, and inaccessible to many populations. Electrocardiogram (ECG) data, widely available and non-invasive, has the potential to serve as a tool for neoplasm diagnosis by using physiological changes in cardiovascular function associated with neoplastic presence. Methods: This study explores the application of machine learning models to analyze ECG features for the diagnosis of neoplasms. We developed a pipeline integrating tree-based models with Shapley values for explainability. The model was trained and internally validated, then externally validated on a second large-scale independent cohort to ensure robustness and generalizability. Findings: The results demonstrate that ECG data can effectively capture neoplasm-associated cardiovascular changes, achieving high performance in both internal testing and external validation cohorts. Shapley values identified key ECG features influencing model predictions, revealing established and novel cardiovascular markers linked to neoplastic conditions. This non-invasive approach provides a cost-effective and scalable alternative for the diagnosis of neoplasms, particularly in resource-limited settings. It is similarly useful for managing secondary cardiovascular effects of neoplasm therapies. Interpretation: This study highlights the feasibility of leveraging ECG signals and machine learning to enhance neoplasm diagnostics. By offering interpretable insights into cardio-neoplasm interactions, this approach bridges existing gaps in non-invasive diagnostics and has implications for integrating ECG-based tools into broader neoplasm diagnostic frameworks, as well as neoplasm therapy management.

artificial intelligence, machine learning, neoplasm, (15 more...)

arXiv.org Artificial Intelligence

2412.07737

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Germany > Lower Saxony > Oldenburg (0.04)
Europe > Germany > Berlin (0.04)
(4 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Panacea: A foundation model for clinical trial search, summarization, design, and recruitment

Lin, Jiacheng, Xu, Hanwen, Wang, Zifeng, Wang, Sheng, Sun, Jimeng

arXiv.org Artificial Intelligence, Jun-25-2024

Clinical trials are fundamental in developing new drugs, medical devices, and treatments. However, they are often time-consuming and have low success rates. Although there have been initial attempts to create large language models (LLMs) for clinical trial design and patient-trial matching, these models remain task-specific and not adaptable to diverse clinical trial tasks. To address this challenge, we propose a clinical trial foundation model named Panacea, designed to handle multiple tasks, including trial search, trial summarization, trial design, and patient-trial matching. We also assemble a large-scale dataset, named TrialAlign, of 793,279 trial documents and 1,113,207 trial-related scientific papers, to infuse clinical knowledge into the model by pre-training. We further curate TrialInstruct, which contains 200,866 instruction data points for fine-tuning. These resources enable Panacea to be widely applicable for a range of clinical trial tasks based on user requirements. We evaluated Panacea on a new benchmark, named TrialPanorama, which covers eight clinical trial tasks. Our method performed the best on seven of the eight tasks compared to six cutting-edge generic or medicine-specific LLMs. Specifically, Panacea showed great potential to collaborate with human experts in crafting the design of eligibility criteria, study arms, and outcome measures in multi-round conversations. In addition, Panacea achieved 14.42% improvement in patient-trial matching, 41.78% to 52.02% improvement in trial search, and consistently ranked at the top for five aspects of trial summarization. Our approach demonstrates the effectiveness of Panacea in clinical trials and establishes a comprehensive resource, including training data, model, and benchmark, for developing clinical trial foundation models, paving the path for AI-based clinical trial development.

clinical trial, criteria, panacea, (15 more...)

arXiv.org Artificial Intelligence

2407.11007

Country:

North America > United States > Washington > King County > Seattle (0.14)
Oceania > New Zealand (0.04)
North America > Canada (0.04)
(14 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
