AITopics | cancer dataset

cancer dataset

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Cancer Survival Analysis via Zero-shot Tumor Microenvironment Segmentation on Low-resolution Whole Slide Pathology Images

Neural Information Processing Systems | Jun-17-2026, 07:19:14 GMT

Whole-slide pathology images (WSIs) are widely recognized as the gold standard for cancer survival analysis. However, due to the high resolution of WSIs, existing studies require dividing WSIs into patches and identifying key components before building the survival prediction system, which is time-consuming and cannot reflect the overall spatial organization of WSIs. Inspired by the fact that the spatial interactions among different tumor microenvironment (TME) components in WSIs are associated with cancer prognosis, some studies attempt to capture the complex interactions among different TME components to improve survival predictions. However, they require extra effort to build the TME segmentation model, which involves a substantial annotation workload for the different TME components and is independent of the construction of the survival prediction model. To address the above issues, we propose ZTSurv, a novel end-to-end cancer survival analysis framework via efficient zero-shot TME segmentation on low-resolution WSIs. Specifically, by leveraging tumor infiltrating lymphocyte (TIL) maps on the 50x down-sampled WSIs, ZTSurv enables zero-shot segmentation of two other important TME components (i.e., tumor and stroma), which can reduce the annotation effort required from pathologists. Then, based on the visual and semantic information extracted from different TME components, we construct a heterogeneous graph to capture their spatial intersections for clinical outcome prediction. We validate ZTSurv across four cancer cohorts derived from The Cancer Genome Atlas (TCGA), and the experimental results indicate that our method not only achieves superior prediction results but also significantly reduces computational cost compared with state-of-the-art methods.

large language model, machine learning, tme component, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Supplementary Material: Model Class Reliance for Random Forests

Neural Information Processing Systems | Feb-11-2026, 06:08:05 GMT

The packages developed as part of this work are discussed below and made available via the above notebooks. This simply calls the code from https://github.com/charliemarx/ Figure 1 shows the diagnostic graphs as considered in [4]. Note that the notebook does not have a fixed seed and this instability can be explored by re-running the notebook. SHAP values are calculated on an identical RandomForestClassifier as used for the RF MCR. The graphs generated by the notebooks are per MCR estimation method, rather than the comparison graphs shown in the paper.

artificial intelligence, github, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Supplementary Materials: VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain

Neural Information Processing Systems | Feb-9-2026, 02:59:49 GMT

Semi-supervised learning uses the trained encoder in learning a predictive model on both labeled and unlabeled data. Figure 3: The proposed data corruption procedure. Original feature matrix (X) consists of four samples x_i, i = 1, ..., 4, where each row/column represents a sample/feature, and the features in each sample are represented by the same color. In the experiment section of the main manuscript, we evaluate VIME and its benchmarks on 11 datasets (6 genomics, 2 clinical, and 3 public datasets). The selected SNPs and the corresponding blood cell trait together form an independent labeled dataset.
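The excerpt above names a data corruption procedure on a feature matrix but does not spell it out. The following is a minimal sketch, assuming a mask-and-shuffle scheme of the kind commonly associated with VIME: each entry is masked with some probability, and masked entries are replaced by a value drawn from the same column of another sample. The function name `corrupt`, the parameter `p_mask`, and the toy matrix are illustrative assumptions, not taken from the excerpt.

```python
import random


def corrupt(X, p_mask, rng):
    """Return (mask, X_tilde) for a feature matrix X given as a list of rows.

    Assumed scheme (not confirmed by the excerpt): each entry is masked
    independently with probability p_mask, and every masked entry is replaced
    by the value at the same position in a column-wise shuffled copy of X.
    """
    n_rows, n_cols = len(X), len(X[0])

    # Binary mask: 1 means "replace this entry".
    mask = [[1 if rng.random() < p_mask else 0 for _ in range(n_cols)]
            for _ in range(n_rows)]

    # Shuffle every column independently so that replacement values come
    # from the empirical distribution of that feature.
    shuffled_cols = []
    for j in range(n_cols):
        col = [row[j] for row in X]
        rng.shuffle(col)
        shuffled_cols.append(col)

    # Keep the original value where mask == 0, take the shuffled one otherwise.
    X_tilde = [[shuffled_cols[j][i] if mask[i][j] else X[i][j]
                for j in range(n_cols)]
               for i in range(n_rows)]
    return mask, X_tilde


# Four samples (rows) and three features (columns), echoing the figure's setup.
X = [[1, 10, 100],
     [2, 20, 200],
     [3, 30, 300],
     [4, 40, 400]]
mask, X_tilde = corrupt(X, p_mask=0.3, rng=random.Random(0))
```

Because replacement values are taken from the same column, every corrupted row still contains plausible values for each feature, which is what makes recovering the mask a non-trivial pretext task.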

artificial intelligence, dataset, machine learning, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom (0.04)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.30)

Smart Trial: Evaluating the Use of Large Language Models for Recruiting Clinical Trial Participants via Social Media

Zhou, Xiaofan, Wang, Zisu, Krieger, Janice, Zalake, Mohan, Cheng, Lu

arXiv.org Artificial Intelligence | Sep-16-2025

Clinical trials (CT) are essential for advancing medical research and treatment, yet efficiently recruiting eligible participants -- each of whom must meet complex eligibility criteria -- remains a significant challenge. Traditional recruitment approaches, such as advertisements or electronic health record screening within hospitals, are often time-consuming and geographically constrained. This work addresses the recruitment challenge by leveraging the vast amount of health-related information individuals share on social media platforms. With the emergence of powerful large language models (LLMs) capable of sophisticated text understanding, we pose the central research question: Can LLM-driven tools facilitate CT recruitment by identifying potential participants through their engagement on social media? To investigate this question, we introduce TRIALQA, a novel dataset comprising two social media collections from the subreddits on colon cancer and prostate cancer. Using eligibility criteria from public real-world CTs, experienced annotators are hired to annotate TRIALQA to indicate (1) whether a social media user meets a given eligibility criterion and (2) the user's stated reasons for interest in participating in CT. We benchmark seven widely used LLMs on these two prediction tasks, employing six distinct training and inference strategies. Our extensive experiments reveal that, while LLMs show considerable promise, they still face challenges in performing the complex, multi-hop reasoning needed to accurately assess eligibility criteria.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2509.10584

Country: North America > United States > Illinois (0.15)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)
Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.52)
Health & Medicine > Therapeutic Area > Oncology > Prostate Cancer (0.37)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Stress-testing cross-cancer generalizability of 3D nnU-Net for PET-CT tumor segmentation: multi-cohort evaluation with novel oesophageal and lung cancer datasets

Ghosh, Soumen, Hannan, Christine Jestin, Vashistha, Rajat, Kundu, Parveen, Brosda, Sandra, Aoude, Lauren G., Lonie, James, Nathanson, Andrew, Ng, Jessica, Barbour, Andrew P., Vegh, Viktor

arXiv.org Artificial Intelligence | Aug-27-2025

Robust generalization is essential for deploying deep-learning-based tumor segmentation in clinical PET-CT workflows, where anatomical sites, scanners, and patient populations vary widely. This study presents the first cross-cancer evaluation of nnU-Net on PET-CT, introducing two novel, expert-annotated whole-body datasets: 279 patients with oesophageal cancer (Australian cohort) and 54 with lung cancer (Indian cohort). These cohorts complement the public AutoPET dataset and enable systematic stress-testing of cross-domain performance. We trained and tested 3D nnU-Net models under three paradigms: target only (oesophageal), public only (AutoPET), and combined training. On the test sets, the oesophageal-only model achieved the best in-domain accuracy (mean DSC, 57.8) but failed on the external Indian lung cohort (mean DSC less than 3.4), indicating severe overfitting. The public-only model generalized more broadly (mean DSC, 63.5 on AutoPET, 51.6 on the Indian lung cohort) but underperformed on the Australian oesophageal cohort (mean DSC, 26.7). The combined approach provided the most balanced results (mean DSC: lung 52.9, oesophageal 40.7, AutoPET 60.9), reducing boundary errors and improving robustness across all cohorts. These findings demonstrate that dataset diversity, particularly multi-demographic, multi-center, and multi-cancer integration, outweighs architectural novelty as the key driver of robust generalization. This work presents a demography-based, cross-cancer evaluation of deep learning segmentation and highlights dataset diversity, rather than model complexity, as the foundation for clinically robust segmentation.
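The results above are reported as mean DSC (Dice similarity coefficient). As a minimal sketch of that metric, assuming binary masks flattened to sequences of 0/1 and the usual definition DSC = 2|A ∩ B| / (|A| + |B|), it can be computed as below. The function name and the toy masks are illustrative; the abstract's values appear to be on a 0 to 100 scale, so the sketch prints a percentage.

```python
def dice_score(pred, truth):
    """Dice similarity coefficient between two binary masks (flat 0/1 sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    # Voxels marked as tumor in both the prediction and the reference.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total number of positive voxels in each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    # Convention chosen here (an assumption): two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, truth)
print(f"DSC = {100 * score:.1f}")  # prints "DSC = 66.7"
```

A per-cohort mean DSC, as quoted in the abstract, would then be the average of this score over all patients in the cohort.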

artificial intelligence, machine learning, segmentation, (19 more...)

arXiv.org Artificial Intelligence

2508.18612

Country: Oceania > Australia > Queensland (0.16)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reliable Radiologic Skeletal Muscle Area Assessment -- A Biomarker for Cancer Cachexia Diagnosis

Ahmed, Sabeen, Parker, Nathan, Park, Margaret, Jeong, Daniel, Peres, Lauren, Davis, Evan W., Permuth, Jennifer B., Siegel, Erin, Schabath, Matthew B., Yilmaz, Yasin, Rasool, Ghulam

arXiv.org Artificial Intelligence | Mar-19-2025

Cancer cachexia is a common metabolic disorder characterized by severe muscle atrophy, which is associated with poor prognosis and quality of life. Monitoring skeletal muscle area (SMA) longitudinally through computed tomography (CT) scans, an imaging modality routinely acquired in cancer care, is an effective way to identify and track this condition. However, existing tools often lack full automation and exhibit inconsistent accuracy, limiting their potential for integration into clinical workflows. To address these challenges, we developed SMAART-AI (Skeletal Muscle Assessment-Automated and Reliable Tool-based on AI), an end-to-end automated pipeline powered by deep learning models (nnU-Net 2D) trained on mid-third lumbar level CT images with 5-fold cross-validation, ensuring generalizability and robustness. SMAART-AI incorporates an uncertainty-based mechanism to flag high-error SMA predictions for expert review, enhancing reliability. We combined the SMA, skeletal muscle index, BMI, and clinical data to train a multi-layer perceptron (MLP) model designed to predict cachexia at the time of cancer diagnosis. Tested on the gastroesophageal cancer dataset, SMAART-AI achieved a Dice score of 97.80% +/- 0.93%, with SMA estimated across all four datasets in this study at a median absolute error of 2.48% compared to manual annotations with SliceOmatic. Uncertainty metrics (variance, entropy, and coefficient of variation) strongly correlated with SMA prediction errors (0.83, 0.76, and 0.73, respectively). The MLP model predicts cachexia with 79% precision, providing clinicians with a reliable tool for early diagnosis and intervention. By combining automation, accuracy, and uncertainty awareness, SMAART-AI bridges the gap between research and clinical application, offering a transformative approach to managing cancer cachexia.
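The abstract names three uncertainty metrics (variance, entropy, and coefficient of variation) without saying how they are computed. The following is a rough sketch under stated assumptions: variance and coefficient of variation are taken over SMA estimates from several models (for instance, one per cross-validation fold), and entropy is the mean binary entropy of per-pixel muscle probabilities. All function names and numbers are hypothetical and are not drawn from SMAART-AI itself.

```python
import math
import statistics


def variance_and_cv(estimates):
    """Population variance and coefficient of variation of repeated SMA estimates."""
    mean = statistics.fmean(estimates)
    var = statistics.pvariance(estimates)
    # Coefficient of variation: standard deviation relative to the mean.
    cv = math.sqrt(var) / mean if mean else float("inf")
    return var, cv


def mean_binary_entropy(probs):
    """Average entropy (in nats) of per-pixel foreground probabilities."""
    eps = 1e-12  # keeps log() finite for probabilities of exactly 0 or 1
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)
        total += -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical SMA estimates (cm^2) from five fold models for one CT slice.
sma_estimates = [151.2, 149.8, 150.5, 152.0, 150.1]
var, cv = variance_and_cv(sma_estimates)

# Hypothetical per-pixel probabilities: confident versus ambiguous predictions.
confident = mean_binary_entropy([0.99, 0.01, 0.98, 0.02])
ambiguous = mean_binary_entropy([0.55, 0.45, 0.60, 0.50])
```

In a review-flagging scheme like the one the abstract describes, a case would be routed to an expert when one of these values exceeds a threshold chosen on validation data; that threshold is likewise an assumption here.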

artificial intelligence, machine learning, smaart-ai, (18 more...)

arXiv.org Artificial Intelligence

2503.16556

Country:

North America > United States > Florida > Hillsborough County > Tampa (0.28)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
Asia (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Oncology > Pancreatic Cancer (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SHAP-Integrated Convolutional Diagnostic Networks for Feature-Selective Medical Analysis

Hu, Yan, Chaddad, Ahmad

arXiv.org Artificial Intelligence | Mar-10-2025

This study introduces the SHAP-integrated convolutional diagnostic network (SICDN), an interpretable feature selection method designed for limited datasets, to address the challenge posed by data privacy regulations that restrict access to medical datasets. The SICDN model was tested on classification tasks using pneumonia and breast cancer datasets, demonstrating over 97% accuracy and surpassing four popular CNN models. We also integrated a historical weighted moving average technique to enhance feature selection. The SICDN shows potential in medical image prediction; the code is available at https://github.com/AIPMLab/SICDN.

classification task, dataset, sicdn, (13 more...)

arXiv.org Artificial Intelligence

2503.08712

Country:

Asia > China (0.05)
North America > Canada (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.57)
Health & Medicine > Diagnostic Medicine > Imaging (0.35)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Optimizing Sparse Generalized Singular Vectors for Feature Selection in Proximal Support Vector Machines with Application to Breast and Ovarian Cancer Detection

Ugwu, Ugochukwu O., Kirby, Michael

arXiv.org Machine Learning | Oct-4-2024

This paper presents approaches to compute sparse solutions of Generalized Singular Value Problem (GSVP). The GSVP is regularized by $\ell_1$-norm and $\ell_q$-penalty for $0

cancer dataset, dataset, ovarian cancer dataset, (13 more...)

arXiv.org Machine Learning

2410.03978

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Middlesex County > Medford (0.04)
North America > United States > Colorado (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology > Ovarian Cancer (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
