dice
Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis
Endoscopic video analysis can effectively assist clinicians in disease diagnosis and treatment, and has played an indispensable role in clinical medicine. Unlike regular videos, endoscopic video analysis presents unique challenges, including complex camera movements, uneven distribution of lesions, and concealment, and it typically relies on contrastive learning in self-supervised pretraining as its mainstream technique. However, representations obtained from contrastive learning enhance the discriminability of the model but often lack fine-grained information, which is suboptimal in the pixel-level prediction tasks. In this paper, we develop a Multi-view Masked Contrastive Representation Learning (M$^2$CRL) framework for endoscopic video pre-training. Specifically, we propose a multi-view mask strategy for addressing the challenges of endoscopic videos. We utilize the frame-aggregated attention guided tube mask to capture global-level spatiotemporal sensitive representation from the global views, while the random tube mask is employed to focus on local variations from the local views. Subsequently, we combine multi-view mask modeling with contrastive learning to obtain endoscopic video representations that possess fine-grained perception and holistic discriminative capabilities simultaneously. The proposed M$^2$CRL is pre-trained on 7 publicly available endoscopic video datasets and fine-tuned on 3 endoscopic video datasets for 3 downstream tasks. Notably, our M$^2$CRL significantly outperforms the current state-of-the-art self-supervised endoscopic pre-training methods, e.g., Endo-FM (3.5% F1 for classification, 7.5% Dice for segmentation, and 2.2% F1 for detection) and other self-supervised methods, e.g., VideoMAE V2 (4.6% F1 for classification, 0.4% Dice for segmentation, and 2.1% F1 for detection).
Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Bone Shape Reconstruction
Various deep learning models have been proposed for 3D bone shape reconstruction from two orthogonal (biplanar) X-ray images.However, it is unclear how these models compare against each other since they are evaluated on different anatomy, cohort and (often privately held) datasets.Moreover, the impact of the commonly optimized image-based segmentation metrics such as dice score on the estimation of clinical parameters relevant in 2D-3D bone shape reconstruction is not well known.To move closer toward clinical translation, we propose a benchmarking framework that evaluates tasks relevant to real-world clinical scenarios, including reconstruction of fractured bones, bones with implants, robustness to population shift, and error in estimating clinical parameters.Our open-source platform provides reference implementations of 8 models (many of whose implementations were not publicly available), APIs to easily collect and preprocess 6 public datasets, and the implementation of automatic clinical parameter and landmark extraction methods. We present an extensive evaluation of 8 2D-3D models on equal footing using 6 public datasets comprising images for four different anatomies.Our results show that attention-based methods that capture global spatial relationships tend to perform better across all anatomies and datasets; performance on clinically relevant subgroups may be overestimated without disaggregated reporting; ribs are substantially more difficult to reconstruct compared to femur, hip and spine; and the dice score improvement does not always bring corresponding improvement in the automatic estimation of clinically relevant parameters.
- North America > United States > California (0.04)
- Europe > Eastern Europe (0.04)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology (1.00)
- Information Technology > Hardware (0.86)
- Information Technology > Artificial Intelligence > Games (0.50)
Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT
Atad, Matan, Marka, Alexander W., Steinhelfer, Lisa, Curto-Vilalta, Anna, Leonhardt, Yannik, Foreman, Sarah C., Dietrich, Anna-Sophia Walburga, Graf, Robert, Gersing, Alexandra S., Menze, Bjoern, Rueckert, Daniel, Kirschke, Jan S., Möller, Hendrik
Accurate segmentation of vertebral metastasis in CT is clinically important yet difficult to scale, as voxel-level annotations are scarce and both lytic and blastic lesions often resemble benign degenerative changes. We introduce a weakly supervised method trained solely on vertebra-level healthy/malignant labels, without any lesion masks. The method combines a Diffusion Autoencoder (DAE) that produces a classifier-guided healthy edit of each vertebra with pixel-wise difference maps that propose candidate lesion regions. To determine which regions truly reflect malignancy, we introduce Hide-and-Seek Attribution: each candidate is revealed in turn while all others are hidden, the edited image is projected back to the data manifold by the DAE, and a latent-space classifier quantifies the isolated malignant contribution of that component. High-scoring regions form the final lytic or blastic segmentation. On held-out radiologist annotations, we achieve strong blastic/lytic performance despite no mask supervision (F1: 0.91/0.85; Dice: 0.87/0.78), exceeding baselines (F1: 0.79/0.67; Dice: 0.74/0.55). These results show that vertebra-level labels can be transformed into reliable lesion masks, demonstrating that generative editing combined with selective occlusion supports accurate weakly supervised segmentation in CT.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (5 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation
Ghouse, Hania, Alsharqi, Maryam, Nezami, Farhad R., Behzad, Muzammil
Cardiac image analysis remains fragmented across tasks: anatomical segmentation, disease classification, and grounded clinical report generation are typically handled by separate networks trained under different data regimes. No existing framework unifies these objectives within a single architecture while retaining generalization across imaging modalities and datasets. We introduce PULSE, a multi-task vision-language framework built on self-supervised representations and optimized through a composite supervision strategy that balances region overlap learning, pixel wise classification fidelity, and boundary aware IoU refinement. A multi-scale token reconstruction decoder enables anatomical segmentation, while shared global representations support disease classification and clinically grounded text output allowing the model to transition from pixels to structures and finally clinical reasoning within one architecture. Unlike prior task-specific pipelines, PULSE learns task-invariant cardiac priors, generalizes robustly across datasets, and can be adapted to new imaging modalities with minimal supervision. This moves the field closer to a scalable, foundation style cardiac analysis framework.
- Asia > Middle East > Saudi Arabia > Eastern Province > Dhahran (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Breast Cell Segmentation Under Extreme Data Constraints: Quantum Enhancement Meets Adaptive Loss Stabilization
Dasoju, Varun Kumar, Cheng, Qingsu, Yu, Zeyun
Annotating medical images demands significant time and expertise, often requiring pathologists to invest hundreds of hours in labeling mammary epithelial nuclei datasets. We address this critical challenge by achieving 95.5% Dice score using just 599 training images for breast cell segmentation, where just 4% of pixels represent breast tissue and 60% of images contain no breast regions. Our framework uses quantum-inspired edge enhancement via multi-scale Gabor filters creating a fourth input channel, enhancing boundary detection where inter-annotator variations reach +/- 3 pixels. We present a stabilized multi-component loss function that integrates adaptive Dice loss with boundary-aware terms and automatic positive weighting to effectively address severe class imbalance, where mammary epithelial cell regions comprise only 0.1%-20% of the total image area. Additionally, a complexity-based weighted sampling strategy is introduced to prioritize the challenging mammary epithelial cell regions. The model employs an EfficientNet-B7/UNet++ architecture with a 4-to-3 channel projection, enabling the use of pretrained weights despite limited medical imaging data. Finally, robust validation is achieved through exponential moving averaging and statistical outlier detection, ensuring reliable performance estimates on a small validation set (129 images). Our framework achieves a Dice score of 95.5% +/- 0.3% and an IoU of 91.2% +/- 0.4%. Notably, quantum-based enhancement contributes to a 2.1% improvement in boundary accuracy, while weighted sampling increases small lesion detection by 3.8%. By achieving groundbreaking performance with limited annotations, our approach significantly reduces the medical expert time required for dataset creation, addressing a fundamental bottleneck in clinical perception AI development.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Augmenting The Weather: A Hybrid Counterfactual-SMOTE Algorithm for Improving Crop Growth Prediction When Climate Changes
Temraz, Mohammed, Keane, Mark T
In recent years, humanity has begun to experien ce the catastrophic effects of climate change as economic sectors (such as agriculture) struggle with unpredictable and extreme weather events. Artificial Intelligence (AI) should help us handle these climate challenges but its most promising solutions are not good at dealing with climate - disrupted data; specifically, machine learning methods that work from historical data - distributions, are not good at handling out - of - distribution, outlier events. In this paper, we propose a novel data augmentation method, that treats the predictive problems around climate change as being, in part, due to class - imbalance issues; that is, prediction from historical datasets is difficult because, by definition, they lack sufficient minority - class instances of "climate outlier events". This novel data augmentation method -- called Counterfactual - Based SMOTE (CFA - SMOTE) -- combines an instance - based counterfactual method from Explainable AI (XAI) with the well - known class - imbalance method, SMOTE. CFA - SMOTE creates synthetic dat a - points representing outlier, climate - events that augment the dataset to improve predictive performance. We report comparative experiments using this CFA - SMOTE method, comparing it to benchmark counterfactual and class - imbalance methods under different co nditions (i.e., class - imbalance ratios). The focal climate - change domain used relies on predicting grass growth on Irish dairy farms, during Europe - wide drought and forage crisis of 2018.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Sweden (0.04)
- Europe > Norway (0.04)
- (3 more...)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Health & Medicine (1.00)
- Government (1.00)
- Food & Agriculture > Agriculture (0.87)
We would like to thank the reviewers for their helpful feedback, and for their positive comments regarding the originality
We will clarify this formulation in the paper. Conjugate MDPs are an interesting framework for learning useful abstractions. Thank you for your feedback - we will improve the legibility of all figures with attention to B&W . "Ours" in Figure 2 is indeed using We are indeed using the best τ found for GAE when running with λ = 0 . We will clarify this also.
Towards Personalized Treatment Plan: Geometrical Model-Agnostic Approach to Counterfactual Explanations
Sin, Daniel, Toutounchian, Milad
In our article, we describe a method for generating counterfactual explanations in high-dimensional spaces using four steps that involve fitting our dataset to a model, finding the decision boundary, determining constraints on the problem, and computing the closest point (counterfactual explanation) from that boundary. We propose a discretized approach where we find many discrete points on the boundary and then identify the closest feasible counterfactual explanation. This method, which we later call $\textit{Segmented Sampling for Boundary Approximation}$ (SSBA), applies binary search to find decision boundary points and then searches for the closest boundary point. Across four datasets of varying dimensionality, we show that our method can outperform current methods for counterfactual generation with reductions in distance between $5\%$ to $50\%$ in terms of the $L_2$ norm. Our method can also handle real-world constraints by restricting changes to immutable and categorical features, such as age, gender, sex, height, and other related characteristics such as the case for a health-based dataset. In terms of runtime, the SSBA algorithm generates decision boundary points on multiple orders of magnitude in the same given time when we compare to a grid-based approach. In general, our method provides a simple and effective model-agnostic method that can compute nearest feasible (i.e. realistic with constraints) counterfactual explanations. All of our results and code are available at: https://github.com/dsin85691/SSBA_For_Counterfactuals
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- North America > Canada (0.04)
A Omitted Proofs
In this section we include all the proofs deferred from the main body. Before we proceed with the proof of Theorem 2.3, let us point out Thus, the theorem follows directly from (6).Corollary 3.1. Suppose that both players employ (OMD) with learning rate η > 0. (t 1) (t 1) (t 1) (t 1) (t 1) (t 1) ( t 1) ( t 1) ( t 1) ( t 1) (t 1) ( t 1) ( t 1) (t 1) ( t 1) (t 1) ( t 1) ( t 1) (t 1) ( t 1) Thus, recalling Definition 2.5, the claim follows from (12) and (13). The sheriff's decision is binding only in the last If the cargo has no illegal items ( i.e. After the placement, players take turns at "firing" at their The game proceeds until either one player has sunk all of the opponent's ships, or At the end of the game, each player's payoff is the The latter modification makes the game general-sum, and incentivizes players to be more risk-averse.