Perch 2.0 transfers 'whale' to underwater tasks
Burns, Andrea, Harrell, Lauren, van Merriënboer, Bart, Dumoulin, Vincent, Hamer, Jenny, Denton, Tom
Perch 2.0 is a supervised bioacoustics foundation model pretrained on 14,597 species, including birds, mammals, amphibians, and insects, and has state-of-the-art performance on multiple benchmarks. Given that Perch 2.0 includes almost no marine mammal audio or classes in its training data, we evaluate its performance on marine mammal and underwater audio tasks through few-shot transfer learning. We perform linear probing with the embeddings generated by this foundation model and compare performance to other pretrained bioacoustics models. In particular, we compare Perch 2.0 with the previous multispecies whale, Perch 1.0, SurfPerch, AVES-bio, BirdAVES, and BirdNET V2.3 models, all of which have open-source tools for transfer learning and agile modeling. We show that embeddings from the Perch 2.0 model deliver consistently high performance in few-shot transfer learning, outperforming the alternative embedding models on the majority of tasks, and we therefore recommend Perch 2.0 when developing new linear classifiers for marine mammal classification with few labeled examples.
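The transfer-learning recipe the abstract describes can be sketched in a few lines: embed each clip with a frozen pretrained model, then train a linear classifier on top. The code below is a minimal illustration with synthetic stand-in embeddings; the dimensionality, class structure, and signal strength are assumptions, not details from the paper.

```python
# Minimal sketch of few-shot linear probing on frozen embeddings.
# The array X stands in for vectors produced by a frozen foundation
# model; in practice it would come from the model's embedding API.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# 200 labeled clips, 1280-dim embeddings, 2 call types (all illustrative).
X = rng.normal(size=(200, 1280))
y = rng.integers(0, 2, size=200)
X[y == 1, :10] += 2.0  # give class 1 a signal in a few dimensions

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The "probe": a single linear layer trained on top of frozen features.
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```

Because only the linear layer is trained, this works with very few labeled examples, which is the point of the few-shot comparison in the abstract.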
Beyond Cox Models: Assessing the Performance of Machine-Learning Methods in Non-Proportional Hazards and Non-Linear Survival Analysis
Rossi, Ivan, Sartori, Flavio, Rollo, Cesare, Birolo, Giovanni, Fariselli, Piero, Sanavia, Tiziana
Survival analysis often relies on Cox models, assuming both linearity and proportional hazards (PH). This study evaluates machine and deep learning methods that relax these constraints, comparing their performance with penalized Cox models on a benchmark of three synthetic and three real datasets. In total, eight different models were tested, including six non-linear models, of which four were also non-PH. Although Cox regression often yielded satisfactory performance, we showed the conditions under which machine and deep learning models can perform better. Indeed, the performance of these methods has often been underestimated due to the improper use of Harrell's concordance index (C-index) instead of more appropriate scores such as Antolini's concordance index, which generalizes the C-index to cases where the PH assumption does not hold. In addition, since models with a high C-index are occasionally badly calibrated, combining Antolini's C-index with the Brier score is useful for assessing the overall performance of a survival method. Results on our benchmark data showed that survival prediction should be approached by testing different methods to select the most appropriate one according to sample size, non-linearity, and non-PH conditions. To allow easy reproducibility of these tests on our benchmark data, code and documentation are freely available at https://github.com/compbiomed-unito/survhive.
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- Europe > Netherlands > South Holland > Rotterdam (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
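The metric at the center of this abstract, Harrell's C-index, is simple enough to state in code: among all comparable pairs (where the earlier time is an observed event), it is the fraction ranked correctly by the risk score, with ties counted as half. The following from-scratch sketch uses toy data; it is an illustration of the standard definition, not the paper's implementation.

```python
# From-scratch sketch of Harrell's concordance index for right-censored data.
import numpy as np

def harrell_c_index(time, event, risk):
    """Fraction of comparable pairs ranked correctly by the risk score.

    time  : observed time (event or censoring) per subject
    event : 1 if the event was observed, 0 if censored
    risk  : predicted risk score (higher = earlier expected event)
    """
    concordant, ties, comparable = 0.0, 0.0, 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # (i, j) is comparable iff subject i has an observed event
            # strictly before subject j's observed time.
            if event[i] == 1 and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / comparable

time = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
event = np.array([1, 1, 0, 1, 0])
risk = np.array([0.9, 0.7, 0.6, 0.4, 0.1])  # perfectly anti-ordered with time
print(harrell_c_index(time, event, risk))  # -> 1.0
```

Antolini's generalization replaces the single per-subject risk score with the predicted survival probability evaluated at the earlier subject's event time, which is what makes it valid when hazards cross.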
Stop Chasing the C-index: This Is How We Should Evaluate Our Survival Models
Lillelund, Christian Marius, Qi, Shi-ang, Greiner, Russell, Pedersen, Christian Fischer
We argue that many survival analysis and time-to-event models are incorrectly evaluated. First, we survey many examples of evaluation approaches in the literature and find that most rely on concordance (C-index). However, the C-index only measures a model's discriminative ability and does not assess other important aspects, such as the accuracy of the time-to-event predictions or the calibration of the model's probabilistic estimates. Next, we present a set of key desiderata for choosing the right evaluation metric and discuss their pros and cons. These are tailored to the challenges in survival analysis, such as sensitivity to miscalibration and various censoring assumptions. We hypothesize that the current development of survival metrics conforms to a double-helix ladder, and that model validity and metric validity must stand on the same rung of the assumption ladder. Finally, we discuss the appropriate methods for evaluating a survival model in practice and summarize various viewpoints opposing our analysis.
- North America > Canada > Alberta (0.14)
- North America > United States (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
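The abstract's core claim, that the C-index measures only discrimination, can be demonstrated directly: any strictly increasing transform of the risk scores leaves the C-index unchanged, even though the implied time-to-event predictions differ wildly. The sketch below uses random synthetic data to show this invariance.

```python
# The C-index depends only on the *ordering* of risk scores, so a
# monotone rescaling of the scores cannot change it -- discrimination
# is preserved while calibration and predicted times are not.
import numpy as np

def c_index(time, event, risk):
    num = den = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] == 1 and time[i] < time[j]:
                den += 1
                num += 1.0 if risk[i] > risk[j] else 0.5 if risk[i] == risk[j] else 0.0
    return num / den

rng = np.random.default_rng(1)
time = rng.exponential(10, size=50)
event = rng.integers(0, 2, size=50)
risk = rng.normal(size=50)

c1 = c_index(time, event, risk)
c2 = c_index(time, event, np.exp(5 * risk))  # strictly monotone rescaling
print(c1, c2)  # identical: same ranking, very different "predictions"
```

This is exactly why the authors argue for complementing concordance with calibration-sensitive and time-accuracy metrics.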
Practical Evaluation of Copula-based Survival Metrics: Beyond the Independent Censoring Assumption
Lillelund, Christian Marius, Qi, Shi-ang, Greiner, Russell
Conventional survival metrics, such as Harrell's concordance index and the Brier Score, rely on the independent censoring assumption for valid inference in the presence of right-censored data. However, when instances are censored for reasons related to the event of interest, this assumption no longer holds, as this kind of dependent censoring biases the marginal survival estimates of popular nonparametric estimators. In this paper, we propose three copula-based metrics to evaluate survival models in the presence of dependent censoring, and design a framework to create realistic, semi-synthetic datasets with dependent censoring to facilitate the evaluation of the metrics. Our empirical analyses in synthetic and semi-synthetic datasets show that our metrics can give error estimates that are closer to the true error, mainly in terms of predictive accuracy.
- North America > United States (0.46)
- North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
- Europe > Denmark > Central Jutland > Aarhus (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Research Report > Strength Medium (0.67)
- Law > Civil Rights & Constitutional Law (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
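A semi-synthetic setup with dependent censoring, of the kind this abstract's evaluation framework requires, can be generated with a Gaussian copula: draw correlated normals, map them to correlated uniforms, then push each uniform through an inverse CDF. The marginals and correlation value below are illustrative assumptions, not the paper's actual configuration.

```python
# Sketch: generate event and censoring times that are *dependent*
# via a Gaussian copula, violating the independent-censoring assumption.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, rho = 10_000, 0.8  # rho couples event and censoring times

# Correlated standard normals -> correlated uniforms (the copula step).
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
u_event, u_cens = norm.cdf(z[:, 0]), norm.cdf(z[:, 1])

# Inverse-CDF transform to exponential marginals (rates are illustrative).
t_event = -np.log(1 - u_event) / 0.1
t_cens = -np.log(1 - u_cens) / 0.05

observed_time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)
print(f"event rate: {event.mean():.2f}")
```

Because high-risk subjects now also tend to be censored early, nonparametric marginal estimators such as Kaplan-Meier become biased on data like this, which is the failure mode the proposed metrics are designed to handle.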
Time-to-Event Pretraining for 3D Medical Imaging
Huo, Zepeng, Fries, Jason Alan, Lozano, Alejandro, Valanarasu, Jeya Maria Jose, Steinberg, Ethan, Blankemeier, Louis, Chaudhari, Akshay S., Langlotz, Curtis, Shah, Nigam H.
With the rise of medical foundation models and the growing availability of imaging data, scalable pretraining techniques offer a promising way to identify imaging biomarkers predictive of future disease risk. While current self-supervised methods for 3D medical imaging models capture local structural features like organ morphology, they fail to link pixel biomarkers with long-term health outcomes due to a missing context problem. Current approaches lack the temporal context necessary to identify biomarkers correlated with disease progression, as they rely on supervision derived only from images and concurrent text descriptions. To address this, we introduce time-to-event pretraining, a pretraining framework for 3D medical imaging models that leverages large-scale temporal supervision from paired, longitudinal electronic health records (EHRs). Using a dataset of 18,945 CT scans (4.2 million 2D images) and time-to-event distributions across thousands of EHR-derived tasks, our method improves outcome prediction, achieving an average AUROC increase of 23.7% and a 29.4% gain in Harrell's C-index across 8 benchmark tasks. Importantly, these gains are achieved without sacrificing diagnostic classification performance. This study lays the foundation for integrating longitudinal EHR and 3D imaging data to advance clinical risk prediction.
- Europe > Austria > Vienna (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Middle East > Oman > Muscat Governorate > Muscat (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
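The temporal supervision this abstract describes reduces, per scan and per task, to a (duration, event-indicator) pair derived from the patient's record. The toy sketch below shows that derivation; the field names and dates are illustrative, not the paper's schema.

```python
# Toy sketch: derive a time-to-event label for each scan from
# longitudinal records -- time to the first later diagnosis of
# interest, or a right-censored time at end of follow-up.
from datetime import date

records = [
    {"scan_date": date(2019, 3, 1), "dx_date": date(2021, 9, 1), "followup_end": date(2023, 1, 1)},
    {"scan_date": date(2020, 6, 15), "dx_date": None, "followup_end": date(2023, 1, 1)},
]

def tte_label(rec):
    """Return (days, event): observed event, or right-censored at follow-up end."""
    if rec["dx_date"] is not None:
        return (rec["dx_date"] - rec["scan_date"]).days, 1
    return (rec["followup_end"] - rec["scan_date"]).days, 0

labels = [tte_label(r) for r in records]
print(labels)  # -> [(915, 1), (930, 0)]
```

Repeating this across thousands of EHR-derived outcome definitions yields the large-scale time-to-event distributions used as the pretraining signal.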
A Large-Scale Neutral Comparison Study of Survival Models on Low-Dimensional Data
Burk, Lukas, Zobolas, John, Bischl, Bernd, Bender, Andreas, Wright, Marvin N., Sonabend, Raphael
This work presents the first large-scale neutral benchmark experiment focused on single-event, right-censored, low-dimensional survival data. Benchmark experiments are essential in methodological research to scientifically compare new and existing model classes through proper empirical evaluation. Existing benchmarks in the survival literature are often narrow in scope, focusing, for example, on high-dimensional data. Additionally, they may lack appropriate tuning or evaluation procedures, or are qualitative reviews, rather than quantitative comparisons. This comprehensive study aims to fill the gap by neutrally evaluating a broad range of methods and providing generalizable conclusions. We benchmark 18 models, ranging from classical statistical approaches to many common machine learning methods, on 32 publicly available datasets. The benchmark tunes for both a discrimination measure and a proper scoring rule to assess performance in different settings. Evaluating on 8 survival metrics, we assess discrimination, calibration, and overall predictive performance of the tested models. Using discrimination measures, we find that no method significantly outperforms the Cox model. However, (tuned) Accelerated Failure Time models were able to achieve significantly better results with respect to overall predictive performance as measured by the right-censored log-likelihood. Machine learning methods that performed comparably well include Oblique Random Survival Forests under discrimination, and Cox-based likelihood-boosting under overall predictive performance. We conclude that for predictive purposes in the standard survival analysis setting of low-dimensional, right-censored data, the Cox Proportional Hazards model remains a simple and robust method, sufficient for practitioners.
- Europe > Germany > Bremen > Bremen (0.14)
- North America > United States > Wyoming > Albany County > Laramie (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (11 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Law > Civil Rights & Constitutional Law (0.76)
- Health & Medicine > Therapeutic Area > Hematology (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Optimal Survival Trees: A Dynamic Programming Approach
Huisman, Tim, van der Linden, Jacobus G. M., Demirović, Emir
Survival analysis studies and predicts the time of death, or other singular unrepeated events, based on historical data, while the true time of death for some instances is unknown. Survival trees enable the discovery of complex nonlinear relations in a compact, human-comprehensible model by recursively splitting the population and predicting a distinct survival distribution in each leaf node. We use dynamic programming to provide the first survival tree method with optimality guarantees, enabling the assessment of the optimality gap of heuristics. We improve the scalability of our method through a special algorithm for computing trees up to depth two. The experiments show that our method's run time even outperforms some heuristics in realistic cases, while obtaining out-of-sample performance similar to that of the state-of-the-art.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- (4 more...)
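The exhaustive search that dynamic programming makes tractable can be sketched as a depth-limited recursion over split thresholds. For brevity, the toy version below scores each leaf with the closed-form censored log-likelihood of an exponential survival model; the paper's actual objective, memoization, and depth-two specialization are more sophisticated than this.

```python
# Sketch: depth-limited exact tree search by recursion over thresholds,
# with an exponential-model negative log-likelihood as the leaf loss.
import numpy as np

def leaf_loss(time, event):
    """Negative log-likelihood of an exponential model fit to one leaf."""
    total_time = time.sum()
    d = event.sum()
    if d == 0 or total_time == 0:
        return 0.0
    lam = d / total_time  # MLE of the hazard rate
    return -(d * np.log(lam) - lam * total_time)

def best_tree(x, time, event, depth):
    """Return the minimal loss achievable by any tree of the given depth."""
    loss = leaf_loss(time, event)  # option 1: stop and make this a leaf
    if depth == 0 or len(x) < 2:
        return loss
    for thr in np.unique(x)[:-1]:  # option 2: try every split threshold
        left = x <= thr
        cand = (best_tree(x[left], time[left], event[left], depth - 1)
                + best_tree(x[~left], time[~left], event[~left], depth - 1))
        loss = min(loss, cand)
    return loss

rng = np.random.default_rng(0)
x = rng.uniform(size=40)
time = rng.exponential(1 + 4 * (x > 0.5))  # hazard shifts at x = 0.5
event = np.ones(40, dtype=int)
print(f"depth 0 loss: {best_tree(x, time, event, 0):.2f}")
print(f"depth 2 loss: {best_tree(x, time, event, 2):.2f}")
```

Because every threshold is enumerated at every node, the returned loss is exact for the given depth, which is what lets a method like this certify the optimality gap of greedy heuristics.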
Towards Clinical Prediction with Transparency: An Explainable AI Approach to Survival Modelling in Residential Aged Care
Background: Accurate survival time estimates aid end-of-life medical decision-making. Objectives: Develop an interpretable survival model for elderly residential aged care residents using advanced machine learning. Setting: A major Australasian residential aged care provider. Participants: Residents aged 65+ admitted for long-term care from July 2017 to August 2023. Sample size: 11,944 residents across 40 facilities. Predictors: Factors include age, gender, health status, co-morbidities, cognitive function, mood, nutrition, mobility, smoking, sleep, skin integrity, and continence. Outcome: Probability of survival post-admission, specifically calibrated for 6-month survival estimates. Statistical Analysis: Tested CoxPH, EN, RR, Lasso, GB, XGB, and RF models in 20 experiments with a 90/10 train/test split. Evaluated accuracy using Harrell's C-index, dynamic AUROC, IBS, and calibrated ROC. Chose XGB for its performance and calibrated it for 1, 3, 6, and 12-month predictions using Platt scaling. Employed SHAP values to analyze predictor impacts. Results: GB, XGB, and RF models showed the highest C-index values (0.714, 0.712, 0.712). The optimal XGB model demonstrated a 6-month survival prediction AUROC of 0.746 (95% CI 0.744-0.749). Key mortality predictors include age, male gender, mobility, health status, pressure ulcer risk, and appetite. Conclusions: The study successfully applies machine learning to create a survival model for aged care, aligning with clinical insights on mortality risk factors and enhancing model interpretability and clinical utility through explainable AI.
- Oceania > New Zealand > North Island > Manawatū-Whanganui > Palmerston North (0.04)
- Oceania > Australia (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.84)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.70)
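The Platt scaling step this abstract mentions is a small calibration model in its own right: a logistic regression fit from the raw model scores to the binary outcome. The sketch below uses synthetic scores and outcomes; the upstream model, score distribution, and event rate are all assumptions for illustration.

```python
# Sketch of Platt scaling: map raw (possibly over-confident) model
# scores to calibrated probabilities with a one-feature logistic fit.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic raw scores from some upstream model (e.g. a boosted tree)
# and the binary outcome "event within 6 months".
raw_scores = rng.normal(size=500)
outcome = (rng.random(500) < 1 / (1 + np.exp(-0.5 * raw_scores))).astype(int)

# Platt scaling: logistic regression on the score alone.
platt = LogisticRegression().fit(raw_scores.reshape(-1, 1), outcome)
calibrated = platt.predict_proba(raw_scores.reshape(-1, 1))[:, 1]
print(f"mean calibrated probability: {calibrated.mean():.2f}")
print(f"observed event rate:         {outcome.mean():.2f}")
```

A useful property visible here is that the fitted logistic map matches the average predicted probability to the observed event rate, which is what "calibrated for 6-month survival estimates" buys clinically.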
Simulating discrimination in virtual reality
Have you ever been advised to "walk a mile in someone else's shoes?" Considering another person's perspective can be a challenging endeavor -- but recognizing our errors and biases is key to building understanding across communities. By challenging our preconceptions, we confront prejudice, such as racism and xenophobia, and potentially develop a more inclusive perspective about others. To assist with perspective-taking, MIT researchers have developed "On the Plane," a virtual reality role-playing game (VR RPG) that simulates discrimination. In this case, the game portrays xenophobia directed against a Malaysian American woman, but the approach can be generalized.
Embrace Artificial Intelligence in the Pharmacy
Technology has begun playing a larger role in community pharmacy and the power of artificial intelligence (AI) is poised to transform the world of pharmacy as we know it. During a presentation at the National Community Pharmacists Association 2022 Annual Convention, held October 1 through 4 in Kansas City, Missouri, Paige Clark, RPh, vice president of pharmacy programs and policy at Prescryptive Health, discussed the advantages of incorporating AI into pharmacy practices and shared the results of a nationwide survey, conducted by Prescryptive Health, focused on the use of technology and AI in pharmacies. "Here's the promise of technology: 62% of decision makers say artificial intelligence will positively disrupt the pharmacy landscape within 1 to 2 years," Clark said. "Ninety-one percent of decision makers believe that technology can improve their pharmacies' profitability without question." In terms of pricing, AI is able to compare the prices of major chain pharmacies with local competitors and identify the granular details of pricing.