iqr
Towards Optimal Valve Prescription for Transcatheter Aortic Valve Replacement (TAVR) Surgery: A Machine Learning Approach
Paschalidis, Phevos, Stoumpou, Vasiliki, Everest, Lisa, Ma, Yu, Azemi, Talhat, Haider, Jawad, Zweibel, Steven, Protopapas, Eleftherios M., Mather, Jeff, Tysarowski, Maciej, Sarris, George E., Hagberg, Robert C., Haronian, Howard L., Bertsimas, Dimitris
Transcatheter Aortic Valve Replacement (TAVR) has emerged as a minimally invasive treatment option for patients with severe aortic stenosis, a life-threatening cardiovascular condition. Multiple transcatheter heart valves (THV) have been approved for use in TAVR, but current guidelines regarding valve type prescription remain an active topic of debate. We propose a data-driven clinical support tool to identify the optimal valve type with the objective of minimizing the risk of permanent pacemaker implantation (PPI), a predominant postoperative complication. We synthesize a novel dataset that combines U.S. and Greek patient populations and integrates three distinct data sources (patient demographics, computed tomography scans, echocardiograms) while harmonizing differences in each country's record system. We introduce a leaf-level analysis to leverage population heterogeneity and avoid benchmarking against uncertain counterfactual risk estimates. The final prescriptive model shows a reduction in PPI rates of 26% and 16% compared with the current standard of care in our internal U.S. population and external Greek validation cohort, respectively. To the best of our knowledge, this work represents the first unified, personalized prescription strategy for THV selection in TAVR.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Connecticut > Hartford County > Hartford (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
From Narratives to Probabilistic Reasoning: Predicting and Interpreting Drivers' Hazardous Actions in Crashes Using Large Language Model
Chen, Boyou, Xu, Gerui, Wang, Zifei, Guo, Huizhong, Ahmed, Ananna, Sun, Zhaonan, Hu, Zhen, Zhang, Kaihan, Bao, Shan
Vehicle crashes involve complex interactions between road users, split-second decisions, and challenging environmental conditions. Among these, two-vehicle crashes are the most prevalent, accounting for approximately 70% of roadway crashes and posing a significant challenge to traffic safety. Identifying Driver Hazardous Action (DHA) is essential for understanding crash causation, yet the reliability of DHA data in large-scale databases is limited by inconsistent and labor-intensive manual coding practices. Here, we present an innovative framework that leverages a fine-tuned large language model to automatically infer DHAs from textual crash narratives, thereby improving the validity and interpretability of DHA classifications. Using five years of two-vehicle crash data from MTCF, we fine-tuned the Llama 3.2 1B model on detailed crash narratives and benchmarked its performance against conventional machine learning classifiers, including Random Forest, XGBoost, CatBoost, and a neural network. The fine-tuned LLM achieved an overall accuracy of 80%, surpassing all baseline models and demonstrating pronounced improvements in scenarios with imbalanced data. To increase interpretability, we developed a probabilistic reasoning approach, analyzing model output shifts across original test sets and three targeted counterfactual scenarios: variations in driver distraction and age. Our analysis revealed that introducing distraction for one driver substantially increased the likelihood of "General Unsafe Driving"; distraction for both drivers maximized the probability of "Both Drivers Took Hazardous Actions"; and assigning a teen driver markedly elevated the probability of "Speed and Stopping Violations." Our framework and analytical methods provide a robust and interpretable solution for large-scale automated DHA detection, offering new opportunities for traffic safety analysis and intervention.
- North America > United States > Michigan > Wayne County > Dearborn (0.15)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Asia > Middle East > Jordan (0.04)
- (4 more...)
- Transportation > Ground > Road (1.00)
- Health & Medicine (1.00)
- Automobiles & Trucks (1.00)
Uncertainty-Guided Model Selection for Tabular Foundation Models in Biomolecule Efficacy Prediction
Li, Jie, McCarthy, Andrew, Zhang, Zhizhuo, Young, Stephen
In-context learners like TabPFN are promising for biomolecule efficacy prediction, where established molecular feature sets and relevant experimental results can serve as powerful contextual examples. However, their performance is highly sensitive to the provided context, making strategies like post-hoc ensembling of models trained on different data subsets a viable approach. An open question is how to select the best models for the ensemble without access to ground truth labels. In this study, we investigate an uncertainty-guided strategy for model selection. We demonstrate on an siRNA knockdown efficacy task that a TabPFN model using straightforward sequence-based features can surpass specialized state-of-the-art predictors. We also show that the model's predicted inter-quantile range (IQR), a measure of its uncertainty, has a negative correlation with true prediction error. We developed the OligoICP method, which selects and averages an ensemble of models with the lowest mean IQR for siRNA efficacy prediction, achieving superior performance compared to naive ensembling or using a single model trained on all available data. This finding highlights model uncertainty as a powerful, label-free heuristic for optimizing biomolecule efficacy predictions.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > San Mateo County > South San Francisco (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
A Compute resources used
Table 3 shows the full results with unlikelihood training and length normalization.COP A H-Swag StoryCloze Winogrande WSC WiC FT 78 .0 PEFT methods we considered and ablate the losses. We use "Question:" and "Answer:" as Since T0 is unable to perform ICL on its own, we also compare to T5+LM, the next-step-prediction language model upon which T0 is based. Due to memory constraints and because of its improved performance, we use ensemble ICL for Table table 10 shows the T-Few ablation results. Per-dataset results of T-Few and the other top-5 methods on RAFT are shown in table 11. 18 # of Param COP A H-Swag StoryCloze WinograndeFull Model Fine-tuning 3B 81 .0
Performance Characterization of a Point-Cloud-Based Path Planner in Off-Road Terrain
Majhor, Casey D., Bos, Jeremy P.
We present a comprehensive evaluation of a point-cloud-based navigation stack, MUONS, for autonomous off-road navigation. Performance is characterized by analyzing the results of 30,000 planning and navigation trials in simulation and validated through field testing. Our simulation campaign considers three kinematically challenging terrain maps and twenty combinations of seven path-planning parameters. In simulation, our MUONS-equipped AGV achieved a 0.98 success rate and experienced no failures in the field. By statistical and correlation analysis we determined that the Bi-RRT expansion radius used in the initial planning stages is most correlated with performance in terms of planning time and traversed path length. Finally, we observed that the proportional variation due to changes in the tuning parameters is remarkably well correlated to performance in field testing. This finding supports the use of Monte-Carlo simulation campaigns for performance assessment and parameter tuning.
- North America > United States > Michigan > Macomb County > Warren (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
- Information Technology > Services (0.70)
- Transportation > Ground > Road (0.67)
Towards Synthesizing Normative Data for Cognitive Assessments Using Generative Multimodal Large Language Models
Yan, Victoria, Chotkowski, Honor, Wang, Fengran, Li, Xinhui, Yang, Carl, Lu, Jiaying, Yan, Runze, Hu, Xiao, Fedorov, Alex
Cognitive assessments require normative data as essential benchmarks for evaluating individual performance. Hence, developing new cognitive tests based on novel image stimuli is challenging due to the lack of readily available normative data. Traditional data collection methods are costly, time-consuming, and infrequently updated, limiting their practical utility. Recent advancements in generative multimodal large language models (MLLMs) offer a new approach to generate synthetic normative data from existing cognitive test images. We investigated the feasibility of using MLLMs, specifically GPT-4o and GPT-4o-mini, to synthesize normative textual responses for established image-based cognitive assessments, such as the "Cookie Theft" picture description task. Two distinct prompting strategies-naive prompts with basic instructions and advanced prompts enriched with contextual guidance-were evaluated. Responses were analyzed using embeddings to assess their capacity to distinguish diagnostic groups and demographic variations. Performance metrics included BLEU, ROUGE, BERTScore, and an LLM-as-a-judge evaluation. Advanced prompting strategies produced synthetic responses that more effectively distinguished between diagnostic groups and captured demographic diversity compared to naive prompts. Superior models generated responses exhibiting higher realism and diversity. BERTScore emerged as the most reliable metric for contextual similarity assessment, while BLEU was less effective for evaluating creative outputs. The LLM-as-a-judge approach provided promising preliminary validation results. Our study demonstrates that generative multimodal LLMs, guided by refined prompting methods, can feasibly generate robust synthetic normative data for existing cognitive tests, thereby laying the groundwork for developing novel image-based cognitive assessments without the traditional limitations.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.49)
Deep Active Learning for Lung Disease Severity Classification from Chest X-rays: Learning with Less Data in the Presence of Class Imbalance
Gabriel, Roy M., Zandehshahvar, Mohammadreza, van Assen, Marly, Kittisut, Nattakorn, Peters, Kyle, De Cecco, Carlo N., Adibi, Ali
To reduce the amount of required labeled data for lung disease severity classification from chest X-rays (CXRs) under class imbalance, this study applied deep active learning with a Bayesian Neural Network (BNN) approximation and weighted loss function. This retrospective study collected 2,319 CXRs from 963 patients (mean age, 59.2 $\pm$ 16.6 years; 481 female) at Emory Healthcare affiliated hospitals between January and November 2020. All patients had clinically confirmed COVID-19. Each CXR was independently labeled by 3 to 6 board-certified radiologists as normal, moderate, or severe. A deep neural network with Monte Carlo Dropout was trained using active learning to classify disease severity. Various acquisition functions were used to iteratively select the most informative samples from an unlabeled pool. Performance was evaluated using accuracy, area under the receiver operating characteristic curve (AU ROC), and area under the precision-recall curve (AU PRC). Training time and acquisition time were recorded. Statistical analysis included descriptive metrics and performance comparisons across acquisition strategies. Entropy Sampling achieved 93.7% accuracy (AU ROC, 0.91) in binary classification (normal vs. diseased) using 15.4% of the training data. In the multi-class setting, Mean STD sampling achieved 70.3% accuracy (AU ROC, 0.86) using 23.1% of the labeled data. These methods outperformed more complex and computationally expensive acquisition functions and significantly reduced labeling needs. Deep active learning with BNN approximation and weighted loss effectively reduces labeled data requirements while addressing class imbalance, maintaining or exceeding diagnostic performance.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Netherlands > Drenthe > Assen (0.05)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- (2 more...)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.84)
Evaluating Time Series Models for Urban Wastewater Management: Predictive Performance, Model Complexity and Resilience
Singh, Vipin, Ling, Tianheng, Chiaburu, Teodor, Biessmann, Felix
Climate change increases the frequency of extreme rainfall, placing a significant strain on urban infrastructures, especially Combined Sewer Systems (CSS). Overflows from overburdened CSS release untreated wastewater into surface waters, posing environmental and public health risks. Although traditional physics-based models are effective, they are costly to maintain and difficult to adapt to evolving system dynamics. Machine Learning (ML) approaches offer cost-efficient alternatives with greater adaptability. To systematically assess the potential of ML for modeling urban infrastructure systems, we propose a protocol for evaluating Neural Network architectures for CSS time series forecasting with respect to predictive performance, model complexity, and robustness to perturbations. In addition, we assess model performance on peak events and critical fluctuations, as these are the key regimes for urban wastewater management. To investigate the feasibility of lightweight models suitable for IoT deployment, we compare global models, which have access to all information, with local models, which rely solely on nearby sensor readings. Additionally, to explore the security risks posed by network outages or adversarial attacks on urban infrastructure, we introduce error models that assess the resilience of models. Our results demonstrate that while global models achieve higher predictive performance, local models provide sufficient resilience in decentralized scenarios, ensuring robust modeling of urban infrastructure. Furthermore, models with longer native forecast horizons exhibit greater robustness to data perturbations. These findings contribute to the development of interpretable and reliable ML solutions for sustainable urban wastewater management. The implementation is available in our GitHub repository.
- Europe > Germany > Berlin (0.04)
- North America > United States > District of Columbia > Washington (0.04)
Interaction Techniques that Encourage Longer Prompts Can Improve Psychological Ownership when Writing with AI
Writing longer prompts for an AI assistant to generate a short story increases psychological ownership, a user's feeling that the writing belongs to them. To encourage users to write longer prompts, we evaluated two interaction techniques that modify the prompt entry interface of chat-based generative AI assistants: pressing and holding the prompt submission button, and continuously moving a slider up and down when submitting a short prompt. A within-subjects experiment investigated the effects of such techniques on prompt length and psychological ownership, and results showed that these techniques increased prompt length and led to higher psychological ownership than baseline techniques. A second experiment further augmented these techniques by showing AI-generated suggestions for how the prompts could be expanded. This further increased prompt length, but did not lead to improvements in psychological ownership. Our results show that simple interface modifications like these can elicit more writing from users and improve psychological ownership.
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.14)
- Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.14)
- North America > United States > New York > New York County > New York City (0.06)
- (12 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)