AITopics | Cushing

Performance of weakly-supervised electronic health record-based phenotyping methods in rare-outcome settings

Hong, Yunjing, Nelson, Jennifer C., Williamson, Brian D.

arXiv.org Machine Learning | Apr-14-2026

Accurately identifying patients with specific medical conditions is a key challenge when using clinical data from electronic health records. Our objective was to comprehensively assess when weakly-supervised prediction methods, which use silver-standard labels (proxy measures of the true outcome) rather than gold-standard true labels, perform well in rare-outcome settings like vaccine safety studies. We compared three methods (PheNorm, MAP, and sureLDA) that combine structured features and features derived from clinical text using natural language processing, through an extensive simulation study with data-generating mechanisms ranging from simple to complex, varying outcome rates, and varying degrees of informative silver labels. We also considered using predicted probabilities to design a chart review validation study. No single method dominated the others across all prediction performance metrics. Probability-guided sampling selected a cohort enriched for patients with more mentions of important concepts in chart notes. SureLDA, the most complex of the three algorithms we considered, often performed well in simulations. Performance depended greatly on selected tuning parameters. Care should be taken when using weakly-supervised prediction methods in rare-outcome settings, particularly if the probabilities will be used in downstream analysis, but these methods can work well when silver labels are strong predictors of true outcomes.

artificial intelligence, machine learning, natural language, (17 more...)

2604.09913

Country:

North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Alaska (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(2 more...)

Sequential Audit Sampling with Statistical Guarantees

Kato, Masahiro, Nakagawa, Kei

arXiv.org Machine Learning | Apr-8-2026

Financial statement auditing is conducted under a risk-based evidence approach to obtain reasonable assurance. In practice, auditors often perform additional sampling or related procedures when an initial sample does not provide a sufficient basis for a conclusion. Across jurisdictions, current standards and practice manuals acknowledge such extensions, while the statistical design of sequential audit procedures has not been fully explored. This study formulates audit sampling with additional, sequentially collected items as a sequential testing problem for a finite population under sampling without replacement. We define null and alternative hypotheses in terms of a tolerable deviation rate, specify stopping and decision rules, and formulate exact sequential boundary conditions in terms of finite-population error probabilities. For practical implementation, we calibrate those boundaries by Monte Carlo simulation at least-favorable deviation rates. The exact design yields ex ante control of decision error probabilities, and the simulation-based implementation approximates that design while allowing the computation of expected stopping times. The framework is most naturally suited to attribute auditing and deviation-rate auditing, especially tests of controls, and it can be extended to one-sided, two-stage, and truncated designs.
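The simulation-based side of the summary above can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `simulate_sequential_audit`, the staged look schedule, and the integer accept/reject boundaries are all assumptions made for illustration. It draws items without replacement from a finite population containing a fixed number of deviations and estimates, by Monte Carlo, the decision probabilities and the expected number of items examined.

```python
import random

def simulate_sequential_audit(N, D, stages, accept_max, reject_min,
                              n_sims=2000, seed=0):
    """Monte Carlo estimate for a staged audit sample (illustrative only).

    N: population size; D: number of deviating items (hypothetical truth).
    stages: cumulative sample sizes at which the auditor checks the data.
    accept_max[k]: accept if cumulative deviations <= this at stage k.
    reject_min[k]: reject if cumulative deviations >= this at stage k.
    """
    # The last stage must always yield a decision (truncated design).
    if accept_max[-1] + 1 != reject_min[-1]:
        raise ValueError("final stage must force a decision")
    rng = random.Random(seed)
    population = [1] * D + [0] * (N - D)
    n_accept = 0
    total_sampled = 0
    for _ in range(n_sims):
        # One draw of the maximum sample size, without replacement.
        draw = rng.sample(population, stages[-1])
        for k, n in enumerate(stages):
            found = sum(draw[:n])
            if found <= accept_max[k]:
                n_accept += 1
                total_sampled += n
                break
            if found >= reject_min[k]:
                total_sampled += n
                break
    p_accept = n_accept / n_sims
    return {
        "p_accept": p_accept,
        "p_reject": 1 - p_accept,
        "expected_n": total_sampled / n_sims,
    }

if __name__ == "__main__":
    # Hypothetical design: 1,000 items, 5% true deviation rate, two looks.
    print(simulate_sequential_audit(N=1000, D=50, stages=[30, 60],
                                    accept_max=[0, 1], reject_min=[3, 2]))
```

Running such a simulator with `D` set at the tolerable deviation rate, and adjusting the boundaries until the estimated error probability falls below a target, is one plausible reading of "calibration at least-favorable deviation rates"; the paper's exact design may differ.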

artificial intelligence, boundary, machine learning, (16 more...)

2604.06116

Country:

North America > United States > Oklahoma > Payne County > Cushing (0.04)
Asia > Malaysia (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Government > Regional Government > North America Government > United States Government (0.95)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

My Tesla Was Driving Itself Perfectly--Until It Crashed

The Atlantic - Technology | Mar-17-2026, 11:00:00 GMT

This article was featured in the One Story to Read Today newsletter. The smell was strange. The concrete wall was too close. One of my kids was standing on the sidewalk next to our car--not crying, just confused. The seat belt had held. The crumple zone had crumpled.

accident, artificial intelligence, tesla, (10 more...)

Country:

North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)

b6b5f50a2001ad1cbccca96e693c4ab4-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-11-2026, 13:05:15 GMT

diagnosis, opensrh, representation, (16 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > New York (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Asymptotic Theory and Phase Transitions for Variable Importance in Quantile Regression Forests

Nakamura, Tomoshige, Shiraishi, Hiroshi

arXiv.org Machine Learning | Dec-1-2025

Quantile Regression Forests (QRF) are widely used for non-parametric conditional quantile estimation, yet statistical inference for variable importance measures remains challenging due to the non-smoothness of the loss function and the complex bias-variance trade-off. In this paper, we develop an asymptotic theory for variable importance defined as the difference in pinball loss risks. We first establish the asymptotic normality of the QRF estimator by handling the non-differentiable pinball loss via Knight's identity. Second, we uncover a "phase transition" phenomenon governed by the subsampling rate $β$ (where $s \asymp n^β$). We prove that in the bias-dominated regime ($β\ge 1/2$), which corresponds to large subsample sizes typically favored in practice to maximize predictive accuracy, standard inference breaks down as the estimator converges to a deterministic bias constant rather than a zero-mean normal distribution. Finally, we derive the explicit analytic form of this asymptotic bias and discuss the theoretical feasibility of restoring valid inference via analytic bias correction. Our results highlight a fundamental trade-off between predictive performance and inferential validity, providing a theoretical foundation for understanding the intrinsic limitations of random forest inference in high-dimensional settings.
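For readers unfamiliar with the pinball loss mentioned above, a minimal sketch follows. It assumes the standard check-loss definition for quantile level `tau`, and it assumes that the empirical importance is the loss of predictions made without a variable minus the loss of predictions made with it. The function names and the sign convention are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Mean pinball (check) loss of quantile predictions q at level tau."""
    y = np.asarray(y, dtype=float)
    q = np.asarray(q, dtype=float)
    diff = y - q
    # tau * diff when y >= q, (tau - 1) * diff when y < q.
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def pinball_importance(y, q_full, q_reduced, tau):
    """Assumed convention: loss without the variable minus loss with it."""
    return pinball_loss(y, q_reduced, tau) - pinball_loss(y, q_full, tau)

if __name__ == "__main__":
    # Hypothetical predictions from a full and a reduced model.
    y = [1.0, 2.0, 3.0]
    print(pinball_importance(y, q_full=[1.0, 2.0, 3.0],
                             q_reduced=[0.0, 2.0, 4.0], tau=0.9))
```

Under this convention, a positive value means that dropping the variable worsened the quantile predictions on the evaluation data.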

convergence rate, estimator, variance, (11 more...)

2511.23212

Country:

North America > United States > Oklahoma > Payne County > Cushing (0.04)
Asia > Pakistan (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Prunella Scales: From Fawlty Towers to Great Canal Journeys

BBC News | Oct-28-2025, 10:58:40 GMT

Prunella Scales, who died at the age of 93, was one of Britain's finest comic actors. But despite a long and distinguished career on stage and screen, she will inevitably be remembered as Sybil Fawlty in the 1970s TV comedy, Fawlty Towers. It was Sybil's mission in life to keep tabs on her stick insect husband Basil - played by John Cleese - between cigarette-fuelled phone conversations with her friend, Audrey. It fell to her to placate guests who had been shouted at, totally ignored or, in some cases, throttled by Basil when in one of his more manic moods. Her nightmarish laugh, gravity-defying hairdo and ferocious temper were part of a carefully constructed character that ranks as a comic masterpiece.

fawlty tower, great canal journey, prunella scale, (11 more...)

Country:

South America (0.15)
North America > Central America (0.15)
Oceania > Australia (0.05)
(16 more...)

Genre: Personal > Obituary (0.50)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (0.48)

Technology: Information Technology > Artificial Intelligence (0.48)

OpenSRH: optimizing brain tumor surgery using intraoperative stimulated Raman histology

Neural Information Processing Systems | Aug-18-2025, 03:08:29 GMT

Accurate intraoperative diagnosis is essential for providing safe and effective care during brain tumor surgery.

artificial intelligence, deep learning, machine learning, (17 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Evaluation of a Foundational Model and Stochastic Models for Forecasting Sporadic or Spiky Production Outages of High-Performance Machine Learning Services

Yim, Keun Soo

arXiv.org Artificial Intelligence | Jul-3-2025

Time series forecasting models have diverse real-world applications (e.g., from electricity metrics to software workload). The latest foundational models trained for time series forecasting show strengths (e.g., for long sequences and in zero-shot settings). However, foundational models have not yet been used for forecasting rare, spiky events, a challenging target because such events are a corner case of extreme events. In this paper, we optimize a state-of-the-art foundational model to forecast sporadic or spiky production outages of high-performance machine learning services powering billions of client devices. We evaluate the forecasting errors of the foundational model compared with classical stochastic forecasting models (e.g., moving average and autoregressive). The analysis helps us understand how each of the evaluated models performs for the sporadic or spiky events. For example, it identifies the key patterns in the target data that are well tracked by the foundational model vs. each of the stochastic models. We use the models with optimal parameters to estimate year-long outage statistics for a particular root cause with less than 6% value errors.

data mining, large language model, machine learning, (22 more...)

2507.01067

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > United States > Tennessee (0.04)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (1.00)
Education (1.00)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing

Fein, Daniel, Russo, Sebastian, Xiang, Violet, Jolly, Kabir, Rafailov, Rafael, Haber, Nick

arXiv.org Artificial Intelligence | Jul-2-2025

Evaluating creative writing generated by large language models (LLMs) remains challenging because open-ended narratives lack ground truths. Without performant automated evaluation methods, off-the-shelf (OTS) language models are employed as zero-shot judges, yet their reliability is unclear in this context. In pursuit of robust evaluation for creative writing, we introduce LitBench, the first standardized benchmark and paired dataset for creative writing verification, comprising a held-out test set of 2,480 debiased, human-labeled story comparisons drawn from Reddit and a 43,827-pair training corpus of human preference labels. Using LitBench, we (i) benchmark zero-shot LLM judges, (ii) train Bradley-Terry and generative reward models, and (iii) conduct an online human study to validate reward model rankings on newly LLM-generated stories. Our benchmark identifies Claude-3.7-Sonnet as the strongest off-the-shelf judge, reaching 73% agreement with human preferences; among trained reward models, Bradley-Terry and generative reward models both attain an accuracy of 78%, outperforming all off-the-shelf judges. An online human study further confirms that our trained reward models consistently align with human preferences in novel LLM-generated stories. We release LitBench and reward models at https://huggingface.co/collections/SAA-Lab/litbench-68267b5da3aafe58f9e43461, providing a vetted resource for reliable, automated evaluation and optimization of creative writing systems.
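As background for the Bradley-Terry reward models and the agreement figures mentioned above, here is a minimal sketch of the standard Bradley-Terry preference probability, its negative log-likelihood on preference pairs, and a pairwise accuracy measure. The function names and the toy scores are hypothetical; this is not the LitBench training or evaluation code.

```python
import math

def bt_prob(r_a, r_b):
    """Bradley-Terry probability that item A is preferred over item B,
    given scalar reward scores r_a and r_b."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

def bt_nll(pairs):
    """Mean negative log-likelihood over (chosen_score, rejected_score) pairs."""
    pairs = list(pairs)
    return -sum(math.log(bt_prob(c, r)) for c, r in pairs) / len(pairs)

def pairwise_accuracy(pairs):
    """Fraction of pairs where the human-chosen story gets the higher score."""
    pairs = list(pairs)
    return sum(1 for c, r in pairs if c > r) / len(pairs)

if __name__ == "__main__":
    # Hypothetical reward scores: (score of chosen story, score of rejected story).
    scores = [(1.2, 0.3), (0.1, 0.8), (2.0, 1.5), (0.9, -0.4)]
    print(pairwise_accuracy(scores), bt_nll(scores))
```

Pairwise accuracy of this kind is one common way to report agreement with human preference labels; whether it matches the benchmark's exact metric is an assumption.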

large language model, machine learning, natural language, (18 more...)

2507.00769

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > Michigan (0.04)
(4 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Using Machine Learning in Analyzing Air Quality Discrepancies of Environmental Impact

Wang, Shuangbao Paul, Yang, Lucas, Chouchane, Rahouane, Guo, Jin, Bailey, Michael

arXiv.org Artificial Intelligence | Jun-24-2025

In this study, we apply machine learning and software engineering to analyze air pollution levels in the City of Baltimore. The data model was fed with three primary data sources: 1) a biased method of estimating insurance risk used by the Home Owners' Loan Corporation, 2) demographics of Baltimore residents, and 3) census data estimates of NO2 and PM2.5 concentrations. The dataset covers 650,643 Baltimore residents among 44.7 million residents of 202 major US cities. The results show that air pollution levels have a clear association with the biased insurance estimating method. Large disparities are present in NO2 levels between more desirable and low-income blocks. Similar disparities exist in air pollution levels across residents' ethnicities. As Baltimore's population consists of a greater proportion of people of color, the finding reveals how decades-old policies have continued to discriminate against Baltimore citizens and affect their quality of life today.

artificial intelligence, machine learning, resident, (14 more...)

2506.17319

Country:

North America > United States > Maryland > Baltimore (0.05)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > New Finding (0.54)

Industry: Law > Environmental Law (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
