Detecting and Steering LLMs' Empathy in Action
We investigate empathy-in-action -- the willingness to sacrifice task efficiency to address human needs -- as a linear direction in LLM activation space. Using contrastive prompts grounded in the Empathy-in-Action (EIA) benchmark, we test detection and steering across Phi-3-mini-4k (3.8B), Qwen2.5-7B (safety-trained), and Dolphin-Llama-3.1-8B (uncensored). Detection: All models show AUROC 0.996-1.00 at optimal layers. Uncensored Dolphin matches safety-trained models, demonstrating empathy encoding emerges independent of safety training. Phi-3 probes correlate strongly with EIA behavioral scores (r=0.71, p<0.01). Cross-model probe agreement is limited (Qwen: r=-0.06, Dolphin: r=0.18), revealing architecture-specific implementations despite convergent detection. Steering: Qwen achieves 65.3% success with bidirectional control and coherence at extreme interventions. Phi-3 shows 61.7% success with similar coherence. Dolphin exhibits asymmetric steerability: 94.4% success for pro-empathy steering but catastrophic breakdown for anti-empathy (empty outputs, code artifacts). Implications: The detection-steering gap varies by model. Qwen and Phi-3 maintain bidirectional coherence; Dolphin shows robustness only for empathy enhancement. Safety training may affect steering robustness rather than preventing manipulation, though validation across more models is needed.
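The probing approach described above can be sketched as a difference-of-means direction fitted on contrastive activations, with steering as addition of the scaled direction to a hidden state. This is a minimal illustration on synthetic activations, not the authors' exact pipeline; the layer choice, array shapes, and steering coefficient are assumptions:

```python
import numpy as np

def fit_direction(acts_pos, acts_neg):
    """Difference-of-means direction separating 'empathic' from
    'non-empathic' prompt activations at one layer."""
    d = acts_pos.mean(axis=0) - acts_neg.mean(axis=0)
    return d / np.linalg.norm(d)

def probe_score(direction, activation):
    """Projection onto the direction; larger = more empathy-like."""
    return float(activation @ direction)

def steer(hidden, direction, alpha):
    """Add the scaled direction to a hidden state (pro-empathy for
    alpha > 0, anti-empathy for alpha < 0)."""
    return hidden + alpha * direction

# Synthetic activations: a shared base plus/minus a planted direction.
rng = np.random.default_rng(0)
base = rng.normal(size=(32, 64))
planted = np.zeros(64)
planted[0] = 1.0
pos = base + 2.0 * planted   # "empathic" activations
neg = base - 2.0 * planted   # "non-empathic" activations
d = fit_direction(pos, neg)
```

Detection then reduces to thresholding `probe_score`, and steering to applying `steer` at the chosen layer during the forward pass.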
CEDL: Centre-Enhanced Discriminative Learning for Anomaly Detection
Darban, Zahra Zamanzadeh, Wang, Qizhou, Aggarwal, Charu C., Webb, Geoffrey I., Abbasnejad, Ehsan, Salehi, Mahsa
Supervised anomaly detection methods perform well in identifying known anomalies that are well represented in the training set. However, they often struggle to generalise beyond the training distribution due to decision boundaries that lack a clear definition of normality. Existing approaches typically address this by regularising the representation space during training, leading to separate optimisation in latent and label spaces. The learned normality is therefore not directly utilised at inference, and their anomaly scores often fall within arbitrary ranges that require explicit mapping or calibration for probabilistic interpretation. To achieve unified learning of geometric normality and label discrimination, we propose Centre-Enhanced Discriminative Learning (CEDL), a novel supervised anomaly detection framework that embeds geometric normality directly into the discriminative objective. CEDL reparameterises the conventional sigmoid-derived prediction logit through a centre-based radial distance function, unifying geometric and discriminative learning in a single end-to-end formulation. This design enables interpretable, geometry-aware anomaly scoring without post-hoc thresholding or reference calibration. Extensive experiments on tabular, time-series, and image data demonstrate that CEDL achieves competitive and balanced performance across diverse real-world anomaly detection tasks, validating its effectiveness and broad applicability.
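The centre-based reparameterisation can be illustrated with a toy logit of the form r^2 - ||z - c||^2, so that the sigmoid output reads directly as inside-versus-outside the normality ball with no post-hoc calibration. This is a minimal sketch under assumed notation (centre c, radius r, squared distances); the paper's exact radial function may differ:

```python
import numpy as np

def radial_logit(z, centre, radius):
    """Centre-based logit: positive inside the normality ball,
    negative outside it."""
    return radius**2 - np.sum((z - centre)**2, axis=-1)

def normality_score(z, centre, radius):
    """Sigmoid of the radial logit: near 1 for points close to the
    centre, near 0 far away, directly interpretable as normality."""
    return 1.0 / (1.0 + np.exp(-radial_logit(z, centre, radius)))

centre = np.zeros(2)
near = np.array([0.1, 0.1])   # well inside radius 1
far = np.array([3.0, 3.0])    # far outside radius 1
```

Because the geometry lives inside the logit, the same quantity is optimised by the discriminative loss at training time and read out as the anomaly score at inference.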
Lift What You Can: Green Online Learning with Heterogeneous Ensembles
Köbschall, Kirsten, Buschjäger, Sebastian, Fischer, Raphael, Hartung, Lisa, Kramer, Stefan
Ensemble methods for stream mining must manage multiple models and update them as data distributions evolve. Despite growing calls for sustainability, established methods pay little attention to ensemble members' computational expense and instead focus overly on predictive capability. To address these challenges and enable green online learning, we propose heterogeneous online ensembles (HEROS). At every training step, HEROS chooses, under resource constraints, a subset of models to train from a pool initialized with diverse hyperparameter choices. We introduce a Markov decision process to theoretically capture the trade-offs between predictive performance and sustainability constraints. Based on this framework, we present different policies for choosing which models to train on incoming data. Most notably, we propose the novel $ζ$-policy, which focuses on training near-optimal models at reduced cost. Using a stochastic model, we theoretically prove that the $ζ$-policy achieves near-optimal performance while using fewer resources than the best-performing policy. In experiments across 11 benchmark datasets, we find empirical evidence that the $ζ$-policy is a strong contribution to the state of the art: it is highly accurate, in some cases even outperforming competitors, while being much more resource-friendly.
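A toy version of the subset-selection step behind the ζ-policy: train only models whose estimated loss is within ζ of the current best, cheapest first, under a per-step cost budget. The loss estimates, cost model, and greedy tie-breaking here are assumptions for illustration; the paper defines the policy on a Markov decision process:

```python
def zeta_policy(losses, costs, zeta, budget):
    """Select models that are near-optimal (loss <= best + zeta),
    preferring cheap ones, until the training budget is exhausted."""
    best = min(losses)
    near_opt = [i for i, l in enumerate(losses) if l <= best + zeta]
    near_opt.sort(key=lambda i: costs[i])  # cheapest near-optimal first
    chosen, spent = [], 0.0
    for i in near_opt:
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return chosen

# Pool of 4 models: estimated losses and per-step training costs.
losses = [0.20, 0.21, 0.35, 0.19]
costs  = [5.0,  1.0,  0.5,  8.0]
```

With ζ = 0.03 and budget 6.0, models 1 and 0 are selected: both are within 0.03 of the best loss (0.19), and together they fit the budget, while the best model (index 3) is too expensive.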
Reasoning's Razor: Reasoning Improves Accuracy but Can Hurt Recall at Critical Operating Points in Safety and Hallucination Detection
Chegini, Atoosa, Kazemi, Hamid, Souza, Garrett, Safi, Maria, Song, Yang, Bengio, Samy, Williamson, Sinead, Farajtabar, Mehrdad
In precision-sensitive classification tasks, false positives carry severe operational consequences. For example, when a text safety classifier incorrectly flags 10% of benign user queries as unsafe, it blocks legitimate queries from being processed, degrading the experience for millions of users and potentially driving them away from the service. Similarly, in hallucination detection within Retrieval-Augmented Generation (RAG) pipelines, when factually correct responses are incorrectly flagged as hallucinated, the system triggers regeneration or self-correction mechanisms, adding unnecessary computational overhead and latency that frustrates users waiting for responses. These deployment realities demand classifiers that operate at extremely low false positive rates--often below 1%--while maintaining acceptable recall. Large language models are increasingly deployed for such precision-critical classification tasks through specialized safety guardrails like Llama Guard (Inan et al., 2023) and ShieldGemma (Zeng et al., 2024), as well as hallucination detection systems (Huang et al., 2025). Recently, reasoning-augmented approaches have emerged as a promising direction: GuardReasoner (Liu et al., 2025) incorporates chain-of-thought reasoning for safety classification, while Lynx (Ravi et al., 2024) leverages reasoning for hallucination detection in RAG
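The low-FPR operating point described above can be made concrete: choose the decision threshold from the score distribution of negative (benign) examples so that at most 1% are flagged, then report recall at that threshold. A generic sketch on synthetic scores, not tied to any particular guardrail model:

```python
import numpy as np

def threshold_at_fpr(neg_scores, max_fpr):
    """Smallest threshold whose false positive rate on the negative
    class does not exceed max_fpr (higher score = more 'unsafe')."""
    return float(np.quantile(neg_scores, 1.0 - max_fpr))

def recall_at(pos_scores, threshold):
    """Fraction of true positives still caught at this threshold."""
    return float(np.mean(pos_scores >= threshold))

# Synthetic, well-separated score distributions.
rng = np.random.default_rng(0)
neg = rng.normal(0.0, 1.0, 10_000)   # benign-query scores
pos = rng.normal(3.0, 1.0, 10_000)   # unsafe-query scores
t = threshold_at_fpr(neg, max_fpr=0.01)
```

The trade-off the passage describes is exactly this pair of numbers: how high the threshold must sit to keep FPR under 1%, and how much recall survives there.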
Can Molecular Foundation Models Know What They Don't Know? A Simple Remedy with Preference Optimization
He, Langzhou, Zhu, Junyou, Wang, Fangxin, Liu, Junhua, Xu, Haoyan, Zhao, Yue, Yu, Philip S., Wu, Qitian
Molecular foundation models are rapidly advancing scientific discovery, but their unreliability on out-of-distribution (OOD) samples severely limits their application in high-stakes domains such as drug discovery and protein design. A critical failure mode is chemical hallucination, where models make high-confidence yet entirely incorrect predictions for unknown molecules. To address this challenge, we introduce Molecular Preference-Aligned Instance Ranking (Mole-PAIR), a simple, plug-and-play module that can be flexibly integrated with existing foundation models to improve their reliability on OOD data through cost-effective post-training. Specifically, our method formulates the OOD detection problem as a preference optimization over the estimated OOD affinity between in-distribution (ID) and OOD samples, achieving this goal through a pairwise learning objective. We show that this objective essentially optimizes AUROC, which measures how consistently ID and OOD samples are ranked by the model. Extensive experiments across five real-world molecular datasets demonstrate that our approach significantly improves the OOD detection capabilities of existing molecular foundation models, achieving up to 45.8%, 43.9%, and 24.3% improvements in AUROC under distribution shifts of size, scaffold, and assay, respectively.
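The link between the pairwise objective and AUROC rests on the identity that empirical AUROC equals the fraction of (OOD, ID) pairs in which the OOD sample receives the higher OOD-affinity score; a smooth pairwise loss is then a differentiable surrogate for that 0/1 comparison. A minimal sketch; Mole-PAIR's actual loss and affinity estimates are in the paper:

```python
import numpy as np

def auroc(id_scores, ood_scores):
    """Empirical AUROC: fraction of (OOD, ID) pairs where the OOD
    sample gets the higher score (ties count half)."""
    diff = ood_scores[:, None] - id_scores[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def pairwise_logistic_loss(id_scores, ood_scores):
    """Smooth surrogate: logistic loss on every (OOD, ID) score gap.
    Driving this down pushes the empirical AUROC toward 1."""
    diff = ood_scores[:, None] - id_scores[None, :]
    return float(np.mean(np.log1p(np.exp(-diff))))

id_s  = np.array([0.1, 0.2, 0.3])   # in-distribution affinities
ood_s = np.array([0.8, 0.9])        # out-of-distribution affinities
```

When every OOD sample outranks every ID sample, the empirical AUROC is exactly 1.0, which is the ranking consistency the abstract refers to.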
Robult: Leveraging Redundancy and Modality Specific Features for Robust Multimodal Learning
Nguyen, Duy A., Kamboj, Abhi, Do, Minh N.
Addressing missing modalities and limited labeled data is crucial for advancing robust multimodal learning. We propose Robult, a scalable framework designed to mitigate these challenges by preserving modality-specific information and leveraging redundancy through a novel information-theoretic approach. Robult optimizes two core objectives: (1) a soft Positive-Unlabeled (PU) contrastive loss that maximizes task-relevant feature alignment while effectively utilizing limited labeled data in semi-supervised settings, and (2) a latent reconstruction loss that ensures unique modality-specific information is retained. These strategies, embedded within a modular design, enhance performance across various downstream tasks and ensure resilience to incomplete modalities during inference. Experimental results across diverse datasets validate that Robult achieves superior performance over existing approaches in both semi-supervised learning and missing modality contexts. Furthermore, its lightweight design promotes scalability and seamless integration with existing architectures, making it suitable for real-world multimodal applications.
Generalizable Diabetes Risk Stratification via Hybrid Machine Learning Models
Parvez, Athar, Mufti, Muhammad Jawad
Background/Purpose: Diabetes affects over 537 million people worldwide and is projected to reach 783 million by 2045. Early risk stratification can benefit from machine learning. We compare two hybrid classifiers and assess their generalizability on an external cohort. Methods: Two hybrids were built: (i) XGBoost + Random Forest (XGB-RF) and (ii) Support Vector Machine + Logistic Regression (SVM-LR). A leakage-safe, standardized pipeline (encoding, imputation, min-max scaling; SMOTE on training folds only; probability calibration for SVM) was fit on the primary dataset and frozen. Evaluation prioritized threshold-independent discrimination (AUROC/AUPRC) and calibration (Brier, slope/intercept). External validation used the PIMA cohort (N=768) with the frozen pipeline; any thresholded metrics on PIMA were computed at the default rule tau = 0.5. Results: On the primary dataset (PR baseline = 0.50), XGB-RF achieved AUROC ~0.995 and AUPRC ~0.998, outperforming SVM-LR (AUROC ~0.978; AUPRC ~0.947). On PIMA (PR baseline ~0.349), XGB-RF retained strong performance (AUROC ~0.990; AUPRC ~0.959); SVM-LR was lower (AUROC ~0.963; AUPRC ~0.875). Thresholded metrics on PIMA at tau = 0.5 were XGB-RF (Accuracy 0.960; Precision 0.941; Recall 0.944; F1 0.942) and SVM-LR (Accuracy 0.900; Precision 0.855; Recall 0.858; F1 0.857). Conclusions: Across internal and external cohorts, XGB-RF consistently dominated SVM-LR and exhibited smaller external attenuation on ROC/PR with acceptable calibration. These results support gradient-boosting-based hybridization as a robust, transferable approach for diabetes risk stratification and motivate prospective, multi-site validation with deployment-time threshold selection based on clinical trade-offs.
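The thresholded metrics and Brier score quoted above follow directly from predicted probabilities at the default rule tau = 0.5. A minimal sketch of that evaluation step on toy data, not the study's cohorts:

```python
import numpy as np

def thresholded_metrics(proba, y, tau=0.5):
    """Accuracy, precision, recall, F1 at a fixed decision rule tau."""
    pred = (proba >= tau).astype(int)
    tp = np.sum((pred == 1) & (y == 1))
    fp = np.sum((pred == 1) & (y == 0))
    fn = np.sum((pred == 0) & (y == 1))
    acc = float(np.mean(pred == y))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1

def brier(proba, y):
    """Brier score: mean squared error of predicted probabilities."""
    return float(np.mean((proba - y) ** 2))

y = np.array([1, 1, 0, 0])              # toy labels
proba = np.array([0.9, 0.4, 0.2, 0.6])  # toy calibrated probabilities
```

Freezing the pipeline means `tau` and all preprocessing are fixed before the external cohort is scored, so these metrics measure transfer rather than refitting.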
FracAug: Fractional Augmentation Boosts Graph-level Anomaly Detection under Limited Supervision
Dong, Xiangyu, Zhang, Xingyi, Wang, Sibo
Graph-level anomaly detection (GAD) is critical in diverse domains such as drug discovery, yet high labeling costs and dataset imbalance hamper the performance of Graph Neural Networks (GNNs). To address these issues, we propose FracAug, an innovative plug-in augmentation framework that enhances GNNs by generating semantically consistent graph variants and pseudo-labeling with mutual verification. Unlike previous heuristic methods, FracAug learns semantics within given graphs and synthesizes fractional variants, guided by a novel weighted distance-aware margin loss. This captures multi-scale topology to generate diverse, semantic-preserving graphs unaffected by data imbalance. Then, FracAug utilizes predictions from both original and augmented graphs to pseudo-label unlabeled data, iteratively expanding the training set. As a model-agnostic module compatible with various GNNs, FracAug demonstrates remarkable universality and efficacy: experiments across 14 GNNs on 12 real-world datasets show consistent gains, boosting average AUROC, AUPRC, and F1-score by up to 5.72%, 7.23%, and 4.18%, respectively.
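The mutual-verification step can be sketched as agreement-based pseudo-labeling: an unlabeled graph receives a pseudo-label only when the model's predictions on the original and an augmented variant agree with high confidence. The probability threshold and toy inputs below are assumptions for illustration:

```python
def pseudo_label(p_orig, p_aug, thresh=0.9):
    """Return a 1/0 pseudo-label when predictions on the original and
    the augmented graph agree with high confidence, else None."""
    if p_orig >= thresh and p_aug >= thresh:
        return 1                # both confidently anomalous
    if p_orig <= 1 - thresh and p_aug <= 1 - thresh:
        return 0                # both confidently normal
    return None                 # disagreement: leave unlabeled

# Predicted anomaly probabilities: (original graph, augmented variant).
pairs = [(0.95, 0.97), (0.05, 0.02), (0.95, 0.40)]
labels = [pseudo_label(a, b) for a, b in pairs]
```

Only the mutually verified graphs enter the training set on the next iteration; the disagreeing pair stays unlabeled, which is what keeps the expansion conservative.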
Early Prediction of In-Hospital ICU Mortality Using Innovative First-Day Data: A Review
Huang, Baozhu, Chen, Cheng, Hou, Xuanhe, Huang, Junmin, Wei, Zihan, Luo, Hongying, Chen, Lu, Xu, Yongzhi, Luo, Hejiao, Qin, Changqi, Bi, Ziqian, Song, Junhao, Wang, Tianyang, Liang, ChiaXin, Yu, Zizhong, Wang, Han, Sun, Xiaotian, Hao, Junfeng, Tian, Chunjie
The intensive care unit (ICU) manages critically ill patients, many of whom face a high risk of mortality. Early and accurate prediction of in-hospital mortality within the first 24 hours of ICU admission is crucial for timely clinical interventions, resource optimization, and improved patient outcomes. Traditional scoring systems, while useful, often have limitations in predictive accuracy and adaptability. Objective: This review aims to systematically evaluate and benchmark innovative methodologies that leverage data available within the first day of ICU admission for predicting in-hospital mortality. We focus on advancements in machine learning, novel biomarker applications, and the integration of diverse data types.
Automated Triaging and Transfer Learning of Incident Learning Safety Reports Using Large Language Representational Models
Beidler, Peter, Nguyen, Mark, Lybarger, Kevin, Holmberg, Ola, Ford, Eric, Kang, John
PURPOSE: Incident reports are an important tool for safety and quality improvement in healthcare, but manual review is time-consuming and requires subject matter expertise. Here we present a natural language processing (NLP) screening tool to detect high-severity incident reports in radiation oncology across two institutions. METHODS AND MATERIALS: We used two text datasets to train and evaluate our NLP models: 7,094 reports from our institution (Inst.) and 571 from IAEA SAFRON (SF), all of which had severity scores labeled by clinical content experts. We trained and evaluated two types of models: baseline support vector machines (SVM) and BlueBERT, a large language model pretrained on PubMed abstracts and hospitalized-patient data. We assessed the generalizability of our models in two ways. First, we evaluated models trained on Inst.-train against SF-test. Second, we trained a BlueBERT_TRANSFER model that was first fine-tuned on Inst.-train and then on SF-train before testing on the SF-test set. To further analyze model performance, we also examined a subset of 59 reports from our Inst. dataset that were manually edited for clarity. RESULTS: Classification performance on the Inst. test set achieved AUROC 0.82 using SVM and 0.81 using BlueBERT. Without cross-institution transfer learning, performance on the SF test set was limited to AUROC 0.42 using SVM and 0.56 using BlueBERT. BlueBERT_TRANSFER, which was fine-tuned on both datasets, improved performance on the SF test set to AUROC 0.78. Performance of the SVM and BlueBERT_TRANSFER models on the manually curated Inst. reports (AUROC 0.85 and 0.74) was similar to human performance (AUROC 0.81). CONCLUSION: In summary, we successfully developed cross-institution NLP models on incident report text from radiation oncology centers. These models were able to detect high-severity reports similarly to humans on a curated dataset.