symptom
Simulating Viva Voce Examinations to Evaluate Clinical Reasoning in Large Language Models
Clinical reasoning in medicine is a hypothesis-driven process where physicians refine diagnoses from limited information through targeted history, physical examination, and diagnostic investigations. In contrast, current medical benchmarks for large language models (LLMs) primarily assess knowledge recall through single-turn questions, where complete clinical information is provided upfront. To address this gap, we introduce VivaBench, a multi-turn benchmark that evaluates sequential clinical reasoning in LLM agents. Our dataset comprises 1152 physician-curated clinical vignettes structured as interactive scenarios that simulate a viva voce examination in medical training, requiring agents to actively probe for relevant findings, select appropriate investigations, and synthesize information across multiple steps to reach a diagnosis. We evaluated several state-of-the-art LLMs and found that while models demonstrate competence in diagnosing conditions within well-described clinical presentations, their performance degrades significantly when required to navigate diagnostic uncertainty. Our analysis identified several failure modes that mirror common issues in clinical practice, including: (1) fixation on initial hypotheses, (2) excessive investigation ordering, (3) premature diagnostic closure, and (4) missing critical conditions. These patterns reveal fundamental limitations in how current LLMs manage uncertainty and gather information sequentially. Through VivaBench, we provide a standardized benchmark for evaluating conversational medical AI systems for real-world clinical decision support. Beyond medical applications, we contribute to the larger corpus of research on agentic AI by demonstrating how sequential reasoning trajectories can diverge in complex decision-making environments.
Unlearned but Not Forgotten: Data Extraction after Exact Unlearning in LLM
Large Language Models are typically trained on datasets collected from the web, which may inadvertently contain harmful or sensitive personal information. To address growing privacy concerns, unlearning methods have been proposed to remove the influence of specific data from trained models. Of these, exact unlearning, which retrains the model from scratch without the target data, is widely regarded as the gold standard for mitigating privacy risks in deployment. In this paper, we revisit this assumption in a practical deployment setting where both the pre- and post-unlearning logits APIs are exposed, such as in open-weight scenarios. Targeting this setting, we introduce a novel data extraction attack that leverages signals from the pre-unlearning model to guide the post-unlearning model, uncovering patterns that reflect the removed data distribution. Combining model guidance with a token filtering strategy, our attack significantly improves extraction success rates, doubling performance in some cases, across common benchmarks such as MUSE, TOFU, and WMDP. Furthermore, we demonstrate our attack's effectiveness on a simulated medical diagnosis dataset to highlight real-world privacy risks associated with exact unlearning. In light of our findings, which suggest that unlearning may paradoxically increase the risk of privacy leakage during real-world deployments, we advocate that evaluations of unlearning methods consider broader threat models that account not only for post-unlearning models but also for adversarial access to prior checkpoints.
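The abstract does not give the attack's formula, so the following is only a rough sketch of what "model guidance with token filtering" could look like at a single decoding step. The function name `guided_next_token`, the linear guidance rule, the `weight` parameter, and the top-k filter are all assumptions made for illustration, not the authors' method.

```python
def guided_next_token(pre_logits, post_logits, weight=2.0, top_k=2):
    """Pick one next token from two aligned logit vectors (hypothetical sketch).

    pre_logits  -- next-token logits from the pre-unlearning model
    post_logits -- next-token logits from the post-unlearning model
    """
    if len(pre_logits) != len(post_logits):
        raise ValueError("logit vectors must cover the same vocabulary")

    # Assumed token filter: only consider tokens the pre-unlearning model
    # itself ranks highly, so implausible tokens are never selected.
    ranked = sorted(range(len(pre_logits)),
                    key=lambda i: pre_logits[i], reverse=True)
    keep = ranked[:top_k]

    # Assumed guidance rule: start from the post-unlearning logits and move
    # along the direction (pre - post). With weight > 1 this extrapolates
    # past the pre-unlearning model, emphasising what changed after removal.
    scores = {i: post_logits[i] + weight * (pre_logits[i] - post_logits[i])
              for i in keep}
    best = max(scores, key=scores.get)
    return best, scores


# Toy vocabulary of four tokens; the values are made up for illustration.
pre = [2.0, 1.9, 0.1, -1.0]
post = [2.0, 0.5, 0.1, 3.0]
token, scores = guided_next_token(pre, post)
print(token, scores)
```

In this toy case token 1 is chosen: the two models agree on token 0, token 3 is filtered out because the pre-unlearning model ranks it low, and token 1 is where the post-unlearning model's score dropped the most.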
MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis
The application of AI in psychiatric diagnosis faces significant challenges, including the subjective nature of mental health assessments, symptom overlap across disorders, and privacy constraints limiting data availability. To address these issues, we present MoodAngels, the first specialized multi-agent framework for mood disorder diagnosis. Our approach combines granular-scale analysis of clinical assessments with a structured verification process, enabling more accurate interpretation of complex psychiatric data. Complementing this framework, we introduce MoodSyn, an open-source dataset of 1,173 synthetic psychiatric cases that preserves clinical validity while ensuring patient privacy. Experimental results demonstrate that MoodAngels outperforms conventional methods, with our baseline agent achieving 12.3% higher accuracy than GPT-4o on real-world cases, and our full multi-agent system delivering further improvements. Evaluation in the MoodSyn dataset demonstrates exceptional fidelity, accurately reproducing both the core statistical patterns and complex relationships present in the original data while maintaining strong utility for machine learning applications. Together, these contributions provide both an advanced diagnostic tool and a critical research resource for computational psychiatry, bridging important gaps in AI-assisted mental health assessment.
We Now Know How Many People the CDC Is Monitoring for Hantavirus
There are no confirmed cases in the US, but 41 people who were potentially exposed to the Andes virus are in quarantine or being monitored for symptoms. The US Centers for Disease Control and Prevention is monitoring 41 people in the US for the Andes hantavirus after a cruise ship was hit with a rare outbreak, but the risk to the public remains low, according to health officials. This includes a group of 18 passengers from the cruise ship who are now in quarantine facilities in Nebraska and Georgia. The agency is also monitoring passengers who returned home before the outbreak was identified and others who were exposed during travel, specifically on flights where a symptomatic case was present. "Most people under monitoring are considered high-risk exposures, and CDC recommends that everyone under monitoring stay at home and avoid being around people during their 42-day monitoring period," David Fitter, incident manager for the CDC's hantavirus response, told reporters during a media briefing on Thursday.
'I was given a choice - keep my legs or keep my life' - the sepsis patient who lived
Farmer Marshall Wylie thought nothing of it when he cut his arm, sorting wood in August 2023. And he thought even less of it when he felt ill over the next 48 hours. But the following week, he said he clinically died due to sepsis, and eventually his legs had to be amputated. Farmers are at particular risk of developing sepsis due to incidents on the farm, but can also be reluctant to seek healthcare. Warning: This article contains some graphic images of hands and feet with sepsis.
All Your Hantavirus Questions, Answered by an Infectious Disease Expert
Here's what you need to know, from why the cruise ship outbreak won't spark the next pandemic to how hantavirus spreads. Now that more than 100 passengers aboard a hantavirus-stricken luxury cruise ship have been evacuated, with 18 Americans in biocontainment units in Nebraska and Georgia, health officials around the world are working to monitor more than two dozen individuals who left the cruise and anyone with whom they might have come in close contact. So far, all of the 11 reported hantavirus cases are among passengers or crew on the ship, the World Health Organization's director-general Tedros Adhanom Ghebreyesus said at a press conference in Madrid on Tuesday. That includes three deaths resulting from the virus. Typically, hantaviruses are spread when contaminated rodent droppings and urine are stirred up in the air and breathed in.
Can you overdose on cough drops? Short answer: Yes.
It'd take a lot of them, though. Menthol soothes in small doses, but too much can irritate your body--and in rare cases, cause serious symptoms. We all know the feeling--a throbbing in your throat that won't go away.
ECG Question Answering Combined With Electrocardiogram
Question answering (QA) in the field of healthcare has received much attention due to significant advancements in natural language processing. However, existing healthcare QA datasets primarily focus on medical images, clinical notes, or structured electronic health record tables. This leaves the vast potential of combining electrocardiogram (ECG) data with these systems largely untapped. To address this gap, we present ECG-QA, the first QA dataset specifically designed for ECG analysis. The dataset comprises a total of 70 question templates that cover a wide range of clinically relevant ECG topics, each validated by an ECG expert to ensure their clinical utility. As a result, our dataset includes diverse ECG interpretation questions, including those that require a comparative analysis of two different ECGs. In addition, we have conducted numerous experiments to provide valuable insights for future research directions. We believe that ECG-QA will serve as a valuable resource for the development of intelligent QA systems capable of assisting clinicians in ECG interpretations.
MCAnalysis: An Open-Source Package for Preprocessing, Modelling, and Visualisation of Menstrual Cycle Effects in Digital Health Data
Delray, Kyra, Lewis, Glyn, Grace, Bola, Hayes, Joseph, Evans, Robin
Digital Health Technologies (DHTs) including consumer wearable devices and digital health applications offer an opportunity for continuous, large-scale data collection. Wearables give insight into physiological biomarkers that help us understand the human body, through passive data collection. Such data can be collected at a regularity that would be impossible otherwise. Digital health applications provide the chance to collect diverse types of data from clinically validated surveys, GPS, and contextual inputs. This combination has the ability to make profound advances in our understanding of the factors that affect individuals on a personal and population level [Grace et al., 2025]. One of these factors is the menstrual cycle. Particularly because of its inter-individual variability, studying it requires large sample sizes, and to truly grasp its effects on the human body, it needs to be observed on a near-daily scale [Bull et al., 2019].
Enhancing Online Support Group Formation Using Topic Modeling Techniques
Barman, Pronob Kumar, Reynolds, Tera L., Foulds, James
Online health communities (OHCs) are vital for fostering peer support and improving health outcomes. Support groups within these platforms can provide more personalized and cohesive peer support, yet traditional support group formation methods face challenges related to scalability, static categorization, and insufficient personalization. To overcome these limitations, we propose two novel machine learning models for automated support group formation: the Group-specific Dirichlet Multinomial Regression (gDMR) and the Group-specific Structured Topic Model (gSTM). These models integrate user-generated textual content, demographic profiles, and interaction data represented through node embeddings derived from user networks to systematically automate personalized, semantically coherent support group formation. We evaluate the models on a large-scale dataset from MedHelp, comprising over 2 million user posts. Both models substantially outperform baseline methods including LDA, DMR, and STM in predictive accuracy (held-out log-likelihood), semantic coherence (UMass metric), and internal group consistency. The gDMR model yields group covariates that facilitate practical implementation by leveraging relational patterns from network structures and demographic data. In contrast, gSTM emphasizes sparsity constraints to generate more distinct and thematically specific groups. Qualitative analysis further validates the alignment between model-generated groups and manually coded themes, showing the practical relevance of the models in informing groups that address diverse health concerns such as chronic illness management, diagnostic uncertainty, and mental health. By reducing reliance on manual curation, these frameworks provide scalable solutions that enhance peer interactions within OHCs, with implications for patient engagement, community resilience, and health outcomes.
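For readers unfamiliar with the UMass metric mentioned above, here is a minimal sketch of the standard document co-occurrence formulation, which sums log((D(w_m, w_l) + 1) / D(w_l)) over ordered pairs of a topic's top words. The function name `umass_coherence` and the toy documents are illustrative assumptions; the abstract does not say which implementation or smoothing the authors used.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic from document co-occurrence counts.

    top_words -- the topic's top words, ordered from most to least probable
    documents -- iterable of tokenised documents (lists of words)
    Scores closer to zero indicate words that co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                raise ValueError(f"word {top_words[l]!r} occurs in no document")
            co = doc_freq(top_words[m], top_words[l])
            # The +1 smoothing keeps the log finite when a pair never co-occurs.
            score += math.log((co + 1) / denom)
    return score


# Made-up posts, for illustration only.
posts = [
    ["thyroid", "fatigue"],
    ["thyroid"],
    ["thyroid", "fatigue", "anxiety"],
]
print(umass_coherence(["thyroid", "fatigue"], posts))
print(umass_coherence(["thyroid", "anxiety"], posts))
```

The first pair co-occurs in two of the three posts that contain "thyroid" and scores higher than the second pair, which co-occurs in only one.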