Adams County
- North America > United States > Colorado > Adams County > Commerce City (0.15)
- North America > United States > Virginia (0.04)
- North America > United States > Utah (0.04)
- (4 more...)
- Media > News (1.00)
- Government > Regional Government > North America Government > United States Government (0.72)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.69)
- North America > United States > Colorado > Adams County > Aurora (0.15)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > Canada (0.05)
- North America > United States > Iowa (0.05)
- Media (1.00)
- Leisure & Entertainment > Sports (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- (3 more...)
- North America > United States > Texas > Taylor County > Abilene (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > Colorado > Adams County > Westminster (0.05)
- (2 more...)
- Leisure & Entertainment > Sports (1.00)
- Law (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- (3 more...)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.54)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
Large language models management of medications: three performance analyses
Henry, Kelli, Xu, Steven, Blotske, Kaitlin, Cargile, Moriah, Barreto, Erin F., Murray, Brian, Smith, Susan, Bauer, Seth R., Zhao, Xingmeng, Tilley, Adeleine, Gao, Yanjun, Liu, Tianming, Sohn, Sunghwan, Sikora, Andrea
Purpose: Large language models (LLMs) have proven performance for certain diagnostic tasks; however, few studies have evaluated their consistency in recommending appropriate medication regimens for a given diagnosis. Medication management is a complex task that requires synthesis of drug formulation and complete order instructions for safe use. Here, the performance of GPT-4o, an LLM available with ChatGPT, was tested on three medication management tasks. Methods: GPT-4o performance was tested on three medication tasks: identifying available formulations for a given generic drug name, identifying drug-drug interactions (DDIs) for a given medication regimen, and preparing a medication order for a given generic drug name. For each experiment, the model's raw text response was captured exactly as returned and evaluated by clinician review in addition to standard LLM metrics, including Term Frequency-Inverse Document Frequency (TF-IDF) vectors, normalized Levenshtein similarity, and Recall-Oriented Understudy for Gisting Evaluation (ROUGE-1/ROUGE-L F1) between each response and its reference string. Results: For the first task, drug-formulation matching, GPT-4o matched generic medications to all available formulations with 49% accuracy, with an average of 1.23 omissions and 1.14 hallucinations per medication. For the second task, drug-drug interaction identification, accuracy was 54.7% for identifying the DDI pair. For the third task, GPT-4o generated order sentences containing no medication or abbreviation errors in 65.8% of cases. Conclusions: Model performance on basic medication tasks was consistently poor. This evaluation highlights the need for domain-specific training through clinician-annotated datasets and a comprehensive evaluation framework for benchmarking performance.
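The scoring pipeline described in the Methods lends itself to a compact implementation. Below is a minimal sketch, assuming scikit-learn, python-Levenshtein, and rouge-score as the metric libraries (the paper does not name its implementations), that computes the TF-IDF cosine, normalized Levenshtein, and ROUGE-1/ROUGE-L F1 comparisons between a model response and its reference string.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import Levenshtein
from rouge_score import rouge_scorer

def score_response(response: str, reference: str) -> dict:
    # TF-IDF cosine similarity between the two strings.
    tfidf = TfidfVectorizer().fit_transform([response, reference])
    tfidf_sim = cosine_similarity(tfidf[0], tfidf[1])[0, 0]

    # Levenshtein distance normalized to [0, 1] by the longer string's length.
    dist = Levenshtein.distance(response, reference)
    lev_sim = 1 - dist / max(len(response), len(reference), 1)

    # ROUGE-1 and ROUGE-L F1 of the response against the reference.
    scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    rouge = scorer.score(reference, response)

    return {
        "tfidf_cosine": float(tfidf_sim),
        "levenshtein_sim": lev_sim,
        "rouge1_f1": rouge["rouge1"].fmeasure,
        "rougeL_f1": rouge["rougeL"].fmeasure,
    }
```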
- North America > United States > Colorado > Adams County > Aurora (0.05)
- North America > United States > Georgia > Clarke County > Athens (0.05)
- North America > United States > Minnesota > Olmsted County > Rochester (0.04)
- (4 more...)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Nephrology (0.68)
Brittleness and Promise: Knowledge Graph Based Reward Modeling for Diagnostic Reasoning
Khatwani, Saksham, Cheng, He, Afshar, Majid, Dligach, Dmitriy, Gao, Yanjun
Large language models (LLMs) show promise for diagnostic reasoning but often lack reliable, knowledge-grounded inference. Knowledge graphs (KGs), such as the Unified Medical Language System (UMLS), offer structured biomedical knowledge that can support trustworthy reasoning. Prior approaches typically integrate KGs via retrieval-augmented generation or fine-tuning, inserting KG content into prompts rather than enabling structured reasoning. We explore an alternative paradigm: treating the LLM as a reward model of KG reasoning paths, where the model learns to judge whether a candidate path leads to the correct diagnosis for a given patient input. This approach is inspired by recent work that leverages reward training to enhance model reasoning abilities, and grounded in computational theory, which suggests that verifying a solution is often easier than generating one from scratch. It also parallels physicians' diagnostic assessment, in which they judge which sequences of findings and intermediate conditions most plausibly support a diagnosis. We first systematically evaluate five task formulations for knowledge path judging and eight training paradigms. Second, we test whether path-judging abilities generalize to downstream diagnostic tasks, including diagnosis summarization and medical question answering. Experiments with three open-source instruction-tuned LLMs reveal both promise and brittleness: while specific reward optimization and distillation lead to strong path-judging performance, transferability to downstream tasks remains weak. Our findings provide the first systematic assessment of "reward model style" reasoning over clinical KGs, offering insights into how structured, reward-based supervision influences diagnostic reasoning in GenAI systems for healthcare.
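To make the reward-model formulation concrete, here is a minimal sketch of one way a path-judging loop could look. The prompt wording and the `llm` callable are hypothetical; the paper evaluates five task formulations and eight training paradigms, none of which is reproduced here.

```python
from typing import Callable, List

def judge_paths(patient_input: str,
                candidate_paths: List[List[str]],
                llm: Callable[[str], str]) -> List[float]:
    """Return a reward in [0, 1] for each candidate KG reasoning path."""
    rewards = []
    for path in candidate_paths:
        # Render the path as "finding -> intermediate condition -> diagnosis".
        path_text = " -> ".join(path)
        prompt = (
            f"Patient: {patient_input}\n"
            f"Candidate reasoning path: {path_text}\n"
            "Does this path lead to the correct diagnosis? Answer yes or no."
        )
        answer = llm(prompt).strip().lower()
        rewards.append(1.0 if answer.startswith("yes") else 0.0)
    return rewards

# Usage: keep the highest-reward path as the diagnostic explanation, e.g.
# best = candidate_paths[max(range(len(rewards)), key=rewards.__getitem__)]
```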
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- North America > United States > Colorado > Boulder County > Boulder (0.14)
- North America > United States > Colorado > Adams County > Aurora (0.04)
- (3 more...)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.88)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Therapeutic Area (0.68)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.66)
- Health & Medicine > Health Care Technology > Medical Record (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Standards in the Preparation of Biomedical Research Metadata: A Bridge2AI Perspective
Caufield, Harry, Ghosh, Satrajit, Kong, Sek Wong, Parker, Jillian, Sheffield, Nathan, Patel, Bhavesh, Williams, Andrew, Clark, Timothy, Munoz-Torres, Monica C.
AI-readiness describes the degree to which data may be optimally and ethically used for subsequent AI and Machine Learning (AI/ML) methods, where those methods may involve some combination of model training, data classification, and ethical, explainable prediction. The Bridge2AI consortium has defined the criteria a biomedical dataset must meet to be considered AI-ready: in brief, a dataset's readiness is determined by its FAIRness, provenance, degree of characterization, explainability, sustainability, and computability, along with accompanying documentation of ethical data practices. To ensure AI-readiness and to clarify data structure and relationships within Bridge2AI's Grand Challenges (GCs), particular types of metadata are necessary. The GCs within the Bridge2AI initiative include four data-generating projects focused on generating AI/ML-ready datasets to tackle complex biomedical and behavioral research problems. These projects develop standardized, multimodal data, tools, and training resources to support AI integration, while addressing ethical data practices. Examples include using voice as a biomarker, building interpretable genomic tools, modeling disease trajectories with diverse multimodal data, and mapping cellular and molecular health indicators across the human body. This report assesses the state of metadata creation and standardization in the Bridge2AI GCs, provides guidelines where required, and identifies gaps and areas for improvement across the program. New projects, including those outside the Bridge2AI consortium, would benefit from what we have learned about creating metadata as part of efforts to promote AI-readiness.
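As a loose illustration of the kind of dataset-level metadata such criteria imply, here is a hypothetical record organized around the AI-readiness dimensions the abstract lists. Every field name is illustrative; this is not a Bridge2AI schema.

```python
# Hypothetical dataset-level metadata record keyed to the AI-readiness
# criteria named above: FAIRness, provenance, characterization,
# computability, and ethical documentation. Field names are illustrative.
dataset_metadata = {
    "identifier": "doi:10.0000/example",  # FAIR: findable via a persistent ID
    "access": {"license": "CC-BY-4.0", "url": "https://example.org/data"},
    "provenance": {
        "generated_by": "Example Grand Challenge",
        "collection_protocol": "protocol-v1.2",
        "processing_steps": ["de-identification", "normalization"],
    },
    "characterization": {
        "modalities": ["voice", "genomics"],
        "n_participants": 1000,
        "missingness_report": "qc/missingness.csv",
    },
    "computability": {"formats": ["parquet", "json"], "schema": "schema.json"},
    "ethics": {"consent_model": "broad", "irb_approved": True},
}
```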
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.05)
- (8 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.68)
Identifying Neural Signatures from fMRI using Hybrid Principal Components Regression
Rieck, Jared, Wrobel, Julia, Gowin, Joshua L., Wang, Yue, Paulus, Martin, Peterson, Ryan
Recent advances in neuroimaging analysis have enabled accurate decoding of mental state from brain activation patterns during functional magnetic resonance imaging scans. A commonly applied tool for this purpose is principal components regression regularized with the least absolute shrinkage and selection operator (LASSO PCR), a type of multi-voxel pattern analysis (MVPA). This model presumes that all components are equally likely to harbor relevant information, when in fact the task-related signal may be concentrated in specific components. In such cases, the model will fail to select the optimal set of principal components that maximizes the total signal relevant to the cognitive process under study. Here, we present modifications to LASSO PCR that allow for a regularization penalty tied directly to the index of the principal component, reflecting a prior belief that task-relevant signal is more likely to be concentrated in components explaining greater variance. Additionally, we propose a novel hybrid method, Joint Sparsity-Ranked LASSO (JSRL), which integrates component-level and voxel-level activity under an information parity framework and imposes ranked sparsity to guide component selection. We apply the models to brain activation during risk taking, monetary incentive, and emotion regulation tasks. Results demonstrate that incorporating sparsity ranking into LASSO PCR produces models with enhanced classification performance, with JSRL achieving up to 51.7% improvement in cross-validated deviance R² and 7.3% improvement in cross-validated AUC. Furthermore, sparsity-ranked models perform as well as or better than standard LASSO PCR approaches across all classification tasks and allocate predictive weight to brain regions consistent with their established functional roles, offering a robust alternative for MVPA.
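The core trick behind an index-ranked penalty can be sketched briefly: per-component penalty weights can be pushed through a plain L1 solver by rescaling each component's scores. The weight schedule below (w_j = sqrt(j)) is an illustrative assumption, not the paper's exact penalty, and the joint voxel-plus-component JSRL machinery is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def ranked_lasso_pcr(X, y, n_components=50, C=1.0):
    """LASSO PCR with a penalty that grows with the component index."""
    # Project voxel data onto principal components (highest variance first).
    pca = PCA(n_components=n_components).fit(X)
    Z = pca.transform(X)

    # Ranked penalty weights: later (lower-variance) components cost more.
    w = np.sqrt(np.arange(1, n_components + 1))

    # L1 on (Z / w) with a uniform penalty is equivalent to L1 on Z with
    # per-component weight w_j on coefficient j.
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    clf.fit(Z / w, y)

    # Recover coefficients on the original component scale.
    beta = clf.coef_.ravel() / w
    return pca, beta, clf
```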
- North America > United States > Iowa > Johnson County > Iowa City (0.14)
- North America > United States > Oklahoma > Tulsa County > Tulsa (0.04)
- North America > United States > Colorado > Adams County > Aurora (0.04)
- (4 more...)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Enabling Down Syndrome Research through a Knowledge Graph-Driven Analytical Framework
Krishnamurthy, Madan, Saha, Surya, Lo, Pierrette, Whetzel, Patricia L., Issabekova, Tursynay, Vargas, Jamed Ferreris, DiGiovanna, Jack, Haendel, Melissa A
Trisomy 21 results in Down syndrome, a multifaceted genetic disorder with diverse clinical phenotypes, including heart defects, immune dysfunction, neurodevelopmental differences, and early-onset dementia risk. Heterogeneity and fragmented data across studies challenge comprehensive research and translational discovery. The NIH INCLUDE (INvestigation of Co-occurring conditions across the Lifespan to Understand Down syndromE) initiative has assembled harmonized participant-level datasets, yet realizing their potential requires integrative analytical frameworks. We developed a knowledge graph-driven platform transforming nine INCLUDE studies, comprising 7,148 participants, 456 conditions, 501 phenotypes, and over 37,000 biospecimens, into a unified semantic infrastructure. Cross-resource enrichment with Monarch Initiative data expands coverage to 4,281 genes and 7,077 variants. The resulting knowledge graph contains over 1.6 million semantic associations, enabling AI-ready analysis with graph embeddings and path-based reasoning for hypothesis generation. Researchers can query the graph via SPARQL or natural language interfaces. This framework converts static data repositories into dynamic discovery environments, supporting cross-study pattern recognition, predictive modeling, and systematic exploration of genotype-phenotype relationships in Down syndrome.
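Since the abstract notes that the graph is queryable via SPARQL, here is a minimal sketch of such a query issued through rdflib. The file name, namespace, and predicate names are hypothetical; the platform's actual schema is not described in the abstract.

```python
from rdflib import Graph

# Load a (hypothetical) Turtle export of the knowledge graph.
g = Graph()
g.parse("include_kg.ttl", format="turtle")

# Count conditions co-occurring with a given phenotype across participants.
query = """
PREFIX ex: <https://example.org/include/>
SELECT ?condition (COUNT(?participant) AS ?n)
WHERE {
  ?participant ex:hasPhenotype ex:EarlyOnsetDementia .
  ?participant ex:hasCondition ?condition .
}
GROUP BY ?condition
ORDER BY DESC(?n)
"""
for row in g.query(query):
    print(row.condition, row.n)
```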
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Genetic Disease (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Tile and Slide: A New Framework for Scaling NeRF from Local to Global 3D Earth Observation
Billouard, Camille, Derksen, Dawa, Constantin, Alexandre, Vallet, Bruno
Neural Radiance Fields (NeRF) have recently emerged as a paradigm for 3D reconstruction from multiview satellite imagery. However, state-of-the-art NeRF methods are typically constrained to small scenes due to the memory footprint during training, which we study in this paper. Previous work on large-scale NeRFs palliates this by dividing the scene into multiple NeRFs. This paper introduces Snake-NeRF, a framework that scales to large scenes. Our out-of-core method eliminates the need to load all images and networks simultaneously, and operates on a single device. We achieve this by dividing the region of interest into NeRFs that tile in 3D without overlap. Importantly, we crop the images with overlap to ensure each NeRF is trained with all the necessary pixels. We introduce a novel 2×2 3D tile progression strategy and segmented sampler, which together prevent 3D reconstruction errors along the tile edges. Our experiments conclude that large satellite images can effectively be processed with linear time complexity, on a single GPU, and without compromise in quality.
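The tile-without-overlap, crop-with-overlap idea can be sketched in a few lines. The functions below are illustrative only: margin handling and the 2×2 tile progression are simplified, and none of this is the authors' code.

```python
import numpy as np

def make_tiles(roi_min, roi_max, tiles_per_axis):
    """Split an axis-aligned 3D region of interest into non-overlapping tiles."""
    roi_min = np.asarray(roi_min, float)
    roi_max = np.asarray(roi_max, float)
    edges = [np.linspace(roi_min[d], roi_max[d], tiles_per_axis + 1)
             for d in range(3)]
    tiles = []
    for i in range(tiles_per_axis):
        for j in range(tiles_per_axis):
            for k in range(tiles_per_axis):
                lo = np.array([edges[0][i], edges[1][j], edges[2][k]])
                hi = np.array([edges[0][i + 1], edges[1][j + 1], edges[2][k + 1]])
                tiles.append((lo, hi))  # one NeRF per tile, no 3D overlap
    return tiles

def crop_with_margin(image_bounds, tile_footprint, margin_px):
    """Expand a tile's 2D image footprint so crops overlap between tiles,
    ensuring each tile's NeRF sees every pixel whose ray touches it."""
    (u0, v0), (u1, v1) = tile_footprint
    (iu0, iv0), (iu1, iv1) = image_bounds
    return ((max(iu0, u0 - margin_px), max(iv0, v0 - margin_px)),
            (min(iu1, u1 + margin_px), min(iv1, v1 + margin_px)))
```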
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States > Colorado > Adams County (0.04)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Automating Expert-Level Medical Reasoning Evaluation of Large Language Models
Zhou, Shuang, Xie, Wenya, Li, Jiaxi, Zhan, Zaifu, Song, Meijia, Yang, Han, Espinoza, Cheyenna, Welton, Lindsay, Mai, Xinnie, Jin, Yanwei, Xu, Zidu, Chung, Yuen-Hei, Xing, Yiyun, Tsai, Meng-Han, Schaffer, Emma, Shi, Yucheng, Liu, Ninghao, Liu, Zirui, Zhang, Rui
As large language models (LLMs) become increasingly integrated into clinical decision-making, ensuring transparent and trustworthy reasoning is essential. However, existing evaluation strategies for LLMs' medical reasoning capability either suffer from unsatisfactory assessment or poor scalability, and a rigorous benchmark remains lacking. To address this, we introduce MedThink-Bench, a benchmark designed for rigorous, explainable, and scalable assessment of LLMs' medical reasoning. MedThink-Bench comprises 500 challenging questions across ten medical domains, each annotated with expert-crafted step-by-step rationales. Building on this, we propose LLM-w-Ref, a novel evaluation framework that leverages fine-grained rationales and LLM-as-a-Judge mechanisms to assess intermediate reasoning with expert-level fidelity while maintaining scalability. Experiments show that LLM-w-Ref exhibits a strong positive correlation with expert judgments. Benchmarking twelve state-of-the-art LLMs, we find that smaller models (e.g., MedGemma-27B) can surpass larger proprietary counterparts (e.g., OpenAI-o3). Overall, MedThink-Bench offers a foundational tool for evaluating LLMs' medical reasoning, advancing their safe and responsible deployment in clinical practice.
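A reference-guided judging loop in the spirit of LLM-w-Ref might look like the sketch below. The prompt and the `judge` callable are hypothetical, and MedThink-Bench's actual protocol is more fine-grained than this per-step yes/no check.

```python
from typing import Callable, List

def score_reasoning(question: str,
                    model_steps: List[str],
                    expert_rationale: List[str],
                    judge: Callable[[str], str]) -> float:
    """Fraction of expert rationale steps supported by the model's reasoning."""
    supported = 0
    for ref_step in expert_rationale:
        prompt = (
            f"Question: {question}\n"
            "Model reasoning:\n" + "\n".join(model_steps) + "\n"
            f"Reference step: {ref_step}\n"
            "Is this reference step correctly covered by the model's "
            "reasoning? Answer yes or no."
        )
        if judge(prompt).strip().lower().startswith("yes"):
            supported += 1
    return supported / max(len(expert_rationale), 1)
```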
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.49)
- North America > United States > California > San Francisco County > San Francisco (0.28)
- North America > United States > New York > New York County > New York City (0.14)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)