opacity
- North America > Canada (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
- (2 more...)
Comparative Evaluation of Generative AI Models for Chest Radiograph Report Generation in the Emergency Department
Lim, Woo Hyeon, Lee, Ji Young, Lee, Jong Hyuk, Kim, Saehoon, Kim, Hyungjin
Purpose: To benchmark open-source and commercial medical image-specific vision-language models (VLMs) against real-world radiologist-written reports. Methods: This retrospective study included adult patients who presented to the emergency department between January 2022 and April 2025 and underwent same-day CXR and CT for febrile or respiratory symptoms. Reports from five VLMs (AIRead, Lingshu, MAIRA-2, MedGemma, and MedVersa) and radiologist-written reports were randomly presented and blindly evaluated by three thoracic radiologists using four criteria: RADPEER, clinical acceptability, hallucination, and language clarity. Comparative performance was assessed using generalized linear mixed models, with radiologist-written reports treated as the reference. Finding-level analyses were also performed with CT as the reference. Results: A total of 478 patients (median age, 67 years [interquartile range, 50-78]; 282 men [59.0%]) were included. AIRead demonstrated the lowest RADPEER 3b rate (5.3% [76/1434] vs. radiologists 13.9% [200/1434]; P<.001), whereas other VLMs showed higher disagreement rates (16.8-43.0%; P<.05). Clinical acceptability was highest with AIRead (84.5% [1212/1434] vs. radiologists 74.3% [1065/1434]; P<.001), while other VLMs performed worse (41.1-71.4%; P<.05). Hallucinations were rare with AIRead, comparable to radiologists (0.3% [4/1425] vs. 0.1% [1/1425]; P=.21), but frequent with other models (5.4-17.4%; P<.05). Language clarity was higher with AIRead (82.9% [1189/1434]), Lingshu (88.0% [1262/1434]), and MedVersa (88.4% [1268/1434]) than with radiologists (78.1% [1120/1434]; P<.05). Sensitivity varied substantially across VLMs for common findings: AIRead, 15.5-86.7%; Lingshu, 2.4-86.7%; MAIRA-2, 6.0-72.0%; MedGemma, 4.8-76.7%; and MedVersa, 20.2-69.3%. Conclusion: Medical VLMs for CXR report generation exhibited variable performance in report quality and diagnostic measures.
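The finding-level analysis above uses CT as the reference standard; the sketch below shows how per-finding sensitivity would be computed under that setup. The data layout, model entries, and values are hypothetical placeholders, not the study's data or pipeline.

```python
# Minimal sketch of a finding-level sensitivity analysis with CT as the
# reference standard. All data below is invented for illustration.
from typing import Dict, List, Tuple

def sensitivity(pairs: List[Tuple[bool, bool]]) -> float:
    """pairs: (ct_positive, report_positive) per patient for one finding."""
    tp = sum(1 for ct, rep in pairs if ct and rep)
    fn = sum(1 for ct, rep in pairs if ct and not rep)
    return tp / (tp + fn) if (tp + fn) else float("nan")

# Example: per-model detection of a single finding against CT.
ct_vs_model: Dict[str, List[Tuple[bool, bool]]] = {
    "AIRead":   [(True, True), (True, False), (False, False), (True, True)],
    "MedVersa": [(True, False), (True, True), (False, True), (True, False)],
}
for model, pairs in ct_vs_model.items():
    print(f"{model}: sensitivity = {sensitivity(pairs):.2f}")
```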
- Asia > South Korea > Seoul > Seoul (0.05)
- North America > United States (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Asia > China (0.04)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.98)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)
Evaluating Reasoning Faithfulness in Medical Vision-Language Models using Multimodal Perturbations
Moll, Johannes, Graf, Markus, Lemke, Tristan, Lenhart, Nicolas, Truhn, Daniel, Delbrouck, Jean-Benoit, Pan, Jiazhen, Rueckert, Daniel, Adams, Lisa C., Bressem, Keno K.
Vision-language models (VLMs) often produce chain-of-thought (CoT) explanations that sound plausible yet fail to reflect the underlying decision process, undermining trust in high-stakes clinical use. Existing evaluations rarely catch this misalignment, prioritizing answer accuracy or adherence to output formats. We present a clinically grounded framework for chest X-ray visual question answering (VQA) that probes CoT faithfulness via controlled text and image modifications along three axes: clinical fidelity, causal attribution, and confidence calibration. In a reader study (n=4), evaluator-radiologist correlations fall within the observed inter-radiologist range for all axes, with strong alignment for attribution (Kendall's $\tau_b=0.670$), moderate alignment for fidelity ($\tau_b=0.387$), and weak alignment for confidence tone ($\tau_b=0.091$), which we report with caution. Benchmarking six VLMs shows that answer accuracy and explanation quality can be decoupled, that acknowledging injected cues does not ensure grounding, and that text cues shift explanations more than visual cues. While some open-source models match proprietary models on final answer accuracy, proprietary models score higher on attribution (25.0% vs. 1.4%) and often on fidelity (36.1% vs. 31.7%), highlighting deployment risks and the need to evaluate beyond final answer accuracy.
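The agreement figures above are Kendall's $\tau_b$ rank correlations between an automated evaluator and radiologists. A minimal sketch of that computation follows, using invented ratings; scipy's `kendalltau` computes the tau-b variant by default, which handles ties in ordinal scores.

```python
# Sketch of the evaluator-radiologist agreement measure (Kendall's tau-b).
# Scores are invented; the study's actual ratings are not reproduced here.
from scipy.stats import kendalltau

# Hypothetical 1-5 ratings of the same explanations on, e.g., attribution.
evaluator_scores   = [5, 4, 4, 2, 3, 5, 1, 2, 4, 3]
radiologist_scores = [5, 3, 4, 2, 2, 4, 1, 3, 4, 3]

# kendalltau defaults to the tau-b variant, appropriate for tied ranks.
tau_b, p_value = kendalltau(evaluator_scores, radiologist_scores)
print(f"Kendall's tau-b = {tau_b:.3f} (p = {p_value:.3f})")
```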
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (3 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.68)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Modeling Clinical Uncertainty in Radiology Reports: from Explicit Uncertainty Markers to Implicit Reasoning Pathways
Rabaey, Paloma, Moon, Jong Hak, Lee, Jung-Oh, Kim, Min Gwan, Yoon, Hangyul, Demeester, Thomas, Choi, Edward
Radiology reports are invaluable for clinical decision-making and hold great potential for automated analysis when structured into machine-readable formats. These reports often contain uncertainty, which we categorize into two distinct types: (i) Explicit uncertainty reflects doubt about the presence or absence of findings, conveyed through hedging phrases. These vary in meaning depending on the context, making rule-based systems insufficient to quantify the level of uncertainty for specific findings; (ii) Implicit uncertainty arises when radiologists omit parts of their reasoning, recording only key findings or diagnoses. Here, it is often unclear whether omitted findings are truly absent or simply unmentioned for brevity. We address these challenges with a two-part framework. We quantify explicit uncertainty by creating an expert-validated, LLM-based reference ranking of common hedging phrases, and mapping each finding to a probability value based on this reference. In addition, we model implicit uncertainty through an expansion framework that systematically adds characteristic sub-findings derived from expert-defined diagnostic pathways for 14 common diagnoses. Using these methods, we release Lunguage++, an expanded, uncertainty-aware version of the Lunguage benchmark of fine-grained structured radiology reports. This enriched resource enables uncertainty-aware image classification, faithful diagnostic reasoning, and new investigations into the clinical impact of diagnostic uncertainty.
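As a rough illustration of the two-part framework, the hypothetical sketch below maps hedging phrases to probabilities and expands a diagnosis into characteristic sub-findings. The phrase-to-probability values and the pathway are invented placeholders, not the expert-validated resources released with Lunguage++.

```python
# Illustrative sketch only: all values and pathways below are placeholders.
HEDGE_TO_PROB = {          # explicit uncertainty: hedge -> P(finding present)
    "definitely": 0.95,
    "likely": 0.80,
    "possibly": 0.50,
    "unlikely": 0.20,
    "no evidence of": 0.05,
}

SUB_FINDINGS = {           # implicit uncertainty: diagnosis -> sub-findings
    "pneumonia": ["consolidation", "air bronchogram", "pleural effusion"],
}

def structure_finding(finding: str, hedge: str) -> dict:
    """Map one hedged finding to a probability and expand its sub-findings."""
    entry = {"finding": finding, "p_present": HEDGE_TO_PROB.get(hedge, 0.5)}
    # Expansion: sub-findings the radiologist may have omitted for brevity
    # are added explicitly rather than silently treated as absent.
    entry["expanded"] = SUB_FINDINGS.get(finding, [])
    return entry

print(structure_finding("pneumonia", "likely"))
```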
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Research Report (1.00)
- Workflow (0.67)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- North America > Canada (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States > Oklahoma > Beaver County (0.04)
- (3 more...)
LVT: Large-Scale Scene Reconstruction via Local View Transformers
Imtiaz, Tooba, Chai, Lucy, Heal, Kathryn, Luo, Xuan, Park, Jungyeon, Dy, Jennifer, Flynn, John
Large transformer models are proving to be a powerful tool for 3D vision and novel view synthesis. However, the standard Transformer's well-known quadratic complexity makes it difficult to scale these methods to large scenes. To address this challenge, we propose the Local View Transformer (LVT), a large-scale scene reconstruction and novel view synthesis architecture that circumvents the need for the quadratic attention operation. Motivated by the insight that spatially nearby views provide more useful signal about the local scene composition than distant views, our model processes all information in a local neighborhood around each view. To attend to tokens in nearby views, we leverage a novel positional encoding that conditions on the relative geometric transformation between the query and nearby views. We decode the output of our model into a 3D Gaussian Splat scene representation that includes view dependence for both color and opacity. Together, these components enable the Local View Transformer to reconstruct arbitrarily large, high-resolution scenes in a single forward pass. See our project page for results and interactive demos: https://toobaimt.github.io/lvt/.
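A minimal numpy sketch of the local-view idea, under illustrative assumptions: each query view attends only to its k nearest neighbours, and the relative camera transforms that would condition the positional encoding are computed per neighbour. This is not the LVT architecture itself; shapes and the neighbour selection rule are assumptions.

```python
# Sketch: local neighbourhood selection plus relative pose computation,
# the two ingredients the abstract describes. Illustrative, not LVT itself.
import numpy as np

def nearest_views(cam_centers: np.ndarray, q: int, k: int) -> np.ndarray:
    """Indices of the k views whose camera centers are closest to view q."""
    d = np.linalg.norm(cam_centers - cam_centers[q], axis=1)
    d[q] = np.inf                      # exclude the query view itself
    return np.argsort(d)[:k]

def relative_pose(poses: np.ndarray, q: int, n: int) -> np.ndarray:
    """Relative 4x4 transform mapping neighbour n's frame into query q's."""
    return np.linalg.inv(poses[q]) @ poses[n]

# Toy setup: 100 views with 4x4 camera-to-world poses.
rng = np.random.default_rng(0)
poses = np.tile(np.eye(4), (100, 1, 1))
poses[:, :3, 3] = rng.normal(size=(100, 3))   # random camera centers
centers = poses[:, :3, 3]

neigh = nearest_views(centers, q=0, k=8)
rel = np.stack([relative_pose(poses, 0, n) for n in neigh])
# `rel` would condition the positional encoding for attention over tokens
# of the 8 nearby views, keeping cost O(n * k) rather than O(n^2).
print(neigh.shape, rel.shape)        # (8,) (8, 4, 4)
```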
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Hong Kong (0.07)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- (2 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Deep opacity and AI: A threat to XAI and to privacy protection mechanisms
It is known that big data analytics and AI pose a threat to privacy, and that some of this is due to a kind of "black box problem" in AI. I explain how this becomes a problem in the context of justification for judgments and actions. Furthermore, I suggest distinguishing three kinds of opacity: 1) the subjects do not know what the system does ("shallow opacity"), 2) the analysts do not know what the system does ("standard black box opacity"), or 3) the analysts cannot possibly know what the system might do ("deep opacity"). If the agents, data subjects as well as analytics experts, operate under opacity, then these agents cannot provide the justifications for judgments that are necessary to protect privacy, e.g., they cannot give "informed consent" or guarantee "anonymity". It follows from these points that agents in big data analytics and AI often cannot make the judgments needed to protect privacy. I therefore conclude that big data analytics makes privacy problems worse and the remedies less effective. On a positive note, I provide a brief outlook on technical ways to handle this situation.
- North America > United States > New York (0.04)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Africa > South Africa (0.04)
Explainability of Algorithms
The opaqueness of many complex machine learning algorithms is often mentioned as one of the main obstacles to the ethical development of artificial intelligence (AI). But what does it mean for an algorithm to be opaque? Highly complex algorithms such as artificial neural networks process enormous volumes of data in parallel along multiple hidden layers of interconnected nodes, rendering their inner workings epistemically inaccessible to any human being, including their designers and developers; they are "black boxes" for all their stakeholders. But opaqueness is not always the inevitable result of technical complexity. Sometimes, the way an algorithm works is intentionally hidden from view for proprietary reasons, especially in commercial automated decision systems, creating an entirely different type of opaqueness. In the first part of the chapter, we will examine these two ways of understanding opacity and the ethical implications that stem from each of them. In the second part, we explore the different explanatory methods that have been developed in computer science to overcome an AI system's technical opaqueness. As the analysis shows, explainable AI (XAI) still faces numerous challenges.
- North America > United States > New York (0.05)
- South America > Colombia > Bogotá D.C. > Bogotá (0.04)
- North America > United States > Wisconsin (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government (0.93)
Noosemia: toward a Cognitive and Phenomenological Account of Intentionality Attribution in Human-Generative AI Interaction
De Santis, Enrico, Rizzi, Antonello
This paper introduces and formalizes Noosemìa, a novel cognitive-phenomenological pattern emerging from human interaction with generative AI systems, particularly those enabling dialogic or multimodal exchanges. We propose a multidisciplinary framework to explain how, under certain conditions, users attribute intentionality, agency, and even interiority to these systems - a process grounded not in physical resemblance, but in linguistic performance, epistemic opacity, and emergent technological complexity. By linking an LLM-specific variant of meaning holism to our technical notion of the LLM Contextual Cognitive Field, we clarify how LLMs construct meaning relationally and how coherence and a simulacrum of agency arise at the human-AI interface. The analysis situates noosemia alongside pareidolia, animism, the intentional stance, and the uncanny valley, distinguishing its unique characteristics. We also introduce a-noosemia to describe the phenomenological withdrawal of such projections. The paper concludes with reflections on the broader philosophical, epistemological, and social implications of noosemic dynamics, as well as directions for future research.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (6 more...)
- Research Report (1.00)
- Overview (1.00)
- Information Technology (1.00)
- Government (1.00)
- Media (0.92)
- Health & Medicine > Therapeutic Area > Neurology (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
UGOD: Uncertainty-Guided Differentiable Opacity and Soft Dropout for Enhanced Sparse-View 3DGS
Guo, Zhihao, Wang, Peng, Chen, Zidong, Kong, Xiangyu, Lyu, Yan, Gao, Guanyu, Han, Liangxiu
In most 3D Gaussian Splatting (3DGS) methods, Gaussians are weighted equally for rendering, making them prone to overfitting, particularly in sparse-view scenarios. To address this, we investigate how adaptive weighting of Gaussians affects rendering quality, characterised by the learned uncertainties we propose. This learned uncertainty serves two key purposes: first, it guides the differentiable update of Gaussian opacity while preserving the integrity of the 3DGS pipeline; second, the uncertainty undergoes soft differentiable dropout regularisation, which strategically transforms the original uncertainty into continuous drop probabilities that govern the final Gaussian projection and blending process for rendering. Extensive experiments on widely adopted datasets demonstrate that our method outperforms rivals in sparse-view 3D synthesis, achieving higher-quality reconstruction with fewer Gaussians on most datasets; e.g., compared to DropGaussian, our method achieves a 3.27% PSNR improvement on the MipNeRF 360 dataset.
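A hedged sketch of the uncertainty-to-dropout mechanism described above: a learned per-Gaussian uncertainty is squashed into a continuous drop probability that scales opacity differentiably, so gradients flow to both quantities. The sigmoid squashing and parameter names are assumptions, not UGOD's exact formulation.

```python
# Sketch under stated assumptions: continuous, differentiable "soft dropout"
# driven by a learned per-Gaussian uncertainty. Not UGOD's exact method.
import torch

def soft_dropout_opacity(opacity: torch.Tensor,
                         uncertainty: torch.Tensor,
                         temperature: float = 1.0) -> torch.Tensor:
    """Scale Gaussian opacities by a differentiable keep probability.

    High uncertainty -> high drop probability -> opacity pushed toward 0,
    while gradients still flow to both opacity and uncertainty.
    """
    drop_prob = torch.sigmoid(uncertainty / temperature)  # in (0, 1)
    return opacity * (1.0 - drop_prob)

# Toy usage: 10k Gaussians with learnable uncertainties.
opacity = torch.rand(10_000, requires_grad=True)
uncertainty = torch.zeros(10_000, requires_grad=True)
effective = soft_dropout_opacity(opacity, uncertainty)
effective.sum().backward()            # gradients reach both tensors
print(effective.shape, uncertainty.grad is not None)
```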