AITopics | diagnostic task

diagnostic task

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Neural Information Processing Systems Jun-18-2026, 23:17:58 GMT

Recent progress in Large Vision-Language Models (LVLMs) has enabled promising applications in medical tasks, such as report generation and visual question answering. However, existing benchmarks focus mainly on the final diagnostic answer, offering limited insight into whether models engage in clinically meaningful reasoning. To address this, we present CheXStruct and CXReasonBench, a structured pipeline and benchmark built on the publicly available MIMIC-CXR-JPG dataset. CheXStruct automatically derives a sequence of intermediate reasoning steps directly from chest X-rays, such as segmenting anatomical regions, deriving anatomical landmarks and diagnostic measurements, computing diagnostic indices, and applying clinical thresholds. CXReasonBench leverages this pipeline to evaluate whether models can perform clinically valid reasoning steps and to what extent they can learn from structured guidance, enabling fine-grained and transparent assessment of diagnostic reasoning. The benchmark comprises 18,988 QA pairs across 12 diagnostic tasks and 1,200 cases, each paired with up to 4 visual inputs, and supports multi-path, multi-stage evaluation including visual grounding via anatomical region selection and diagnostic measurements. Even the strongest of 12 evaluated LVLMs struggle with structured reasoning and generalization, often failing to link abstract knowledge with anatomically grounded visual interpretation.
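
The last two steps named in the abstract, computing a diagnostic index and applying a clinical threshold, can be sketched as below. This is only an illustration: the abstract does not say which indices CheXStruct computes, so the cardiothoracic ratio, the `Measurement` type, and the pixel widths are all assumptions chosen for the example. The 0.5 cutoff is the conventional value for posteroanterior films, not a value taken from the benchmark.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """Widths in pixels; both values are hypothetical inputs."""
    cardiac_width: float
    thoracic_width: float


def cardiothoracic_ratio(m: Measurement) -> float:
    """Diagnostic index: cardiac width divided by thoracic width."""
    if m.thoracic_width <= 0:
        raise ValueError("thoracic width must be positive")
    return m.cardiac_width / m.thoracic_width


def exceeds_threshold(index: float, threshold: float = 0.5) -> bool:
    """Apply a clinical threshold to the index (0.5 is an assumed default)."""
    return index > threshold


# Hypothetical measurements derived from anatomical landmarks.
m = Measurement(cardiac_width=160.0, thoracic_width=290.0)
ctr = cardiothoracic_ratio(m)
print(round(ctr, 3), exceeds_threshold(ctr))
```

Keeping the index and the threshold as separate functions mirrors the staged evaluation the abstract describes, where each intermediate step can be checked on its own.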

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.27)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Root Cause Analysis for Microservice Systems via Cascaded Conditional Learning with Hypergraphs

Xie, Shuaiyu, He, Hanbin, Wang, Jian, Li, Bing

arXiv.org Artificial Intelligence Nov-25-2025

Abstract: Root cause analysis in microservice systems typically involves two core tasks: root cause localization (RCL) and failure type identification (FTI). Despite substantial research efforts, conventional diagnostic approaches still face two key challenges. First, these methods predominantly adopt a joint learning paradigm for RCL and FTI to exploit shared information and reduce training time. Second, these existing methods primarily focus on point-to-point relationships between instances, overlooking the group nature of inter-instance influences induced by deployment configurations and load balancing. To overcome these limitations, we propose CCLH, a novel root cause analysis framework that orchestrates diagnostic tasks based on cascaded conditional learning. CCLH provides a three-level taxonomy for group influences between instances and incorporates a heterogeneous hypergraph to model these relationships, facilitating the simulation of failure propagation. Extensive experiments conducted on datasets from three microservice benchmarks demonstrate that CCLH outperforms state-of-the-art methods in both RCL and FTI.

Microservice architecture has been widely adopted by cloud-native enterprises due to its flexibility, scalability, and loose coupling. In microservice systems (MSS), each microservice typically reproduces multiple instances, which collaborate with instances affiliated with other microservices to handle user requests [1], [2]. As these systems scale up, they may suffer from reliability issues, aka failures, attributable to the increasing complexity and dynamicity. Worse still, diagnosing failures in microservice systems is labor-intensive and time-consuming, due to the intricate failure propagation and the overwhelming volume of telemetry data. For example, GitHub once took approximately one and a half hours to resolve a failure that disrupted the codespace service, affecting millions of developers and repositories [3]. Traditional root cause analysis (RCA) in MSS encompasses two tasks: root cause localization (RCL) and failure type identification (FTI).
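
The contrast between point-to-point edges and group influences can be sketched with a small typed hypergraph. This is not CCLH's data model: the abstract does not spell out its three-level taxonomy, so the two hyperedge types here (`same_service` for replicas behind one load balancer, `same_host` for co-located instances) and all instance names are hypothetical, chosen only because the abstract mentions load balancing and deployment configurations.

```python
from collections import defaultdict

# Hypothetical instances: (instance name, microservice, host node).
instances = [
    ("cart-0", "cart", "node-a"),
    ("cart-1", "cart", "node-b"),
    ("pay-0", "payment", "node-a"),
    ("pay-1", "payment", "node-b"),
]


def build_hypergraph(instances):
    """Map each typed hyperedge to the set of instances it connects."""
    edges = defaultdict(set)
    for name, service, host in instances:
        edges[("same_service", service)].add(name)  # replicas sharing load
        edges[("same_host", host)].add(name)        # co-located by deployment
    return dict(edges)


def group_neighbors(hypergraph, instance):
    """Instances that share at least one hyperedge with `instance`."""
    related = set()
    for members in hypergraph.values():
        if instance in members:
            related |= members
    related.discard(instance)
    return related


hg = build_hypergraph(instances)
print(sorted(group_neighbors(hg, "cart-0")))
```

A single hyperedge can connect any number of instances at once, which is what lets it express a group relationship that a set of pairwise edges would only approximate.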

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.17566

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Lee, Hyungyung, Choi, Geon, Lee, Jung-Oh, Yoon, Hangyul, Hong, Hyuk Gi, Choi, Edward

arXiv.org Artificial Intelligence Oct-28-2025

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.18087

Genre: Research Report > New Finding (0.45)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

A Survey of the Impact of Self-Supervised Pretraining for Diagnostic Tasks with Radiological Images

VanBerlo, Blake, Hoey, Jesse, Wong, Alexander

arXiv.org Artificial Intelligence Sep-5-2023

Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning, leveraging large amounts of unlabelled data. This review summarizes recent research into its usage in X-ray, computed tomography, magnetic resonance, and ultrasound imaging, concentrating on studies that compare self-supervised pretraining to fully supervised learning for diagnostic tasks such as classification and segmentation. The most pertinent finding is that self-supervised pretraining generally improves downstream task performance compared to full supervision, most prominently when unlabelled examples greatly outnumber labelled examples. Based on the aggregate evidence, recommendations are provided for practitioners considering using self-supervised learning. Motivated by limitations identified in current research, directions and practices for future study are suggested, such as integrating clinical knowledge with theoretically justified self-supervised learning methods, evaluating on public datasets, growing the modest body of evidence for ultrasound, and characterizing the impact of self-supervised pretraining on generalization.

diagnostic task, radiological image, self-supervised pretraining

arXiv.org Artificial Intelligence

2309.02555

Genre:

Overview (1.00)
Research Report (0.69)

Industry:

Health & Medicine > Nuclear Medicine (0.40)
Health & Medicine > Diagnostic Medicine > Imaging (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)

Measuring Perceived Trust in XAI-Assisted Decision-Making by Eliciting a Mental Model

Onari, Mohsen Abbaspour, Grau, Isel, Nobile, Marco S., Zhang, Yingqian

arXiv.org Artificial Intelligence Jul-15-2023

This empirical study proposes a novel methodology to measure users' perceived trust in an Explainable Artificial Intelligence (XAI) model. To do so, users' mental models are elicited using Fuzzy Cognitive Maps (FCMs). First, we exploit an interpretable Machine Learning (ML) model to classify suspected COVID-19 patients into positive or negative cases. Then, Medical Experts (MEs) conduct a diagnostic decision-making task based first on their own knowledge and then on the predictions and interpretations provided by the XAI model. In order to evaluate the impact of interpretations on perceived trust, explanation satisfaction attributes are rated by MEs through a survey. Then, they are considered as FCM concepts to determine their influences on each other and, ultimately, on the perceived trust. Moreover, to consider MEs' mental subjectivity, fuzzy linguistic variables are used to determine the strength of influences. After reaching the steady state of FCMs, a quantified value is obtained to measure the perceived trust of each ME. The results show that the quantified values can determine whether MEs trust or distrust the XAI model. We analyze this behavior by comparing the quantified values with MEs' performance in completing diagnostic tasks.
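
Reaching the steady state of an FCM can be sketched as a fixed-point iteration. The block below uses one common textbook update rule (sigmoid of the weighted sum of incoming activations); the abstract does not state which rule the authors use, and the three concepts, the weight matrix, and the initial activations are all hypothetical. The fuzzy linguistic weights mentioned in the abstract are replaced here by plain numbers.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fcm_step(state, weights):
    """One synchronous update: A_i <- sigmoid(sum_j weights[j][i] * A_j)."""
    n = len(state)
    return [
        sigmoid(sum(weights[j][i] * state[j] for j in range(n)))
        for i in range(n)
    ]


def run_fcm(state, weights, tol=1e-6, max_iter=1000):
    """Iterate until no activation changes by more than tol, or max_iter is hit."""
    for _ in range(max_iter):
        new_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state
        state = new_state
    return state


# Hypothetical concepts: 0 = understandability, 1 = satisfaction, 2 = trust.
# weights[j][i] is the assumed influence of concept j on concept i.
weights = [
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0],
]
initial = [0.8, 0.5, 0.5]
final = run_fcm(initial, weights)
print([round(a, 3) for a in final])
```

In this sketch the last activation plays the role of the quantified trust value. Note that under this update rule a concept with no incoming influences settles at 0.5 regardless of its initial value; variants that keep input concepts fixed would behave differently.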

artificial intelligence, machine learning, mental model, (20 more...)

arXiv.org Artificial Intelligence

2307.11765

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Europe > Italy > Veneto > Venice (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.87)

$\rm{C {\small IS}}^2$: A Simplified Commonsense Inference Evaluation for Story Prose

Li, Bryan, Martin, Lara J., Callison-Burch, Chris

arXiv.org Artificial Intelligence Oct-19-2022

Transformers have been showing near-human performance on a variety of tasks, but they are not without their limitations. We discuss the issue of conflating results of transformers that are instructed to do multiple tasks simultaneously. In particular, we focus on the domain of commonsense reasoning within story prose, which we call contextual commonsense inference (CCI). We look at the GLUCOSE (Mostafazadeh et al. 2020) dataset and task for predicting implicit commonsense inferences between story sentences. Since the GLUCOSE task simultaneously generates sentences and predicts the CCI relation, there is a conflation in the results. Is the model really measuring CCI or is its ability to generate grammatical text carrying the results? In this paper, we introduce the task contextual commonsense inference in sentence selection ($\rm{C {\small IS}}^2$), a simplified task that avoids conflation by eliminating language generation altogether. Our findings emphasize the necessity of future work to disentangle language generation from the desired NLP tasks at hand.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2202.0788

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.56)
Information Technology > Communications > Social Media > Crowdsourcing (0.46)

How should I compute my candidates? A taxonomy and classification of diagnosis computation algorithms

Rodler, Patrick

arXiv.org Artificial Intelligence Jul-25-2022

This work proposes a taxonomy for diagnosis computation methods which allows their standardized assessment, classification and comparison. The aim is to (i) give researchers and practitioners an impression of the diverse landscape of available diagnostic techniques, (ii) allow them to easily retrieve the main features as well as pros and cons of the approaches, (iii) enable an easy and clear comparison of the techniques based on their characteristics with respect to a list of important and well-defined properties, and (iv) facilitate the selection of the "right" algorithm to adopt for a particular problem case, e.g., in practical diagnostic settings, for comparison in experimental evaluations, or for reuse, modification, extension, or improvement in the course of research.

algorithm, computation, diagnosis, (16 more...)

arXiv.org Artificial Intelligence

2207.12583

Country: North America > United States > Maryland > Prince George's County > College Park (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)

Deep Learning Applied to Chest X-Rays: Exploiting and Preventing Shortcuts

Jabbour, Sarah, Fouhey, David, Kazerooni, Ella, Sjoding, Michael W., Wiens, Jenna

arXiv.org Artificial Intelligence Sep-21-2020

While deep learning has shown promise in improving the automated diagnosis of disease based on chest X-rays, deep networks may exhibit undesirable behavior related to shortcuts. This paper studies the case of spurious class skew in which patients with a particular attribute are spuriously more likely to have the outcome of interest. For instance, clinical protocols might lead to a dataset in which patients with pacemakers are disproportionately likely to have congestive heart failure. This skew can lead to models that take shortcuts by heavily relying on the biased attribute. We explore this problem across a number of attributes in the context of diagnosing the cause of acute hypoxemic respiratory failure. Applied to chest X-rays, we show that i) deep nets can accurately identify many patient attributes including sex (AUROC = 0.96) and age (AUROC >= 0.90), ii) they tend to exploit correlations between such attributes and the outcome label when learning to predict a diagnosis, leading to poor performance when such correlations do not hold in the test population (e.g., everyone in the test set is male), and iii) a simple transfer learning approach is surprisingly effective at preventing the shortcut and promoting good generalization performance. On the task of diagnosing congestive heart failure based on a set of chest X-rays skewed towards older patients (age >= 63), the proposed approach improves generalization over standard training from 0.66 (95% CI: 0.54-0.77) to 0.84 (95% CI: 0.73-0.92) AUROC. While simple, the proposed approach has the potential to improve the performance of models across populations by encouraging reliance on clinically relevant manifestations of disease, i.e., those that a clinician would use to make a diagnosis.
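
The AUROC figures quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the labels and scores are made-up toy values, not data from the paper, and the function is a plain illustration rather than the authors' evaluation code.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

Because the metric depends only on how positives rank against negatives, a model that ranks by a shortcut attribute can score well on a skewed population and poorly once that correlation no longer holds.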

artificial intelligence, machine learning, shortcut, (18 more...)

arXiv.org Artificial Intelligence

2009.10132

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Europe (0.04)
Asia (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

A neural network walks into a lab: towards using deep nets as models for human behavior

Ma, Wei Ji, Peters, Benjamin

arXiv.org Artificial Intelligence May-2-2020

What might sound like the beginning of a joke has become an attractive prospect for many cognitive scientists: the use of deep neural network models (DNNs) as models of human behavior in perceptual and cognitive tasks. Although DNNs have taken over machine learning, attempts to use them as models of human behavior are still in the early stages. Can they become a versatile model class in the cognitive scientist's toolbox? We first argue why DNNs have the potential to be interesting models of human behavior. We then discuss how that potential can be more fully realized. On the one hand, we argue that the cycle of training, testing, and revising DNNs needs to be revisited through the lens of the cognitive scientist's goals. Specifically, we argue that methods for assessing the goodness of fit between DNN models and human behavior have to date been impoverished. On the other hand, cognitive science might have to start using more complex tasks (including richer stimulus spaces), but doing so might be beneficial for DNN-independent reasons as well. Finally, we highlight avenues where traditional cognitive process models and DNNs may show productive synergy.

artificial intelligence, dnn, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2005.02181

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Towards a Taxonomy of Problem Solving Types

Chandrasekaran, B.

AI Magazine Mar-15-1983

Our group's work in medical decision making has led us to formulate a framework for expert system design, in particular about how the domain knowledge may be decomposed into substructures. We propose that there exist different problem-solving types, i.e., uses of knowledge, and corresponding to each is a separate substructure specializing in that type of problem-solving. Each substructure is in turn further decomposed into a hierarchy of specialists which differ from each other not in the type of problem-solving, but in the conceptual content of their knowledge; e.g., one of them may specialize in "heart disease," while another may do so in "liver," though both of them are doing the same type of problem solving. Thus ultimately all the knowledge in the system is distributed among problem-solvers which know how to use that knowledge. This is in contrast to the currently dominant expert system paradigm which proposes a common knowledge base accessed by knowledge-free problem-solvers of various kinds. In our framework there is no distinction between knowledge bases and problem-solvers: each knowledge source is a problem-solver. We have so far had occasion to deal with three generic problem-solving types in expert clinical reasoning: diagnosis (classification), data retrieval and organization, and reasoning about consequences of actions. In novices, these expert structures are often incomplete, and other knowledge structures and learning processes are needed to construct and complete them.

artificial intelligence, expert system, knowledge, (19 more...)

AI Magazine

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
