AITopics | x-ray image

Collaborating Authors

x-ray image

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation

Li, Xuecheng, Jia, Weikuan, Sharipov, Komildzhon, Beknazarovich, Sharipov Hotam, Ataeva, Farzona S., Alisher, Qurbonaliev, Zheng, Yuanjie

arXiv.org Artificial IntelligenceDec-9-2025

Accurate segmentation of spinal structures in X-ray images is a prerequisite for quantitative scoliosis assessment, including Cobb angle measurement, vertebral translation estimation and curvature classification. In routine practice, clinicians acquire coronal, left-bending and right-bending radiographs to jointly evaluate deformity severity and spinal flexibility. However, the segmentation step remains heavily manual, time-consuming and non-reproducible, particularly in low-contrast images and in the presence of rib shadows or overlapping tissues. To address these limitations, this paper proposes R2MF-Net, a recurrent residual multi-path encoder--decoder network tailored for automatic segmentation of multi-directional spine X-ray images. The overall design consists of a coarse segmentation network and a fine segmentation network connected in cascade. Both stages adopt an improved Inception-style multi-branch feature extractor, while a recurrent residual jump connection (R2-Jump) module is inserted into skip paths to gradually align encoder and decoder semantics. A multi-scale cross-stage skip (MC-Skip) mechanism allows the fine network to reuse hierarchical representations from multiple decoder levels of the coarse network, thereby strengthening the stability of segmentation across imaging directions and contrast conditions. Furthermore, a lightweight spatial-channel squeeze-and-excitation block (SCSE-Lite) is employed at the bottleneck to emphasize spine-related activations and suppress irrelevant structures and background noise. We evaluate R2MF-Net on a clinical multi-view radiograph dataset comprising 228 sets of coronal, left-bending and right-bending spine X-ray images with expert annotations.

artificial intelligence, machine learning, segmentation, (20 more...)

arXiv.org Artificial Intelligence

2512.07576

Country:

Asia > Tajikistan (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.86)
(2 more...)

Add feedback

3D Path Planning for Robot-assisted Vertebroplasty from Arbitrary Bi-plane X-ray via Differentiable Rendering

Inigo, Blanca, Killeen, Benjamin D., Choi, Rebecca, Song, Michelle, Uneri, Ali, Khan, Majid, Bailey, Christopher, Krieger, Axel, Unberath, Mathias

arXiv.org Artificial IntelligenceDec-8-2025

Robotic systems are transforming image-guided interventions by enhancing accuracy and minimizing radiation exposure. A significant challenge in robotic assistance lies in surgical path planning, which often relies on the registration of intraoperative 2D images with preoperative 3D CT scans. This requirement can be burdensome and costly, particularly in procedures like vertebroplasty, where preoperative CT scans are not routinely performed. To address this issue, we introduce a differentiable rendering-based framework for 3D transpedicular path planning utilizing bi-planar 2D X-rays. Our method integrates differentiable rendering with a vertebral atlas generated through a Statistical Shape Model (SSM) and employs a learned similarity loss to refine the SSM shape and pose dynamically, independent of fixed imaging geometries. We evaluated our framework in two stages: first, through vertebral reconstruction from orthogonal X-rays for benchmarking, and second, via clinician-in-the-loop path planning using arbitrary-view X-rays. Our results indicate that our method outperformed a normalized cross-correlation baseline in reconstruction metrics (DICE: 0.75 vs. 0.65) and achieved comparable performance to the state-of-the-art model ReVerteR (DICE: 0.77), while maintaining generalization to arbitrary views. Success rates for bipedicular planning reached 82% with synthetic data and 75% with cadaver data, exceeding the 66% and 31% rates of a 2D-to-3D baseline, respectively. In conclusion, our framework facilitates versatile, CT-free 3D path planning for robot-assisted vertebroplasty, effectively accommodating real-world imaging diversity without the need for preoperative CT scans.

artificial intelligence, machine learning, path planning, (20 more...)

arXiv.org Artificial Intelligence

2512.05803

Country:

North America > United States > New Mexico (0.05)
North America > United States > Maryland > Baltimore (0.04)
Europe > Switzerland (0.04)
North America > United States > Washington > Whatcom County > Bellingham (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

An Integrated System for WEEE Sorting Employing X-ray Imaging, AI-based Object Detection and Segmentation, and Delta Robot Manipulation

Giannikos, Panagiotis, Papakostas, Lampis, Katralis, Evangelos, Mavridis, Panagiotis, Chryssinas, George, Inglezou, Myrto, Panagopoulos, Nikolaos, Porichis, Antonis, Mastrogeorgiou, Athanasios, Chatzakos, Panagiotis

arXiv.org Artificial IntelligenceDec-8-2025

Abstract-- Battery recycling is becoming increasingly critical due to the rapid growth in battery usage and the limited availability of natural resources. Moreover, as battery energy densities continue to rise, improper handling during recycling poses significant safety hazards, including potential fires at recycling facilities. Numerous systems have been proposed for battery detection and removal from WEEE recycling lines, including X-ray and RGB-based visual inspection methods, typically driven by AI-powered object detection models (e.g., Mask R-CNN, YOLO, ResNets). Despite advances in optimizing detection techniques and model modifications, a fully autonomous solution capable of accurately identifying and sorting batteries across diverse WEEEs types has yet to be realized. In response to these challenges, we present our novel approach which integrates a specialized X-ray transmission dual energy imaging subsystem with advanced pre-processing algorithms, enabling high-contrast image reconstruction for effective differentiation of dense and thin materials in WEEE. Devices move along a conveyor belt through a high-resolution X-ray imaging system, where YOLO and U-Net models precisely detect and segment battery-containing items. An intelligent tracking and position estimation algorithm then guides a Delta robot equipped with a suction gripper to selectively extract and properly discard the targeted devices. The approach is validated in a photorealistic simulation environment developed in NVIDIA Isaac Sim and on the real setup.

artificial intelligence, battery, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/AIM64088.2025.11175846

2512.05599

Country:

Europe > United Kingdom (0.04)
Europe > Sweden (0.04)
Asia > Afghanistan > Herat Province > Herat (0.04)

Genre: Research Report (0.70)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Automatic Multi-View X-Ray/CT Registration Using Bone Substructure Contours

Flepp, Roman, Nissen, Leon, Sigrist, Bastian, Nieuwland, Arend, Cavalcanti, Nicola, Fürnstahl, Philipp, Dreher, Thomas, Calvet, Lilian

arXiv.org Artificial IntelligenceNov-25-2025

Purpose: Accurate intraoperative X-ray/CT registration is essential for surgical navigation in orthopedic procedures. However, existing methods struggle with consistently achieving sub-millimeter accuracy, robustness under broad initial pose estimates or need manual key-point annotations. This work aims to address these challenges by proposing a novel multi-view X-ray/CT registration method for intraoperative bone registration. Methods: The proposed registration method consists of a multi-view, contour-based iterative closest point (ICP) optimization. Unlike previous methods, which attempt to match bone contours across the entire silhouette in both imaging modalities, we focus on matching specific subcategories of contours corresponding to bone substructures. This leads to reduced ambiguity in the ICP matches, resulting in a more robust and accurate registration solution. This approach requires only two X-ray images and operates fully automatically. Additionally, we contribute a dataset of 5 cadaveric specimens, including real X-ray images, X-ray image poses and the corresponding CT scans. Results: The proposed registration method is evaluated on real X-ray images using mean reprojection error (mRPD). The method consistently achieves sub-millimeter accuracy with a mRPD 0.67mm compared to 5.35mm by a commercial solution requiring manual intervention. Furthermore, the method offers improved practical applicability, being fully automatic. Conclusion: Our method offers a practical, accurate, and efficient solution for multi-view X-ray/CT registration in orthopedic surgeries, which can be easily combined with tracking systems. By improving registration accuracy and minimizing manual intervention, it enhances intraoperative navigation, contributing to more accurate and effective surgical outcomes in computer-assisted surgery (CAS).

artificial intelligence, machine learning, registration, (19 more...)

arXiv.org Artificial Intelligence

2506.13292

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Analysing Environmental Efficiency in AI for X-Ray Diagnosis

Kearns, Liam

arXiv.org Artificial IntelligenceNov-12-2025

The integration of AI tools into medical applications has aimed to improve the efficiency of diagnosis. The emergence of large language models (LLMs), such as ChatGPT and Claude, has expanded this integration even further. Because of LLM versatility and ease of use through APIs, these larger models are often utilised even though smaller, custom models can be used instead. In this paper, LLMs and small discriminative models are integrated into a Mendix application to detect Covid-19 in chest X-rays. These discriminative models are also used to provide knowledge bases for LLMs to improve accuracy. This provides a benchmark study of 14 different model configurations for comparison of accuracy and environmental impact. The findings indicated that while smaller models reduced the carbon footprint of the application, the output was biased towards a positive diagnosis and the output probabilities were lacking confidence. Meanwhile, restricting LLMs to only give probabilistic output caused poor performance in both accuracy and carbon footprint, demonstrating the risk of using LLMs as a universal AI solution. While using the smaller LLM GPT-4.1-Nano reduced the carbon footprint by 94.2% compared to the larger models, this was still disproportionate to the discriminative models; the most efficient solution was the Covid-Net model. Although it had a larger carbon footprint than other small models, its carbon footprint was 99.9% less than when using GPT-4.5-Preview, whilst achieving an accuracy of 95.5%, the highest of all models examined. This paper contributes to knowledge by comparing generative and discriminative models in Covid-19 detection as well as highlighting the environmental risk of using generative tools for classification tasks.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.07436

Country: Europe > United Kingdom > England > Worcestershire (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Integrating Visual and X-Ray Machine Learning Features in the Study of Paintings by Goya

Ugail, Hassan, Jaleel, Ismail Lujain

arXiv.org Artificial IntelligenceNov-4-2025

Art authentication of Francisco Goya's works presents complex computational challenges due to his heterogeneous stylistic evolution and extensive historical patterns of forgery. We introduce a novel multimodal machine learning framework that applies identical feature extraction techniques to both visual and X-ray radiographic images of Goya paintings. The unified feature extraction pipeline incorporates Grey-Level Co-occurrence Matrix descriptors, Local Binary Patterns, entropy measures, energy calculations, and colour distribution analysis applied consistently across both imaging modalities. The extracted features from both visual and X-ray images are processed through an optimised One-Class Support Vector Machine with hyperparameter tuning. Using a dataset of 24 authenticated Goya paintings with corresponding X-ray images, split into an 80/20 train-test configuration with 10-fold cross-validation, the framework achieves 97.8% classification accuracy with a 0.022 false positive rate. Case study analysis of ``Un Gigante'' demonstrates the practical efficacy of our pipeline, achieving 92.3% authentication confidence through unified multimodal feature analysis. Our results indicate substantial performance improvement over single-modal approaches, establishing the effectiveness of applying identical computational methods to both visual and radiographic imagery in art authentication applications.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2511.01

Country:

Europe > United Kingdom > England > West Yorkshire > Bradford (0.04)
Europe > Spain > Aragón > Zaragoza Province > Zaragoza (0.04)
Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Deep Learning in Dental Image Analysis: A Systematic Review of Datasets, Methodologies, and Emerging Challenges

Zhou, Zhenhuan, Zhu, Jingbo, Zhang, Yuchen, Guan, Xiaohang, Wang, Peng, Li, Tao

arXiv.org Artificial IntelligenceOct-24-2025

Efficient analysis and processing of dental images are crucial for dentists to achieve accurate diagnosis and optimal treatment planning. However, dental imaging inherently poses several challenges, such as low contrast, metallic artifacts, and variations in projection angles. Combined with the subjectivity arising from differences in clinicians' expertise, manual interpretation often proves time-consuming and prone to inconsistency. Artificial intelligence (AI)-based automated dental image analysis (DIA) offers a promising solution to these issues and has become an integral part of computer-aided dental diagnosis and treatment. Among various AI technologies, deep learning (DL) stands out as the most widely applied and influential approach due to its superior feature extraction and representation capabilities. To comprehensively summarize recent progress in this field, we focus on the two fundamental aspects of DL research-datasets and models. In this paper, we systematically review 260 studies on DL applications in DIA, including 49 papers on publicly available dental datasets and 211 papers on DL-based algorithms. We first introduce the basic concepts of dental imaging and summarize the characteristics and acquisition methods of existing datasets. Then, we present the foundational techniques of DL and categorize relevant models and algorithms according to different DIA tasks, analyzing their network architectures, optimization strategies, training methods, and performance. Furthermore, we summarize commonly used training and evaluation metrics in the DIA domain. Finally, we discuss the current challenges of existing research and outline potential future directions. We hope that this work provides a valuable and systematic reference for researchers in this field. All supplementary materials and detailed comparison tables will be made publicly available on GitHub.

data mining, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2510.20634

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
Asia > Middle East > Iran (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
(19 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Therapeutic Area > Dental and Oral Health (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Health Care Technology (1.00)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

Add feedback

TemMed-Bench: Evaluating Temporal Medical Image Reasoning in Vision-Language Models

Zhang, Junyi, Gu, Jia-Chen, Hu, Wenbo, Zhou, Yu, Piramuthu, Robinson, Peng, Nanyun

arXiv.org Artificial IntelligenceSep-30-2025

Existing medical reasoning benchmarks for vision-language models primarily focus on analyzing a patient's condition based on an image from a single visit. However, this setting deviates significantly from real-world clinical practice, where doctors typically refer to a patient's historical conditions to provide a comprehensive assessment by tracking their changes over time. In this paper, we introduce TemMed-Bench, the first benchmark designed for analyzing changes in patients' conditions between different clinical visits, which challenges large vision-language models (LVLMs) to reason over temporal medical images. TemMed-Bench consists of a test set comprising three tasks - visual question-answering (VQA), report generation, and image-pair selection - and a supplementary knowledge corpus of over 17,000 instances. With TemMed-Bench, we conduct an evaluation of six proprietary and six open-source LVLMs. Our results show that most LVLMs lack the ability to analyze patients' condition changes over temporal medical images, and a large proportion perform only at a random-guessing level in the closed-book setting. In contrast, GPT o3, o4-mini and Claude 3.5 Sonnet demonstrate comparatively decent performance, though they have yet to reach the desired level. Furthermore, we explore augmenting the input with both retrieved visual and textual modalities in the medical domain. We also show that multi-modal retrieval augmentation yields notably higher performance gains than no retrieval and textual retrieval alone across most models on our benchmark, with the VQA task showing an average improvement of 2.59%. Overall, we compose a benchmark grounded on real-world clinical practice, and it reveals LVLMs' limitations in temporal medical image reasoning, as well as highlighting the use of multi-modal retrieval augmentation as a potentially promising direction worth exploring to address this challenge.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2509.25143

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

LatXGen: Towards Radiation-Free and Accurate Quantitative Analysis of Sagittal Spinal Alignment Via Cross-Modal Radiographic View Synthesis

Zhao, Moxin, Meng, Nan, Cheung, Jason Pui Yin, Tang, Chris Yuk Kwan, Yu, Chenxi, Zhong, Wenting, Lu, Pengyu, Shi, Chang, Zhuang, Yipeng, Zhang, Teng

arXiv.org Artificial IntelligenceSep-30-2025

Adolescent Idiopathic Scoliosis (AIS) is a complex three-dimensional spinal deformity, and accurate morphological assessment requires evaluating both coronal and sagittal alignment. While previous research has made significant progress in developing radiation-free methods for coronal plane assessment, reliable and accurate evaluation of sagittal alignment without ionizing radiation remains largely underexplored. To address this gap, we propose LatXGen, a novel generative framework that synthesizes realistic lateral spinal radiographs from posterior Red-Green-Blue and Depth (RGBD) images of unclothed backs. This enables accurate, radiation-free estimation of sagittal spinal alignment. LatXGen tackles two core challenges: (1) inferring sagittal spinal morphology changes from a lateral perspective based on posteroanterior surface geometry, and (2) performing cross-modality translation from RGBD input to the radiographic domain. The framework adopts a dual-stage architecture that progressively estimates lateral spinal structure and synthesizes corresponding radiographs. To enhance anatomical consistency, we introduce an attention-based Fast Fourier Convolution (FFC) module for integrating anatomical features from RGBD images and 3D landmarks, and a Spatial Deformation Network (SDN) to model morphological variations in the lateral view. Additionally, we construct the first large-scale paired dataset for this task, comprising 3,264 RGBD and lateral radiograph pairs. Experimental results demonstrate that LatXGen produces anatomically accurate radiographs and outperforms existing GAN-based methods in both visual fidelity and quantitative metrics. This study offers a promising, radiation-free solution for sagittal spine assessment and advances comprehensive AIS evaluation.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/JBHI.2025.3613010

2509.24165

Country:

Asia > China > Hong Kong (0.05)
Asia > Mongolia (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Teaching AI Stepwise Diagnostic Reasoning with Report-Guided Chain-of-Thought Learning

Luo, Yihong, He, Wenwu, Cui, Zhuo-Xu, Liang, Dong

arXiv.org Artificial IntelligenceSep-9-2025

This study presents DiagCoT, a multi-stage framework that applies supervised fine-tuning to general-purpose vision-language models (VLMs) to emulate radiologists' stepwise diagnostic reasoning using only free-text reports. DiagCoT combines contrastive image-report tuning for domain alignment, chain-of-thought supervision to capture inferential logic, and reinforcement tuning with clinical reward signals to enhance factual accuracy and fluency. On the MIMIC-CXR benchmark, DiagCoT improved zero-shot disease classification AUC from 0.52 to 0.76 (absolute gain of 0.24), pathology grounding mIoU from 0.08 to 0.31 (absolute gain of 0.23), and report generation BLEU from 0.11 to 0.33 (absolute gain of 0.22). It outperformed state-of-the-art models including LLaVA-Med and CXR-LLAVA on long-tailed diseases and external datasets. By converting unstructured clinical narratives into structured supervision, DiagCoT offers a scalable approach for developing interpretable and diagnostically competent AI systems for radiology.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.06409

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Fujian Province > Fuzhou (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback