Goto

Collaborating Authors

 digitization


Pic2Diagnosis: A Method for Diagnosis of Cardiovascular Diseases from the Printed ECG Pictures

Büyüksolak, Oğuzhan, Öksüz, İlkay

arXiv.org Artificial Intelligence

The electrocardiogram (ECG) is a vital tool for diagnosing heart diseases. However, many disease patterns are derived from outdated datasets and traditional stepwise algorithms with limited accuracy. This study presents a method for direct cardiovascular disease (CVD) diagnosis from ECG images, eliminating the need for digitization. The proposed approach utilizes a two-step curriculum learning framework, beginning with the pre-training of a classification model on segmentation masks, followed by fine-tuning on grayscale, inverted ECG images. Robustness is further enhanced through an ensemble of three models with averaged outputs, achieving an AUC of 0.9534 and an F1 score of 0.7801 on the BHF ECG Challenge dataset, outperforming individual models. By effectively handling real-world artifacts and simplifying the diagnostic process, this method offers a reliable solution for automated CVD diagnosis, particularly in resource-limited settings where printed or scanned ECG images are commonly used. Such an automated procedure enables rapid and accurate diagnosis, which is critical for timely intervention in CVD cases that often demand urgent care.


Evaluating AI-Driven Automated Map Digitization in QGIS

Febrita, Diana

arXiv.org Artificial Intelligence

Map digitization is an important process that converts maps into digital formats that can be used for further analysis. This process typically requires a deep human involvement because of the need for interpretation and decision-making when translating complex features. With the advancement of artificial intelligence, there is an alternative to conducting map digitization with the help of machine learning techniques. Deepness, or Deep Neural Remote Sensing, is an advanced AI-driven tool designed and integrated as a plugin in QGIS application. This research focuses on assessing the effectiveness of Deepness in automated digitization. This study analyses AI-generated digitization results from Google Earth imagery and compares them with digitized outputs from OpenStreetMap (OSM) to evaluate performance.


Privacy-Aware Federated nnU-Net for ECG Page Digitization

Nemati, Nader

arXiv.org Artificial Intelligence

Deep neural networks can convert ECG page images into analyzable waveforms, yet centralized training often conflicts with cross-institutional privacy and deployment constraints. A cross-silo federated digitization framework is presented that trains a full-model nnU-Net segmentation backbone without sharing images and aggregates updates across sites under realistic non-IID heterogeneity (layout, grid style, scanner profile, noise). The protocol integrates three standard server-side aggregators--FedAvg, FedProx, and FedAdam--and couples secure aggregation with central, user-level differential privacy to align utility with formal guarantees. Key features include: (i) end-to-end full-model training and synchronization across clients; (ii) secure aggregation so the server only observes a clipped, weighted sum once a participation threshold is met; (iii) central Gaussian DP with Renyi accounting applied post-aggregation for auditable user-level privacy; and (iv) a calibration-aware digitization pipeline comprising page normalization, trace segmentation, grid-leakage suppression, and vectorization to twelve-lead signals. Experiments on ECG pages rendered from PTB-XL show consistently faster convergence and higher late-round plateaus with adaptive server updates (FedAdam) relative to FedAvg and FedProx, while approaching centralized performance. The privacy mechanism maintains competitive accuracy while preventing exposure of raw images or per-client updates, yielding deployable, auditable guarantees suitable for multi-institution settings.


Deep Learning-Based Digitization of Overlapping ECG Images with Open-Source Python Code

Karbasi, Reza, Rahimi, Masoud, Vahabie, Abdol-Hossein, Moradi, Hadi

arXiv.org Artificial Intelligence

--This paper addresses the persistent challenge of accurately digitizing paper-based electrocardiogram (ECG) recordings, with a particular focus on robustly handling single leads compromised by signal overlaps--a common yet under-addressed issue in existing methodologies. We propose a two-stage pipeline designed to overcome this limitation. The first stage employs a U-Net based segmentation network, trained on a dataset enriched with overlapping signals and fortified with custom data augmentations, to accurately isolate the primary ECG trace. The subsequent stage converts this refined binary mask into a time-series signal using established digitization techniques, enhanced by an adaptive grid detection module for improved versatility across different ECG formats and scales. Our experimental results demonstrate the efficacy of our approach. The U-Net architecture achieves an Intersection over Union (IoU) of 0.87 for the fine-grained segmentation task. Crucially, our proposed digitization method yields superior performance compared to a well-established baseline technique across both non-overlapping and challenging overlapping ECG samples. For non-overlapping signals, our method achieved a Mean Squared Error (MSE) of 0.0010 and a Pearson Correlation Coefficient ( ρ) of 0.9644, compared to 0.0015 and 0.9366, respectively, for the baseline. On samples with signal overlap, our method achieved an MSE of 0.0029 and a ρ of 0.9641, significantly improving upon the baseline's 0.0178 and 0.8676. This work demonstrates an effective strategy to significantly enhance digitization accuracy, especially in the presence of signal overlaps, thereby laying a strong foundation for the reliable conversion of analog ECG records into analyzable digital data for contemporary research and clinical applications. Electrocardiogram (ECG) serves as a cornerstone in the diagnosis and ongoing monitoring of cardiovascular diseases, which persist as a primary cause of mortality globally [1]. The ability to access and analyze ECG time-series data substantially enhances the efficacy of deep learning-based clinical decision support systems [2].


An Open-Source Python Framework and Synthetic ECG Image Datasets for Digitization, Lead and Lead Name Detection, and Overlapping Signal Segmentation

Rahimi, Masoud, Karbasi, Reza, Vahabie, Abdol-Hossein

arXiv.org Artificial Intelligence

We introduce an open-source Python framework for generating synthetic ECG image datasets to advance critical deep learning-based tasks in ECG analysis, including ECG digitization, lead region and lead name detection, and pixel-level waveform segmentation. Using the PTB-XL signal dataset, our proposed framework produces four open-access datasets: (1) ECG images in various lead configurations paired with time-series signals for ECG digitization, (2) ECG images annotated with YOLO-format bounding boxes for detection of lead region and lead name, (3)-(4) cropped single-lead images with segmentation masks compatible with U-Net-based models in normal and overlapping versions. In the overlapping case, waveforms from neighboring leads are superimposed onto the target lead image, while the segmentation masks remain clean. The open-source Python framework and datasets are publicly available at https://github.com/rezakarbasi/ecg-image-and-signal-dataset and https://doi.org/10.5281/zenodo.15484519, respectively.


ECGtizer: a fully automated digitizing and signal recovery pipeline for electrocardiograms

Lence, Alex, Fall, Ahmad, Cohen, Samuel David, Granese, Federica, Zucker, Jean-Daniel, Salem, Joe-Elie, Prifti, Edi

arXiv.org Artificial Intelligence

Electrocardiograms (ECGs) are essential for diagnosing cardiac pathologies, yet traditional paper-based ECG storage poses significant challenges for automated analysis. This study introduces ECGtizer, an open-source, fully automated tool designed to digitize paper ECGs and recover signals lost during storage. ECGtizer facilitates automated analyses using modern AI methods. It employs automated lead detection, three pixel-based signal extraction algorithms, and a deep learning-based signal reconstruction module. We evaluated ECGtizer on two datasets: a real-life cohort from the COVID-19 pandemic (JOCOVID) and a publicly available dataset (PTB-XL). Performance was compared with two existing methods: the fully automated ECGminer and the semi-automated PaperECG, which requires human intervention. ECGtizer's performance was assessed in terms of signal recovery and the fidelity of clinically relevant feature measurement. Additionally, we tested these tools on a third dataset (GENEREPOL) for downstream AI tasks. Results show that ECGtizer outperforms existing tools, with its ECGtizerFrag algorithm delivering superior signal recovery. While PaperECG demonstrated better outcomes than ECGminer, it required human input. ECGtizer enhances the usability of historical ECG data and supports advanced AI-based diagnostic methods, making it a valuable addition to the field of AI in ECG analysis.


Hespi: A pipeline for automatically detecting information from hebarium specimen sheets

Turnbull, Robert, Fitzgerald, Emily, Thompson, Karen, Birch, Joanne L.

arXiv.org Artificial Intelligence

Specimen associated biodiversity data are sought after for biological, environmental, climate, and conservation sciences. A rate shift is required for the extraction of data from specimen images to eliminate the bottleneck that the reliance on human-mediated transcription of these data represents. We applied advanced computer vision techniques to develop the `Hespi' (HErbarium Specimen sheet PIpeline), which extracts a pre-catalogue subset of collection data on the institutional labels on herbarium specimens from their digital images. The pipeline integrates two object detection models; the first detects bounding boxes around text-based labels and the second detects bounding boxes around text-based data fields on the primary institutional label. The pipeline classifies text-based institutional labels as printed, typed, handwritten, or a combination and applies Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) for data extraction. The recognized text is then corrected against authoritative databases of taxon names. The extracted text is also corrected with the aide of a multimodal Large Language Model (LLM). Hespi accurately detects and extracts text for test datasets including specimen sheet images from international herbaria. The components of the pipeline are modular and users can train their own models with their own data and use them in place of the models provided.


ECG-Image-Database: A Dataset of ECG Images with Real-World Imaging and Scanning Artifacts; A Foundation for Computerized ECG Image Digitization and Analysis

Reyna, Matthew A., Deepanshi, null, Weigle, James, Koscova, Zuzana, Campbell, Kiersten, Shivashankara, Kshama Kodthalu, Saghafi, Soheil, Nikookar, Sepideh, Motie-Shirazi, Mohsen, Kiarashi, Yashar, Seyedi, Salman, Clifford, Gari D., Sameni, Reza

arXiv.org Artificial Intelligence

We introduce the ECG-Image-Database, a large and diverse collection of electrocardiogram (ECG) images generated from ECG time-series data, with real-world scanning, imaging, and physical artifacts. We used ECG-Image-Kit, an open-source Python toolkit, to generate realistic images of 12-lead ECG printouts from raw ECG time-series. The images include realistic distortions such as noise, wrinkles, stains, and perspective shifts, generated both digitally and physically. The toolkit was applied to 977 12-lead ECG records from the PTB-XL database and 1,000 from Emory Healthcare to create high-fidelity synthetic ECG images. These unique images were subjected to both programmatic distortions using ECG-Image-Kit and physical effects like soaking, staining, and mold growth, followed by scanning and photography under various lighting conditions to create real-world artifacts. The resulting dataset includes 35,595 software-labeled ECG images with a wide range of imaging artifacts and distortions. The dataset provides ground truth time-series data alongside the images, offering a reference for developing machine and deep learning models for ECG digitization and classification. The images vary in quality, from clear scans of clean papers to noisy photographs of degraded papers, enabling the development of more generalizable digitization algorithms. ECG-Image-Database addresses a critical need for digitizing paper-based and non-digital ECGs for computerized analysis, providing a foundation for developing robust machine and deep learning models capable of converting ECG images into time-series. The dataset aims to serve as a reference for ECG digitization and computerized annotation efforts. ECG-Image-Database was used in the PhysioNet Challenge 2024 on ECG image digitization and classification.


Memory-Immersed Collaborative Digitization for Area-Efficient Compute-in-Memory Deep Learning

Nasrin, Shamma, Hashem, Maeesha Binte, Darabi, Nastaran, Parpillon, Benjamin, Fahim, Farah, Gomes, Wilfred, Trivedi, Amit Ranjan

arXiv.org Artificial Intelligence

This work discusses memory-immersed collaborative digitization among compute-in-memory (CiM) arrays to minimize the area overheads of a conventional analog-to-digital converter (ADC) for deep learning inference. Thereby, using the proposed scheme, significantly more CiM arrays can be accommodated within limited footprint designs to improve parallelism and minimize external memory accesses. Under the digitization scheme, CiM arrays exploit their parasitic bit lines to form a within-memory capacitive digital-to-analog converter (DAC) that facilitates area-efficient successive approximation (SA) digitization. CiM arrays collaborate where a proximal array digitizes the analog-domain product-sums when an array computes the scalar product of input and weights. We discuss various networking configurations among CiM arrays where Flash, SA, and their hybrid digitization steps can be efficiently implemented using the proposed memory-immersed scheme. The results are demonstrated using a 65 nm CMOS test chip. Compared to a 40 nm-node 5-bit SAR ADC, our 65 nm design requires $\sim$25$\times$ less area and $\sim$1.4$\times$ less energy by leveraging in-memory computing structures. Compared to a 40 nm-node 5-bit Flash ADC, our design requires $\sim$51$\times$ less area and $\sim$13$\times$ less energy.


Case Study: Embedded Computer Solutions for Seaport Automation

#artificialintelligence

Neousys explains how its range of embedded computer solutions can meet the demands of the seaport automation industry. Neousys products are available in various form factors, from compact to extreme compact, and with options for GPU-aided or non GPU-aided computation. Standard features include compatibility with a range of power and temperature requirements, as well as IP67 waterproofing. For ultra-challenging environments, Neousys also offers extreme rugged computers made from monoblock stainless steel chassis. Read the below case study to find out more.