Evaluating the Clinical Impact of Generative Inpainting on Bone Age Estimation
Matsuoka, Felipe Akio, Farina, Eduardo Moreno J. M., Serpa, Augusto Sarquis, Monteiro, Soraya, Ragazzini, Rodrigo, Abdala, Nitamar, Takahashi, Marcelo Straus, Kitamura, Felipe Campos
Generative foundation models can remove visual artifacts through realistic image inpainting, but their impact on medical AI performance remains uncertain. Pediatric hand radiographs often contain non-anatomical markers, and it is unclear whether inpainting these regions preserves features needed for bone age and gender prediction. To evaluate the clinical reliability of generative model-based inpainting for artifact removal, we used the RSNA Bone Age Challenge dataset, selecting 200 original radiographs and generating 600 inpainted versions with gpt-image-1 using natural language prompts to target non-anatomical artifacts. Downstream performance was assessed with deep learning ensembles for bone age estimation and gender classification, using mean absolute error (MAE) and area under the ROC curve (AUC) as metrics, and pixel intensity distributions to detect structural alterations. Inpainting markedly degraded model performance: bone age MAE increased from 6.26 to 30.11 months, and gender classification AUC decreased from 0.955 to 0.704. Inpainted images displayed pixel-intensity shifts and inconsistencies, indicating structural modifications not corrected by simple calibration. These findings show that, although visually realistic, foundation model-based inpainting can obscure subtle but clinically relevant features and introduce latent bias even when edits are confined to non-diagnostic regions, underscoring the need for rigorous, task-specific validation before integrating such generative tools into clinical AI workflows.
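The comparison described above is straightforward to reproduce in outline. A minimal sketch, assuming paired original/inpainted pixel arrays and predictions from the same frozen downstream ensemble; the function and variable names are illustrative, not the authors' code:

```python
# Hypothetical sketch: measure the pixel-intensity shift introduced by
# inpainting and score the same frozen models on both image sets.
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.metrics import mean_absolute_error, roc_auc_score

def intensity_shift(original: np.ndarray, inpainted: np.ndarray) -> float:
    """Earth mover's distance between the two pixel-intensity histograms."""
    return wasserstein_distance(original.ravel(), inpainted.ravel())

def downstream_metrics(age_true, age_pred, sex_true, sex_prob) -> dict:
    """Bone age MAE (months) and gender-classification AUC."""
    return {
        "bone_age_mae_months": mean_absolute_error(age_true, age_pred),
        "gender_auc": roc_auc_score(sex_true, sex_prob),
    }
```

Running downstream_metrics once on predictions for the 200 originals and once for the 600 inpainted versions yields the style of comparison reported above (6.26 vs. 30.11 months MAE; 0.955 vs. 0.704 AUC).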
Random Window Augmentations for Deep Learning Robustness in CT and Liver Tumor Segmentation
Østmo, Eirik A., Wickstrøm, Kristoffer K., Radiya, Keyur, Kampffmeyer, Michael C., Mikalsen, Karl Øyvind, Jenssen, Robert
Abstract--Contrast-enhanced Computed Tomography (CT) is important for diagnosis and treatment planning for various medical conditions. Deep learning (DL) based segmentation models may enable automated medical image analysis for detecting and delineating tumors in CT images, thereby reducing clinicians' workload. Achieving generalization capabilities in limited data domains, such as radiology, requires modern DL models to be trained with image augmentation. However, naively applying augmentation methods developed for natural images to CT scans often disregards the nature of the CT modality, where the intensities measure Hounsfield Units (HU) and have important physical meaning. This paper challenges the use of such intensity augmentations for CT imaging and shows that they may lead to artifacts and poor generalization. To mitigate this, we propose a CT-specific augmentation technique, called Random windowing, that exploits the available HU distribution of intensities in CT images. Random windowing encourages robustness to contrast-enhancement and significantly increases model performance on challenging images with poor contrast or timing. We perform ablations and analysis of our method on multiple datasets, and compare to, and outperform, state-of-the-art alternatives, while focusing on the challenge of liver tumor segmentation. Computed Tomography (CT) is a cornerstone in the diagnosis and treatment planning of various health conditions [1]. In liver applications, contrast-enhanced CT imaging enables precise imaging for detection and delineation of tumors, facilitating effective intervention strategies. With the rapid advancement of Deep Learning (DL), the utilization of computer vision (CV) models has become increasingly prevalent for automating tasks in radiology [2]-[5].
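As a rough illustration of what a window-based, HU-aware augmentation can look like, here is a minimal sketch, assuming the input slice is already in Hounsfield Units; the sampling ranges are illustrative assumptions, not the paper's exact Random windowing scheme:

```python
# Sketch of a random HU-windowing augmentation (assumed parameters).
import numpy as np

def random_window(hu_slice: np.ndarray,
                  center_range=(30.0, 100.0),   # window level range, HU
                  width_range=(150.0, 400.0),   # window width range, HU
                  rng=None) -> np.ndarray:
    """Apply a randomly sampled HU window and rescale to [0, 1]."""
    rng = rng or np.random.default_rng()
    c = rng.uniform(*center_range)
    w = rng.uniform(*width_range)
    lo, hi = c - w / 2.0, c + w / 2.0
    return (np.clip(hu_slice, lo, hi) - lo) / (hi - lo)
```

Because the clipping bounds are drawn in HU rather than in normalized intensity, the augmentation respects the physical meaning of CT values while still exposing the model to varying contrast conditions.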
High Performance Space Debris Tracking in Complex Skylight Backgrounds with a Large-Scale Dataset
Zhuang, Guohang, Song, Weixi, Huang, Jinyang, Yang, Chenwei, OuYang, Wanli, Lu, Yan
With the rapid development of space exploration, space debris has attracted more attention due to its potentially extreme threat, leading to the need for real-time and accurate debris tracking. However, existing methods are mainly based on traditional signal processing, which cannot effectively handle complex backgrounds and dense space debris. In this paper, we propose a deep learning-based Space Debris Tracking Network (SDT-Net) to achieve highly accurate debris tracking. SDT-Net effectively represents the features of debris, enhancing the efficiency and stability of end-to-end model learning. To train and evaluate this model effectively, we also produce a large-scale dataset, the Space Debris Tracking Dataset (SDTD), via a novel observation-based data simulation scheme. SDTD contains 18,040 video sequences with a total of 62,562 frames and covers 250,000 synthetic space debris objects. Extensive experiments validate the effectiveness of our model and the challenging nature of our dataset. Furthermore, we test our model on real data from the Antarctic Station, achieving a MOTA score of 73.2%, which demonstrates its strong transferability to real-world scenarios. Our dataset and code will be released soon.
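For reference, the MOTA score reported above follows the standard multi-object tracking accuracy definition. A minimal sketch with made-up per-frame counts:

```python
# MOTA = 1 - (FN + FP + IDSW) / GT, summed over all frames.
def mota(false_negatives, false_positives, id_switches, num_gt):
    errors = sum(false_negatives) + sum(false_positives) + sum(id_switches)
    return 1.0 - errors / sum(num_gt)

# Three illustrative frames with 10 ground-truth objects each:
print(mota([1, 0, 2], [0, 1, 0], [0, 0, 1], [10, 10, 10]))  # ~0.833
```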
Diabetic Retinopathy Detection Using CNN with Residual Block with DCGAN
Aronno, Debjany Ghosh, Saeha, Sumaiya
Diabetic Retinopathy (DR) is a major cause of blindness worldwide, caused by damage to the blood vessels in the retina due to diabetes. Early detection and classification of DR are crucial for timely intervention and preventing vision loss. This work proposes an automated system for DR detection using Convolutional Neural Networks (CNNs) with a residual block architecture, which enhances feature extraction and model performance. To further improve the model's robustness, we incorporate advanced data augmentation techniques, specifically leveraging a Deep Convolutional Generative Adversarial Network (DCGAN) for generating diverse retinal images. This approach increases the variability of training data, making the model more generalizable and capable of handling real-world variations in retinal images. The system is designed to classify retinal images into five distinct categories, from No DR to Proliferative DR, providing an efficient and scalable solution for early diagnosis and monitoring of DR progression. The proposed model aims to support healthcare professionals in large-scale DR screening, especially in resource-constrained settings.
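A minimal PyTorch sketch of the residual-block building unit described above; the channel count and layer arrangement are illustrative assumptions, not the authors' exact architecture:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with an identity skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)  # skip connection

block = ResidualBlock(64)
y = block(torch.randn(1, 64, 32, 32))  # shape preserved: (1, 64, 32, 32)
```

The skip connection lets gradients bypass the convolutional body, which is what makes deeper feature extractors trainable on limited retinal data; the DCGAN-generated images then enter the pipeline as additional training samples.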
SPACE-SUIT: An Artificial Intelligence based chromospheric feature extractor and classifier for SUIT
Seth, Pranava, Upendran, Vishal, Anand, Megha, Sarkar, Janmejoy, Roy, Soumya, Chaki, Priyadarshan, Chowdhury, Pratyay, Ghosh, Borishan, Tripathi, Durgesh
The Solar Ultraviolet Imaging Telescope (SUIT) onboard Aditya-L1 is an imager that observes the solar photosphere and chromosphere through observations in the wavelength range of 200-400 nm. A comprehensive understanding of the plasma and thermodynamic properties of chromospheric and photospheric morphological structures requires a large-sample statistical study, necessitating the development of automatic feature detection methods. To this end, we develop the feature detection algorithm SPACE-SUIT: Solar Phenomena Analysis and Classification using Enhanced vision techniques for SUIT, to detect and classify the solar chromospheric features to be observed from SUIT's Mg II k filter. Specifically, we target plage regions, sunspots, filaments, and off-limb structures. SPACE uses You Only Look Once (YOLO), a neural network-based model, to identify regions of interest. We train and validate SPACE using mock-SUIT images developed from Interface Region Imaging Spectrometer (IRIS) full-disk mosaic images in the Mg II k line, while we also perform detection on Level-1 SUIT data. SPACE achieves an approximate precision of 0.788, recall of 0.863, and mAP of 0.874 on the validation mock-SUIT FITS dataset. Given the manual labeling of our dataset, we perform "self-validation" by applying statistical measures and Tamura features to the ground-truth and predicted bounding boxes. We find that the distributions of entropy, contrast, dissimilarity, and energy show differences across the features. These differences are qualitatively captured by the detected regions predicted by SPACE and validated with the observed SUIT images, even in the absence of labeled ground truth. This work not only develops a chromospheric feature extractor but also demonstrates the effectiveness of statistical metrics and Tamura features for distinguishing chromospheric features, offering independent validation for future detection schemes.
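The texture statistics used for this "self-validation" can be computed with standard tools. A sketch of contrast, dissimilarity, and energy from a gray-level co-occurrence matrix, plus Shannon entropy; the distance, angle, and quantization choices here are assumptions, not necessarily those used by the authors:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from skimage.measure import shannon_entropy

def region_stats(patch: np.ndarray) -> dict:
    """patch: 2D uint8 crop taken from a predicted bounding box."""
    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, normed=True)
    return {
        "contrast": graycoprops(glcm, "contrast").mean(),
        "dissimilarity": graycoprops(glcm, "dissimilarity").mean(),
        "energy": graycoprops(glcm, "energy").mean(),
        "entropy": shannon_entropy(patch),
    }
```

Comparing the distributions of these statistics over ground-truth boxes versus predicted boxes gives a label-free consistency check of the kind described above.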
Recognizing Beam Profiles from Silicon Photonics Gratings using Transformer Model
Lim, Yu Dian, Li, Hong Yu, Goh, Simon Chun Kiat, Wang, Xiangyu, Zhao, Peng, Tan, Chuan Seng
Over the past decade, there has been extensive work in developing integrated silicon photonics (SiPh) gratings for the optical addressing of trapped-ion qubits in the ion trap quantum computing community. However, when viewing beam profiles from infrared (IR) cameras, it is often difficult to determine the corresponding heights at which the beam profiles are located. In this work, we developed transformer models to recognize the height categories of beam profiles of light from SiPh gratings. The model is trained using two input representations: (1) input patches, and (2) an input sequence. The model trained with input patches achieved a recognition accuracy of 0.938, while the model trained with an input sequence showed a lower accuracy of 0.895. However, when the model training was repeated for 150 cycles, the model trained with input patches showed inconsistent accuracies ranging from 0.445 to 0.959, while the model trained with an input sequence exhibited higher accuracy values between 0.789 and 0.936. The obtained outcomes can be extended to various applications, including auto-focusing of light beams and auto-adjustment of a z-axis stage to acquire desired beam profiles.
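The two input encodings compared above can be sketched for a single-channel beam-profile image as follows; the patch size and image size are illustrative assumptions:

```python
import torch

def to_patches(img: torch.Tensor, p: int = 16) -> torch.Tensor:
    """img: (1, H, W) -> (num_patches, p*p) token sequence (ViT-style)."""
    patches = img.unfold(1, p, p).unfold(2, p, p)  # (1, H/p, W/p, p, p)
    return patches.reshape(-1, p * p)

def to_row_sequence(img: torch.Tensor) -> torch.Tensor:
    """img: (1, H, W) -> (H, W): each image row becomes one token."""
    return img.squeeze(0)

img = torch.randn(1, 64, 64)
print(to_patches(img).shape)       # torch.Size([16, 256])
print(to_row_sequence(img).shape)  # torch.Size([64, 64])
```

Either token sequence can then be fed to a standard transformer encoder with a classification head over the height categories.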
Application of RESNET50 Convolution Neural Network for the Extraction of Optical Parameters in Scattering Media
Deng, Bowen, Zhang, Yihan, Parkes, Andrew, Bentley, Alex, Wright, Amanda, Pound, Michael, Somekh, Michael
Estimation of the optical properties of scattering media such as tissue is important in diagnostics as well as in the development of techniques to image deeper. As light penetrates the sample, scattering events alter the propagation direction of the photons in a random manner, leading to degradation of image quality. The distribution of the scattered light does, however, give a measure of optical properties such as the reduced scattering coefficient and the absorption coefficient. Unfortunately, inverting scattering patterns to recover the optical properties is not simple, especially in the regime where the light is partially randomized. Machine learning has been proposed by several authors as a means of recovering these properties from either the back-scattered or the transmitted light. In the present paper, we train a general-purpose convolutional neural network, RESNET50, with simulated data based on Monte Carlo simulations. We show that, compared with previous work, our approach gives comparable or better reconstruction accuracy while training on a much smaller dataset. Moreover, by training on multiple inputs, such as the intensity distribution at multiple planes or the exit angle and spatial distribution, one achieves improved performance compared to training on a single input such as the intensity distribution captured at the sample surface. While our approach gives good parameter reconstruction, we identify factors that limit the accuracy of the recovered properties, particularly the absorption coefficient. In light of these limitations, we suggest how the present approach may be enhanced for even better performance.
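A hedged sketch of how an off-the-shelf RESNET50 can be adapted to regress the two optical parameters from intensity distributions stacked at several planes; the number of input planes and the use of torchvision describe one plausible setup, not the authors' exact configuration:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50(weights=None)
# Stack intensity maps from four measurement planes as input channels.
model.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
# Two regression targets: reduced scattering and absorption coefficients.
model.fc = nn.Linear(model.fc.in_features, 2)

pred = model(torch.randn(1, 4, 224, 224))  # shape (1, 2)
```

Training then minimizes a regression loss (e.g., MSE) against the Monte Carlo ground-truth coefficients.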
Toward Automated Formation of Composite Micro-Structures Using Holographic Optical Tweezers
Zhang, Tommy, Werner, Nicole, Banerjee, Ashis G.
Holographic Optical Tweezers (HOT) are powerful tools that can manipulate micro- and nano-scale objects with high accuracy and precision. They are most commonly used for biological applications, such as cellular studies, and more recently, micro-structure assembly. Automation has been of significant interest in the HOT field, since human-run experiments are time-consuming and require skilled operators. Automated HOTs, however, commonly use point traps, which focus high-intensity laser light at specific spots in fluid media to attract and move micro-objects. In this paper, we develop a novel automated system for tweezing multiple micro-objects more efficiently using multiplexed optical traps. Multiplexed traps enable the simultaneous trapping of multiple beads in various alternate multiplexing formations, such as annular rings and line patterns. Our automated system is realized by augmenting the capabilities of a commercially available HOT with real-time bead detection and tracking, and wavefront-based path planning. We demonstrate the usefulness of the system by assembling two different composite micro-structures, comprising 5 $\mu m$ polystyrene beads, using both annular and line-shaped traps in obstacle-rich environments.
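As an illustration of the multiplexed trap geometries mentioned above, here is a small sketch generating target bead positions for an annular ring and a line; all coordinates, counts, and spacings are hypothetical:

```python
import numpy as np

def annular_trap(center, radius_um, n_beads):
    """Evenly spaced target positions on a ring (coordinates in microns)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_beads, endpoint=False)
    return np.stack([center[0] + radius_um * np.cos(theta),
                     center[1] + radius_um * np.sin(theta)], axis=1)

def line_trap(start, stop, n_beads):
    """Evenly spaced target positions along a line segment."""
    return np.linspace(np.asarray(start), np.asarray(stop), n_beads)

ring = annular_trap((0.0, 0.0), radius_um=10.0, n_beads=8)  # (8, 2)
line = line_trap((0.0, 0.0), (25.0, 0.0), n_beads=6)        # (6, 2)
```

In a full system, target positions like these would feed the hologram computation and the wavefront-based path planner.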
Compute-first optical detection for noise-resilient visual perception
Kim, Jungmin, Yu, Nanfang, Yu, Zongfu
In the context of visual perception, the optical signal from a scene is transferred into the electronic domain by detectors in the form of image data, which are then processed for the extraction of visual information. In noisy and weak-signal environments such as thermal imaging for night vision applications, however, the performance of neural computing tasks faces a significant bottleneck due to the inherent degradation of data quality upon noisy detection. Here, we propose a concept of optical signal processing before detection to address this issue. We demonstrate that spatially redistributing optical signals through a properly designed linear transformer can enhance the detection noise resilience of visual perception tasks, as benchmarked with the MNIST classification. Our idea is supported by a quantitative analysis detailing the relationship between signal concentration and noise robustness, as well as its practical implementation in an incoherent imaging system. This compute-first detection scheme can pave the way for advancing infrared machine vision technologies widely used for industrial and defense applications.
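The underlying intuition admits a toy numerical check: for fixed total optical power and fixed per-pixel detector noise, concentrating the signal onto fewer pixels raises the per-pixel SNR and stabilizes the detected total. All quantities below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
total_power, sigma = 100.0, 1.0  # total signal and per-pixel noise std

for n_pixels in (784, 64, 16):
    signal = np.full(n_pixels, total_power / n_pixels)
    detected = signal + rng.normal(0.0, sigma, n_pixels)
    per_pixel_snr = signal[0] / sigma
    rel_err = abs(detected.sum() - total_power) / total_power
    print(f"{n_pixels:4d} px  SNR/px {per_pixel_snr:6.2f}  "
          f"total-power error {rel_err:.3%}")
```

A linear optical transformer that redistributes light this way before the detector plays the same role for task-relevant features rather than raw power.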
Data-Driven Prediction of Seismic Intensity Distributions Featuring Hybrid Classification-Regression Models
Mizutani, Koyu, Mitarai, Haruki, Miyazaki, Kakeru, Kumano, Soichiro, Yamasaki, Toshihiko
Earthquakes are among the most immediate and deadly natural disasters that humans face. Accurately forecasting the extent of earthquake damage and assessing potential risks can be instrumental in saving numerous lives. In this study, we developed linear regression models capable of predicting seismic intensity distributions based on earthquake parameters: location, depth, and magnitude. Because the approach is completely data-driven, it can predict intensity distributions without geographical information. The dataset comprises seismic intensity data from earthquakes that occurred in the vicinity of Japan between 1997 and 2020, specifically 1,857 earthquakes with a magnitude of 5.0 or greater, sourced from the Japan Meteorological Agency. We trained both regression and classification models and combined them into a hybrid model that takes advantage of both. The proposed model outperformed commonly used Ground Motion Prediction Equations (GMPEs) in terms of the correlation coefficient, F1 score, and MCC. Furthermore, the proposed model can predict even abnormal seismic intensity distributions, a task at which conventional GMPEs often struggle.
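One plausible way to combine the two model types into a hybrid, sketched with scikit-learn on earthquake parameters (latitude, longitude, depth, magnitude); the rounding of intensities into classes and the 50/50 blending rule are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def fit_hybrid(X, y_intensity):
    """X: (n, 4) earthquake parameters; y_intensity: observed intensities."""
    reg = LinearRegression().fit(X, y_intensity)
    clf = LogisticRegression(max_iter=1000).fit(X, np.round(y_intensity))
    return reg, clf

def predict_hybrid(reg, clf, X):
    # Blend the continuous estimate with the most probable intensity class.
    return 0.5 * reg.predict(X) + 0.5 * clf.predict(X)
```

The classifier contributes discrete intensity levels while the regressor preserves fine-grained variation, which is the complementarity a hybrid design of this kind exploits.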