Goto

Collaborating Authors

 mobilenet








Efficient Automated Diagnosis of Retinopathy of Prematurity by Customize CNN Models

arXiv.org Artificial Intelligence

This paper encompasses an in-depth examination of Retinopathy of Prematurity (ROP) diagnosis, employing advanced deep learning methodologies. Our focus centers on refining and evaluating CNN-based approaches for precise and efficient ROP detection. We navigate the complexities of dataset curation, preprocessing strategies, and model architecture, aligning with research objectives encompassing model effectiveness, computational cost analysis, and time complexity assessment. Results underscore the supremacy of tailored CNN models over pre-trained counterparts, evident in heightened accuracy and F1-scores. Implementation of a voting system further enhances performance. Additionally, our study reveals the potential of the proposed customized CNN model to alleviate computational burdens associated with deep neural networks. Furthermore, we showcase the feasibility of deploying these models within dedicated software and hardware configurations, highlighting their utility as valuable diagnostic aids in clinical settings. In summary, our discourse significantly contributes to ROP diagnosis, unveiling the efficacy of deep learning models in enhancing diagnostic precision and efficiency.


Data Heterogeneity and Forgotten Labels in Split Federated Learning

arXiv.org Artificial Intelligence

In Split Federated Learning (SFL), the clients collaboratively train a model with the help of a server by splitting the model into two parts. Part-1 is trained locally at each client and aggregated by the aggregator at the end of each round. Part-2 is trained at a server that sequentially processes the intermediate activations received from each client. We study the phenomenon of catastrophic forgetting (CF) in SFL in the presence of data heterogeneity. In detail, due to the nature of SFL, local updates of part-1 may drift away from global optima, while part-2 is sensitive to the processing sequence, similar to forgetting in continual learning (CL). Specifically, we observe that the trained model performs better in classes (labels) seen at the end of the sequence. We investigate this phenomenon with emphasis on key aspects of SFL, such as the processing order at the server and the cut layer. Based on our findings, we propose Hydra, a novel mitigation method inspired by multi-head neural networks and adapted for the SFL's setting. Extensive numerical evaluations show that Hydra outperforms baselines and methods from the literature.


Evaluating the Energy Efficiency of NPU-Accelerated Machine Learning Inference on Embedded Microcontrollers

arXiv.org Artificial Intelligence

The deployment of machine learning (ML) models on microcontrollers (MCUs) is constrained by strict energy, latency, and memory requirements, particularly in battery - operated and real - time edge devices. While software - level optimizations such as quantizatio n and pruning reduce model size and computation, hardware acceleration has emerged as a decisive enabler for efficient embedded inference. This paper evaluates the impact of Neural Processing Units (NPUs) on MCU - based ML execution, using the ARM Cortex - M55 core combined with the Ethos - U55 NPU on the Alif Semiconductor Ensemble E7 development board as a representative platform. A rigorous measurement methodology was employed, incorporating per - inference net energy accounting via GPIO - triggered high - resolutio n digital multimeter synchronization and idle - state subtraction, ensuring accurate attribution of energy costs. Experimental results across six representative ML models -- including MiniResNet, MobileNetV2, FD - MobileNet, MNIST, TinyYolo, and SSD - MobileNet -- dem onstrate substantial efficiency gains when inference is offloaded to the NPU. For moderate to large networks, latency improvements ranged from 7 to over 125, with per - inference net energy reductions up to 143 . Notably, the NPU enabled execution of model s unsupported on CPU - only paths, such as SSD - MobileNet, highlighting its functional as well as efficiency advantages. These findings establish NPUs as a cornerstone of energy - aware embedded AI, enabling real - time, power - constrained ML inference at the MCU level.


Predictive Quality Assessment for Mobile Secure Graphics

arXiv.org Artificial Intelligence

The reliability of secure graphic verification, a key anti-counterfeiting tool, is undermined by poor image acquisition on smartphones. Uncontrolled user captures of these high-entropy patterns cause high false rejection rates, creating a significant'reliability gap'. T o bridge this gap, we depart from traditional perceptual IQA and introduce a framework that predictively estimates a frame's utility for the downstream verification task. W e propose a lightweight model to predict a quality score for a video frame, determining its suitability for a resource-intensive oracle model. Our framework is validated using re-contextualized FNMR and ISRR metrics on a large-scale dataset of 32,000+ images from 105 smartphones. Furthermore, a novel cross-domain analysis on graphics from different industrial printing presses reveals a key finding: a lightweight probe on a frozen, ImageNet-pretrained network generalizes better to an unseen printing technology than a fully fine-tuned model. This provides a key insight for real-world generalization: for domain shifts from physical manufacturing, a frozen general-purpose backbone can be more robust than full fine-tuning, which can overfit to source-domain artifacts.