mobilenetv3
- Oceania > Australia (0.04)
- North America > United States (0.04)
- Asia > China (0.04)
Design and Structural Validation of a Micro-UAV with On-Board Dynamic Route Planning
Ravikumar, Inbazhagan, Sundhar, Ram, Vijayakumar, Narendhiran
Micro aerial vehicles are becoming increasingly important in search and rescue operations due to their agility, speed, and ability to access confined spaces or hazardous areas. However, designing lightweight aerial systems presents significant structural, aerodynamic, and computational challenges. This work addresses two key limitations in many low-cost aerial systems under two kilograms: their lack of structural durability during flight through rough terrain and their inability to replan paths dynamically when new victims or obstacles are detected. We present a fully customised drone built from scratch using only commonly available components and materials, emphasising modularity, low cost, and ease of assembly. The structural frame is reinforced with lightweight yet durable materials to withstand impact, while the onboard control system is powered entirely by free, open-source software solutions. The proposed system demonstrates real-time perception and adaptive navigation capabilities without relying on expensive hardware accelerators, offering an affordable and practical solution for real-world search and rescue missions.
- North America > United States > New York (0.05)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
- Asia > India (0.04)
- Information Technology (0.71)
- Aerospace & Defense (0.70)
- Transportation (0.48)
MECKD: Deep Learning-Based Fall Detection in Multilayer Mobile Edge Computing With Knowledge Distillation
Mao, Wei-Lung, Wang, Chun-Chi, Chou, Po-Heng, Liu, Kai-Chun, Tsao, Yu
The rising aging population has increased the importance of fall detection (FD) systems as an assistive technology, where deep learning techniques are widely applied to enhance accuracy. FD systems typically use edge devices (EDs) worn by individuals to collect real-time data, which are transmitted to a cloud center (CC) or processed locally. However, this architecture faces challenges such as a limited ED model size and data transmission latency to the CC. Mobile edge computing (MEC), which allows computations at MEC servers deployed between EDs and CC, has been explored to address these challenges. We propose a multilayer MEC (MLMEC) framework to balance accuracy and latency. The MLMEC splits the architecture into stations, each with a neural network model. If front-end equipment cannot detect falls reliably, data are transmitted to a station with more robust back-end computing. The knowledge distillation (KD) approach was employed to improve front-end detection accuracy by allowing high-power back-end stations to provide additional learning experiences, enhancing precision while reducing latency and processing loads. Simulation results demonstrate that the KD approach improved accuracy by 11.65% on the SisFall dataset and 2.78% on the FallAllD dataset. The MLMEC with KD also reduced the data latency rate by 54.15% on the FallAllD dataset and 46.67% on the SisFall dataset compared to the MLMEC without KD. In summary, the MLMEC FD system exhibits improved accuracy and reduced latency.
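The escalation rule described above (hand the data to a stronger station when the front end is not reliable enough) can be sketched as a confidence-gated cascade. This is a minimal illustration, not the authors' implementation: the `StationModel` type, the `cascade_predict` function, the 0.9 confidence threshold, and the toy models are all assumptions made for the example.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: a station model maps a sensor window to P(fall) in [0, 1].
StationModel = Callable[[Sequence[float]], float]


def cascade_predict(
    window: Sequence[float],
    stations: Sequence[StationModel],
    threshold: float = 0.9,  # assumed confidence threshold, not from the paper
) -> Tuple[bool, int]:
    """Return (fall_detected, index of the station that made the decision).

    Stations are ordered from the lightest front-end model to the most
    capable back-end model. A station decides only if it is confident;
    otherwise the window is passed on. The last station always decides.
    """
    if not stations:
        raise ValueError("at least one station model is required")
    for idx, model in enumerate(stations):
        p_fall = model(window)
        confidence = max(p_fall, 1.0 - p_fall)
        is_last = idx == len(stations) - 1
        if confidence >= threshold or is_last:
            return p_fall >= 0.5, idx
    raise AssertionError("unreachable: the last station always decides")


# Toy stand-ins for trained networks (assumptions for illustration only).
def front_end(window: Sequence[float]) -> float:
    return 0.6  # always unsure, so it always escalates


def back_end(window: Sequence[float]) -> float:
    return 0.97 if max(window) > 2.0 else 0.02


decision = cascade_predict([0.1, 3.2, 0.4], [front_end, back_end])
```

Under this reading, a better front-end model (for example, one improved by distillation) clears the threshold more often, so fewer windows travel to the back end.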
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- Asia > Taiwan > Taiwan Province > Taipei (0.05)
- North America > United States > Virginia (0.04)
- (4 more...)
- Information Technology (1.00)
- Health & Medicine (1.00)
- Education (1.00)
Stimulative Training of Residual Networks: A Social Psychology Perspective of Loafing
Ye, Peng
As shown in Fig. r1, we can see that stimulative training can always improve [...]. We further verify it on various residual networks and benchmark datasets. MobileNetV3 is a single-branch structure. We show the trajectory of training loss and test accuracy when applying stimulative and common training in Fig. r2. In addition, as shown in Fig. r3, the optimal balance coefficients for MobileNetV3 on CIFAR10, MobileNetV3 on CIFAR100, and ResNet50 on CIFAR100 are 5, 10, and 10, respectively. The respective training settings are detailed as follows.
- Oceania > Australia (0.04)
- North America > United States (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Oceania > Australia (0.14)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States (0.04)
Novel Multicolumn Kernel Extreme Learning Machine for Food Detection via Optimal Features from CNN
Tahir, Ghalib Ahmed, Loo, Chu Kiong
Automatic food detection is an emerging topic of interest due to its wide array of applications, ranging from detecting food images on social media platforms to filtering non-food photos from users in dietary assessment apps. Recently, during the COVID-19 pandemic, it has facilitated enforcing an eating ban by automatically detecting eating activities from cameras in public places. Therefore, to tackle the challenge of recognizing food images with high accuracy, we proposed a hybrid framework for extracting and selecting optimal features from an efficient neural network. A nonlinear classifier is then employed to discriminate between linearly inseparable feature vectors with great precision. In line with this idea, our method extracts features from MobileNetV3, selects an optimal subset of attributes by using SHapley Additive exPlanations (SHAP) values, and exploits the kernel extreme learning machine (KELM) due to its nonlinear decision boundary and good generalization ability. However, KELM suffers from the 'curse of dimensionality' problem for large datasets due to the complex computation of the kernel matrix with large numbers of hidden nodes. We solved this problem by proposing a novel multicolumn kernel extreme learning machine (MCK-ELM), which exploits the k-d tree algorithm to divide data into N subsets and trains a separate KELM on each subset of data. Then, the method incorporates the KELM classifiers into parallel structures and, during testing, selects the top k nearest subsets by using the k-d tree search to classify the input instead of using the whole network. Experimental results showed the superiority of our method on an integrated set of measures while solving the 'curse of dimensionality' problem in KELM for large datasets.
Keywords: Multicolumn, Kernel Extreme Learning Machine, MobileNet, Food Detection, Explainable AI, SHAP
1. Introduction
Applications of automatic food image detection include visual-based dietary assessment and detecting eating activities from wearable camera photos. The visual-based dietary assessment method reduces the burden of manually collecting food data by helping users refresh their memory using food pictures of their previous dietary intake. Filtering non-food images from users is an essential step in these mHealth apps to ensure that relevant images are analyzed.
- Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)
- North America > United States (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Hybrid Knowledge Transfer through Attention and Logit Distillation for On-Device Vision Systems in Agricultural IoT
Mugisha, Stanley, Kisitu, Rashid, Tushabe, Florence
Integrating deep learning applications into agricultural IoT systems faces a serious challenge of balancing the high accuracy of Vision Transformers (ViTs) with the efficiency demands of resource-constrained edge devices. Large transformer models like the Swin Transformers excel in plant disease classification by capturing global-local dependencies. Lightweight models such as MobileNetV3 and TinyML would be suitable for on-device inference but lack the required spatial reasoning for fine-grained disease detection. To bridge this gap, we propose a hybrid knowledge distillation framework that synergistically transfers logit and attention knowledge from a Swin Transformer teacher to a MobileNetV3 student model. Our method includes the introduction of adaptive attention alignment to resolve cross-architecture mismatch (resolution, channels) and a dual-loss function optimizing both class probabilities and spatial focus. On the PlantVillage-Tomato dataset (18,160 images), the distilled MobileNetV3 attains 92.4% accuracy relative to 95.9% for Swin-L but at a 95% reduction on PC and 82% in inference latency on IoT devices. Key innovations include IoT-centric validation metrics (13 MB memory, 0.22 GFLOPs) and dynamic resolution-matching attention maps. Comparative experiments show significant improvements over standalone CNNs and prior distillation methods, with a 3.5% accuracy gain over MobileNetV3 baselines. Significantly, this work advances real-time, energy-efficient crop monitoring in precision agriculture and demonstrates how we can attain ViT-level diagnostic precision on edge devices. Code and models will be made available for replication after acceptance. The integration of artificial intelligence (AI) into agricultural vision systems has revolutionized precision farming, enabling real-time crop disease detection, yield prediction, and resource optimization [1]-[4].
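A dual loss of the kind the abstract describes, combining a logit term with an attention term, can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' formulation: the temperature-softened KL term is the standard logit-distillation form, while nearest-neighbour resizing, sum-to-one normalisation, the mean-squared attention term, and the `temperature`, `alpha`, and `beta` values are all assumptions chosen for the example. Attention maps are assumed to be non-negative with a non-zero sum.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-softened softmax, shifted by the max for stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def resize_nearest(att: Matrix, height: int, width: int) -> Matrix:
    """Nearest-neighbour resize so teacher and student maps share a resolution."""
    src_h, src_w = len(att), len(att[0])
    return [
        [att[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def normalize(att: Matrix) -> Matrix:
    """Scale a non-negative map so that its entries sum to 1."""
    total = sum(sum(row) for row in att)
    return [[v / total for v in row] for row in att]


def attention_mse(teacher_att: Matrix, student_att: Matrix) -> float:
    """Mean squared error between normalised maps at the student's resolution."""
    h, w = len(student_att), len(student_att[0])
    t = normalize(resize_nearest(teacher_att, h, w))
    s = normalize(student_att)
    return sum((t[r][c] - s[r][c]) ** 2 for r in range(h) for c in range(w)) / (h * w)


def hybrid_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    teacher_att: Matrix,
    student_att: Matrix,
    temperature: float = 4.0,  # assumed value
    alpha: float = 0.5,        # assumed weight of the logit term
    beta: float = 0.5,         # assumed weight of the attention term
) -> float:
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    logit_term = temperature ** 2 * kl_divergence(p_teacher, p_student)
    return alpha * logit_term + beta * attention_mse(teacher_att, student_att)
```

In a real training loop the same two terms would be computed on framework tensors so that gradients flow to the student; the pure-Python form here only shows how the resolution mismatch is handled before the maps are compared.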
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- Europe > Netherlands > Groningen (0.04)
- Europe > Italy (0.04)
- (5 more...)
- Information Technology (1.00)
- Food & Agriculture > Agriculture (1.00)
- Education (1.00)
Time Frequency Analysis of EMG Signal for Gesture Recognition using Fine grained Features
Aarotale, Parshuram N., Rattani, Ajita
Electromyography (EMG)-based hand gesture recognition converts forearm muscle activity into control commands for prosthetics, rehabilitation, and human-computer interaction. This paper proposes a novel approach to EMG-based hand gesture recognition that uses fine-grained classification and presents XMANet, which unifies low-level local and high-level semantic cues through cross-layer mutual attention among shallow-to-deep CNN experts. Using stacked spectrograms and scalograms derived from the Short Time Fourier Transform (STFT) and Wavelet Transform (WT), we benchmark XMANet against ResNet50, DenseNet121, MobileNetV3, and EfficientNetB0. Experimental results on the Grabmyo dataset indicate that, using STFT, the proposed XMANet model outperforms the baseline ResNet50, EfficientNetB0, MobileNetV3, and DenseNet121 models with improvements of approximately 1.72%, 4.38%, 5.10%, and 2.53%, respectively. When employing the WT approach, improvements of around 1.57%, 1.88%, 1.46%, and 2.05% are observed over the same baselines. Similarly, on the FORS EMG dataset, the XMANet(ResNet50) model using STFT shows an improvement of about 5.04% over the baseline ResNet50. In comparison, the XMANet(DenseNet121) and XMANet(MobileNetV3) models yield enhancements of approximately 4.11% and 2.81%, respectively. Moreover, when using WT, the proposed XMANet achieves gains of around 4.26%, 9.36%, 5.72%, and 6.09% over the baseline ResNet50, DenseNet121, MobileNetV3, and EfficientNetB0 models, respectively. These results confirm that XMANet consistently improves performance across various architectures and signal processing techniques, demonstrating the strong potential of fine-grained features for accurate and robust EMG classification.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Kansas (0.04)
- Europe > Italy (0.04)
Revolutionizing Communication with Deep Learning and XAI for Enhanced Arabic Sign Language Recognition
Balat, Mazen, Awaad, Rewaa, Zaky, Ahmed B., Aly, Salah A.
This study introduces an integrated approach to recognizing Arabic Sign Language (ArSL) using state-of-the-art deep learning models such as MobileNetV3, ResNet50, and EfficientNet-B2. These models are further enhanced by explainable AI (XAI) techniques to boost interpretability. The ArSL2018 and RGB Arabic Alphabets Sign Language (AASL) datasets are employed, with EfficientNet-B2 achieving peak accuracies of 99.48% and 98.99%, respectively. Key innovations include sophisticated data augmentation methods to mitigate class imbalance, implementation of stratified 5-fold cross-validation for better generalization, and the use of Grad-CAM for clear model decision transparency. The proposed system not only sets new benchmarks in recognition accuracy but also emphasizes interpretability, making it suitable for applications in healthcare, education, and inclusive communication technologies.
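Stratified 5-fold cross-validation, mentioned above, keeps each class's share roughly equal across folds, which matters when classes are imbalanced. The following is a minimal standard-library sketch of the splitting step only; the `stratified_k_fold` function, the fixed seed, and the toy labels are assumptions for illustration and are not taken from the study.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_k_fold(
    labels: Sequence[Hashable], k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs.

    Indices of each class are shuffled and dealt round-robin into k folds,
    so every fold receives a near-equal share of every class.
    """
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(by_class, key=str):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    splits = []
    for held_out in range(k):
        validation = sorted(folds[held_out])
        train = sorted(
            index for f in range(k) if f != held_out for index in folds[f]
        )
        splits.append((train, validation))
    return splits


# Toy, imbalanced label list (hypothetical class names).
toy_labels = ["alef"] * 10 + ["ba"] * 5
toy_splits = stratified_k_fold(toy_labels, k=5)
```

Each validation fold in the toy example holds two "alef" samples and one "ba" sample, mirroring the 2:1 class ratio of the full list.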
- Asia > Middle East > Saudi Arabia > Eastern Province > Khobar (0.14)
- Asia > Middle East > Kuwait > Ahmadi Governorate > Al Ahmadi (0.04)
- Africa > Middle East > Egypt > Giza Governorate > Giza (0.04)
- Africa > Middle East > Egypt > Alexandria Governorate > Alexandria (0.04)
- Health & Medicine (1.00)
- Education > Curriculum > Subject-Specific Education (0.97)
Scalable Forward-Forward Algorithm
We propose a scalable Forward-Forward (FF) algorithm that eliminates the need for backpropagation by training each layer separately. Unlike backpropagation, FF avoids backward gradients and can be more modular and memory efficient, making it appealing for large networks. We extend FF to modern convolutional architectures, such as MobileNetV3 and ResNet18, by introducing a new way to compute losses for convolutional layers. Experiments show that our method achieves performance comparable to standard backpropagation. Furthermore, when we divide the network into blocks, such as the residual blocks in ResNet, and apply backpropagation only within each block, but not across blocks, our hybrid design tends to outperform backpropagation baselines while maintaining a similar training speed. Finally, we present experiments on small datasets and transfer learning that confirm the adaptability of our method.
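For context, the per-layer objective of the original Forward-Forward formulation for a dense layer can be sketched as follows: each layer is trained on its own to give high "goodness" (sum of squared activations) to positive samples and low goodness to negative samples. This sketch shows only that baseline idea; it is not the new convolutional loss the abstract refers to, which is not described here. The function names and the threshold `theta = 2.0` are assumptions for illustration.

```python
import math
from typing import Sequence


def goodness(activations: Sequence[float]) -> float:
    """Goodness of a layer's output: the sum of squared activations."""
    return sum(a * a for a in activations)


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def ff_layer_loss(
    pos_activations: Sequence[float],
    neg_activations: Sequence[float],
    theta: float = 2.0,  # assumed goodness threshold
) -> float:
    """Local loss for one layer, with no gradient flowing from other layers.

    The loss is small when positive goodness is above theta and negative
    goodness is below theta.
    """
    pos_term = softplus(theta - goodness(pos_activations))
    neg_term = softplus(goodness(neg_activations) - theta)
    return pos_term + neg_term


# A layer that separates the two kinds of input well gets a lower loss.
well_separated = ff_layer_loss([2.0, 2.0], [0.1, 0.1])
poorly_separated = ff_layer_loss([0.1, 0.1], [2.0, 2.0])
```

Because the loss depends only on one layer's activations, each layer (or each block, in the hybrid variant described above) can be updated without a backward pass through the rest of the network.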