AITopics | cbam

Collaborating Authors

cbam

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Moving object detection from multi-depth images with an attention-enhanced CNN

Shibukawa, Masato, Yoshida, Fumi, Yanagisawa, Toshifumi, Ito, Takashi, Kurosaki, Hirohisa, Yoshikawa, Makoto, Kamiya, Kohki, Jiang, Ji-an, Fraser, Wesley, Kavelaars, JJ, Benecchi, Susan, Verbiscer, Anne, Hatakeyama, Akira, O, Hosei, Ozaki, Naoya

arXiv.org Artificial Intelligence, Dec-8-2025

One of the greatest challenges in detecting moving objects in the solar system from wide-field survey data is determining whether a signal indicates a true object or is due to some other source, such as noise. Object verification has relied heavily on human eyes, which usually results in significant labor costs. To address this limitation and reduce the reliance on manual intervention, we propose a multi-input convolutional neural network integrated with a convolutional block attention module. This method is specifically tailored to enhance the moving object detection system that we have developed and used previously. The current method introduces two innovations. The first is a multi-input architecture that processes multiple stacked images simultaneously. The second is the incorporation of the convolutional block attention module, which enables the model to focus on essential features in both spatial and channel dimensions. These advancements facilitate efficient learning from multiple inputs, leading to more robust detection of moving objects. The performance of the model is evaluated on a dataset consisting of approximately 2,000 observational images. We achieved an accuracy of nearly 99% with an AUC (area under the curve) of >0.99. These metrics indicate that the proposed model achieves excellent classification performance. By adjusting the threshold for object detection, the new model reduces the human workload by more than 99% compared to manual verification.
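The threshold trade-off mentioned at the end of this abstract can be illustrated with a small sketch. The scores, labels, and the function name `review_workload` are made up for illustration and are not taken from the paper. The sketch only shows how raising a score threshold shrinks the set of candidates a human must inspect, and how many true objects survive the cut.

```python
from typing import List, Tuple


def review_workload(scores: List[float], labels: List[int],
                    threshold: float) -> Tuple[float, float]:
    """Return (workload_reduction, recall) for a given score threshold.

    Candidates scoring below the threshold are rejected automatically;
    the rest are passed on to a human reviewer.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    kept = [(s, y) for s, y in zip(scores, labels) if s >= threshold]
    workload_reduction = 1.0 - len(kept) / len(scores)
    true_total = sum(labels)
    true_kept = sum(y for _, y in kept)
    recall = true_kept / true_total if true_total else 1.0
    return workload_reduction, recall


# Hypothetical classifier outputs: 1 = real moving object, 0 = noise.
scores = [0.98, 0.91, 0.40, 0.05, 0.03, 0.02, 0.01, 0.01]
labels = [1, 1, 0, 0, 0, 0, 0, 0]

for t in (0.1, 0.5, 0.95):
    reduction, recall = review_workload(scores, labels, t)
    print(f"threshold={t:.2f} reduction={reduction:.3f} recall={recall:.2f}")
```

In this toy data a threshold of 0.5 removes 75% of the candidates while keeping both true objects, whereas 0.95 removes more but loses one of them. Choosing the operating point is a judgment about how much recall can be sacrificed.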

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2512.05415

Country:

Asia (0.96)
North America > United States (0.93)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


PECL: A Heterogeneous Parallel Multi-Domain Network for Radar-Based Human Activity Recognition

Yan, Jiuqi, Xu, Chendong, Liu, Dongyu

arXiv.org Artificial Intelligence, Nov-10-2025

Radar systems are increasingly favored for medical applications because they provide non-intrusive monitoring with high privacy and robustness to lighting conditions. However, existing research typically relies on single-domain radar signals and overlooks the temporal dependencies inherent in human activity, which complicates the classification of similar actions. PECL combines a channel-spatial attention module and temporal units to capture more features and dynamic dependencies during action sequences, improving both accuracy and robustness. The experimental results show that PECL achieves an accuracy of 96.16% on the same dataset, outperforming existing methods by at least 4.78%. PECL also performs best in distinguishing between easily confused actions. Despite its strong performance, PECL maintains moderate model complexity, with 23.42M parameters and 1324.82M [...]. Its parameter-efficient design further reduces computational cost. Human activity recognition (HAR) has long been an active research area. With the acceleration of population aging, demand for HAR technology is growing in both hospitals and households [1][2][3].

accuracy, artificial intelligence, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2511.05039

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Enhancing Underwater Object Detection through Spatio-Temporal Analysis and Spatial Attention Networks

Karri, Sai Likhith, Saxena, Ansh

arXiv.org Artificial Intelligence, Oct-31-2025

This study examines the effectiveness of spatio-temporal modeling and the integration of spatial attention mechanisms in deep learning models for underwater object detection. In the first phase, the performance of the temporally enhanced YOLOv5 variant, T-YOLOv5, is evaluated in comparison with the standard YOLOv5. In the second phase, an augmented version of T-YOLOv5 is developed by adding a Convolutional Block Attention Module (CBAM). By examining the effectiveness of the pre-existing YOLOv5 and T-YOLOv5 models and of the newly developed T-YOLOv5 with CBAM, the research highlights how temporal modeling improves detection accuracy in dynamic marine environments, particularly under conditions of sudden movements, partial occlusions, and gradual motion. The testing results showed that YOLOv5 achieved a mAP@50-95 of 0.563, while T-YOLOv5 and T-YOLOv5 with CBAM outperformed it with mAP@50-95 scores of 0.813 and 0.811, respectively, highlighting their superior accuracy and generalization in detecting complex objects. The findings demonstrate that T-YOLOv5 significantly enhances detection reliability compared to the standard model, while T-YOLOv5 with CBAM further improves performance in challenging scenarios, although it loses some accuracy in simpler scenarios.
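For readers unfamiliar with the mAP@50-95 figure quoted above: under the widely used COCO-style convention, average precision is computed at ten intersection-over-union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05 and then averaged. The sketch below is not a full mAP implementation and is not taken from the paper. It only shows the IoU computation and the threshold grid, using made-up boxes and hypothetical helper names.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds() -> List[float]:
    """The ten IoU thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred: Box, truth: Box) -> int:
    """Count how many of the ten thresholds one prediction satisfies."""
    overlap = iou(pred, truth)
    return sum(1 for t in iou_thresholds() if overlap >= t)


# Hypothetical boxes: a 10x10 ground truth and a prediction shifted by 1 unit.
truth = (0.0, 0.0, 10.0, 10.0)
pred = (1.0, 0.0, 11.0, 10.0)
print(round(iou(pred, truth), 4), thresholds_passed(pred, truth))
```

A prediction that counts as a hit at IoU 0.50 can still be a miss at 0.90, which is why mAP@50-95 is a stricter summary of localization quality than mAP@50 alone.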

artificial intelligence, detection, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.25797

Genre: Research Report > New Finding (0.74)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.91)


Fair comparison and ablation study

Neural Information Processing Systems, Oct-2-2025, 02:12:05 GMT

The results on CIFAR10 are listed in Table R1. They reveal that HOGA searched by AutoLA (k=4) still outperforms SE and CBAM by a large margin. We further customized SE and CBAM using the group split operation (denoted by "HOG"), resulting in a specific [...] The HOGA searched by AutoLA outperforms its randomly searched counterparts (denoted by "Rand"). We tested the generalization ability of HOGA searched on ResNet56 (denoted by "AutoLA_56") [...] WiderResNet, indicating the consistent superiority of the HOGA searched by AutoLA over previous attention methods. We also compared AutoLA with SE and CBAM on a larger backbone (e.g., [...]). The results in Table R3 suggest that AutoLA still outperforms other attention modules.

artificial intelligence, cbam, machine learning, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)


Handling imbalance and few-sample size in ML based Onion disease classification

Pal, Abhijeet Manoj, Velmurugan, Rajbabu

arXiv.org Artificial Intelligence, Sep-9-2025

Accurate classification of pests and diseases plays a vital role in precision agriculture, enabling efficient identification and targeted interventions and preventing further spread. However, current methods primarily focus on binary classification, which limits their practical applications, especially in scenarios where accurately identifying the specific type of disease or pest is essential. We propose a robust deep-learning-based model for multi-class classification of onion crop diseases and pests. We enhance a pre-trained Convolutional Neural Network (CNN) model by integrating attention-based modules and employing a comprehensive data augmentation pipeline to mitigate class imbalance. The proposed model achieves 96.90% overall accuracy and a 0.96 F1 score on a real-world field image dataset. This model gives better results than other approaches using the same datasets.
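Because this abstract reports both accuracy and an F1 score on imbalanced data, a small sketch may help show why the two can diverge. The abstract does not say whether the reported F1 is macro-averaged or weighted, so the macro average used here is an assumption, and the labels and function names are hypothetical.

```python
from typing import Dict, List


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """F1 score of each class, treating that class as the positive label."""
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced test set: 8 "healthy" images and 2 "diseased" ones,
# scored against a degenerate model that always predicts the majority class.
y_true = ["healthy"] * 8 + ["diseased"] * 2
y_pred = ["healthy"] * 10

print(f"accuracy={accuracy(y_true, y_pred):.2f} macro_f1={macro_f1(y_true, y_pred):.2f}")
```

The always-majority model looks acceptable by accuracy but poor by macro F1, which is one reason multi-class results on imbalanced data are usually reported with both.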

artificial intelligence, classification, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.05341

Country: Asia > India (0.15)

Genre: Research Report (0.41)

Industry: Food & Agriculture > Agriculture (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


CBAM-STN-TPS-YOLO: Enhancing Agricultural Object Detection through Spatially Adaptive Attention Mechanisms

Praveen, Satvik, Jung, Yoonsung

arXiv.org Artificial Intelligence, Jun-10-2025

Object detection is vital in precision agriculture for plant monitoring, disease detection, and yield estimation. However, models like YOLO struggle with occlusions, irregular structures, and background noise, reducing detection accuracy. While Spatial Transformer Networks (STNs) improve spatial invariance through learned transformations, affine mappings are insufficient for non-rigid deformations such as bent leaves and overlaps. We propose CBAM-STN-TPS-YOLO, a model integrating Thin-Plate Splines (TPS) into STNs for flexible, non-rigid spatial transformations that better align features. Performance is further enhanced by the Convolutional Block Attention Module (CBAM), which suppresses background noise and emphasizes relevant spatial and channel-wise features. On the occlusion-heavy Plant Growth and Phenotyping (PGP) dataset, our model outperforms STN-YOLO in precision, recall, and mAP. It achieves a 12% reduction in false positives, highlighting the benefits of improved spatial flexibility and attention-guided refinement. We also examine the impact of the TPS regularization parameter in balancing transformation smoothness and detection performance. This lightweight model improves spatial awareness and supports real-time edge deployment, making it ideal for smart farming applications requiring accurate and efficient monitoring.
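Several entries on this page rely on the Convolutional Block Attention Module, so a compact sketch of its commonly described layout may be useful: channel attention built from average- and max-pooled descriptors passed through a shared two-layer perceptron, followed by spatial attention built from a convolution over channel-wise average and max maps. This is a dependency-free illustration with random, untrained weights and a 3x3 kernel chosen for brevity (the original CBAM design uses a larger 7x7 kernel). It does not reproduce how any paper listed here integrates the module, and all names are hypothetical.

```python
import math
import random
from typing import List

Tensor3 = List[List[List[float]]]  # indexed as [channel][row][col]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def mlp(vec, w0, w1):
    """Shared two-layer perceptron: ReLU(w0 @ vec), then w1 @ hidden."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w0]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w1]


def channel_attention(x: Tensor3, w0, w1) -> List[float]:
    """One weight in (0, 1) per channel from avg- and max-pooled descriptors."""
    flat = [[v for row in ch for v in row] for ch in x]
    avg = [sum(f) / len(f) for f in flat]
    mx = [max(f) for f in flat]
    return [sigmoid(a + m) for a, m in zip(mlp(avg, w0, w1), mlp(mx, w0, w1))]


def spatial_attention(x: Tensor3, kernel) -> List[List[float]]:
    """One weight in (0, 1) per pixel; kernel is [2][k][k], zero padding."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernel[0])
    pad = k // 2
    avg = [[sum(x[ch][i][j] for ch in range(c)) / c for j in range(w)] for i in range(h)]
    mx = [[max(x[ch][i][j] for ch in range(c)) for j in range(w)] for i in range(h)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for p, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += kernel[p][di][dj] * plane[ii][jj]
            out[i][j] = sigmoid(acc)
    return out


def cbam(x: Tensor3, w0, w1, kernel) -> Tensor3:
    """Apply channel attention, then spatial attention, to a feature map."""
    mc = channel_attention(x, w0, w1)
    x1 = [[[mc[ch] * v for v in row] for row in x[ch]] for ch in range(len(x))]
    ms = spatial_attention(x1, kernel)
    return [[[ms[i][j] * x1[ch][i][j] for j in range(len(ms[0]))]
             for i in range(len(ms))] for ch in range(len(x1))]


# Hypothetical, untrained weights for a 4-channel 5x5 feature map.
random.seed(0)
C, H, W, R, K = 4, 5, 5, 2, 3  # channels, height, width, reduction, kernel size
x = [[[random.uniform(0, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w0 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // R)]
w1 = [[random.uniform(-1, 1) for _ in range(C // R)] for _ in range(C)]
kernel = [[[random.uniform(-1, 1) for _ in range(K)] for _ in range(K)] for _ in range(2)]
y = cbam(x, w0, w1, kernel)
print(len(y), len(y[0]), len(y[0][0]))  # output keeps the input shape
```

Both attention maps are sigmoid outputs, so the module can only rescale activations between zero and their original value. In a trained network, the learned weights decide which channels and locations are kept close to full strength.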

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2506.07357

Country: North America > United States > Texas (0.28)

Genre: Research Report > Experimental Study (0.68)

Industry: Food & Agriculture > Agriculture (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)


CBAM-EfficientNetV2 for Histopathology Image Classification using Transfer Learning and Dual Attention Mechanisms

Sengodan, Naren

arXiv.org Artificial Intelligence, Jan-5-2025

Breast cancer histopathology image classification is critical for early detection and improved patient outcomes. This study introduces a novel approach leveraging EfficientNetV2 models to improve feature extraction and focus on relevant tissue regions. The proposed models were evaluated on the BreakHis dataset across multiple magnification scales (40X, 100X, 200X, and 400X). Among them, the EfficientNetV2-XL with CBAM achieved outstanding performance, reaching a peak accuracy of 98.96 percent and an F1-score of 98.31 percent at 400X magnification, outperforming state-of-the-art methods. By integrating Contrast Limited Adaptive Histogram Equalization (CLAHE) for preprocessing and optimizing computational efficiency, this method demonstrates its suitability for real-time clinical deployment. The results underscore the potential of attention-enhanced scalable architectures in advancing diagnostic precision for breast cancer detection.

artificial intelligence, classification, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.22392

Country: Europe (0.28)

Genre:

Research Report > Promising Solution (0.69)
Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.72)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)


CNNtention: Can CNNs do better with Attention?

Kapila, Nikhil, Glattki, Julian, Rathi, Tejas

arXiv.org Artificial Intelligence, Dec-30-2024

Convolutional Neural Networks (CNNs) have long been the standard for image classification tasks, but more recently attention-based mechanisms have gained traction. This project aims to compare traditional CNNs with attention-augmented CNNs on an image classification task. By evaluating and comparing their accuracy and computational efficiency, the project will highlight the benefits and trade-offs of the localized feature extraction of traditional CNNs and the global context capture of attention-augmented CNNs. By doing this, we can reveal further insights into their respective strengths and weaknesses, guide the selection of models based on specific application needs, and ultimately enhance understanding of these architectures in the deep learning community. This was our final project for the CS7643 Deep Learning course at Georgia Tech.

artificial intelligence, cbam, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.11657

Genre: Research Report (0.64)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)


An Explainable Attention Model for Cervical Precancer Risk Classification using Colposcopic Images

Khare, Smith K., Booth, Berit Bargum, Blanes-Vidal, Victoria, Petersen, Lone Kjeld, Nadimi, Esmaeil S.

arXiv.org Artificial Intelligence, Nov-14-2024

Cervical cancer remains a major worldwide health issue, with early identification and risk assessment playing critical roles in effective preventive interventions. This paper presents the Cervix-AID-Net model for cervical precancer risk classification. The study designs and evaluates the proposed Cervix-AID-Net model based on patients' colposcopy images. The model comprises a Convolutional Block Attention Module (CBAM) and convolutional layers that extract interpretable and representative features of colposcopic images to distinguish high-risk and low-risk cervical precancer. In addition, the proposed Cervix-AID-Net model integrates four explainable techniques, namely gradient class activation maps, Local Interpretable Model-agnostic Explanations, CartoonX, and pixel rate distortion explanation based on output feature maps and input features. The evaluation using holdout and ten-fold cross-validation techniques yielded classification accuracies of 99.33% and 99.81%, respectively. The analysis revealed that CartoonX provides meticulous explanations for the decision of the Cervix-AID-Net model due to its ability to provide the relevant piece-wise smooth part of the image. The effect of Gaussian noise and blur on the input shows that the performance remains unchanged up to Gaussian noise of 3% and blur of 10%, and degrades thereafter. A comparison of the proposed model's performance with other deep learning approaches highlights the Cervix-AID-Net model's potential as a supplemental tool for increasing the effectiveness of cervical precancer risk assessment. The proposed method, which incorporates the CBAM and explainable artificial intelligence, has the potential to influence cervical cancer prevention and early detection, improving patient outcomes and lowering the worldwide burden of this preventable disease.
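The ten-fold cross-validation mentioned in this abstract can be sketched as an index-splitting routine. The abstract does not describe the exact protocol (for example, whether folds are stratified or grouped by patient), so the plain shuffled split below is only an assumption, and the function name `k_fold_indices` is hypothetical.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs over range(n_samples).

    Every sample lands in exactly one test fold; fold sizes differ by at
    most one when n_samples is not a multiple of k.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical dataset of 25 images split into 10 folds.
splits = k_fold_indices(25, 10)
for fold_id, (train, test) in enumerate(splits):
    print(fold_id, len(train), len(test))
```

A model would be trained on each `train` list and scored on the matching `test` list, with the ten scores averaged. For medical images, grouping all images of one patient into the same fold is a common precaution against leakage, which this plain split does not enforce.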

artificial intelligence, cervix-aid-net model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2411.09469

Country:

Europe > Denmark > Southern Denmark (0.04)
Europe > Sweden (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Cervical Cancer (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

