AITopics | real-time semantic segmentation

Collaborating Authors

real-time semantic segmentation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Real-Time Semantic Segmentation on FPGA for Autonomous Vehicles Using LMIINet with the CGRA4ML Framework

Hosseini, Amir Mohammad Khadem, Mirzakuchaki, Sattar

arXiv.org Artificial IntelligenceOct-28-2025

Semantic segmentation has emerged as a fundamental problem in computer vision, gaining particular importance in real-time applications such as autonomous driving. The main challenge is achieving high accuracy while operating under computational and hardware constraints. In this research, we present an FPGA-based implementation of real-time semantic segmentation leveraging the lightweight LMIINet architecture and the Coarse-Grained Reconfigurable Array for Machine Learning (CGRA4ML) hardware framework. The model was trained using Quantization-Aware Training (QAT) with 8-bit precision on the Cityscapes dataset, reducing memory footprint by a factor of four while enabling efficient fixed-point computations. Necessary modifications were applied to adapt the model to CGRA4ML constraints, including simplifying skip connections, employing hardware-friendly operations such as depthwise-separable and 1A-1 convolutions, and redesigning parts of the Flatten Transformer. Our implementation achieves approximately 90% pixel accuracy and 45% mean Intersection-over-Union (mIoU), operating in real-time at 20 frames per second (FPS) with 50.1 ms latency on the ZCU104 FPGA board. The results demonstrate the potential of CGRA4ML, with its flexibility in mapping modern layers and off-chip memory utilization for skip connections, provides a path for implementing advanced semantic segmentation networks on FPGA for real-time applications to outperform traditional GPU solutions in terms of power efficiency while maintaining competitive accuracy. The code for this project is publicly available at https://github.com/STAmirr/ cgra4ml_semantic_segmentation

machine learning, real time system, segmentation, (20 more...)

arXiv.org Artificial Intelligence

2510.22243

Country: Europe (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (0.49)
Automobiles & Trucks (0.49)
Information Technology > Robotics & Automation (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.72)

Add feedback

AASeg: Attention Aware Network for Real Time Semantic Segmentation

Sagar, Abhinav

arXiv.org Artificial IntelligenceJul-8-2025

Semantic segmentation is a fundamental task in computer vision that involves dense pixel-wise classification for scene understanding. Despite significant progress, achieving high accuracy while maintaining real-time performance remains a challenging trade-off, particularly for deployment in resource-constrained or latency-sensitive applications. In this paper, we propose AASeg, a novel Attention-Aware Network for real-time semantic segmentation. AASeg effectively captures both spatial and channel-wise dependencies through lightweight Spatial Attention (SA) and Channel Attention (CA) modules, enabling enhanced feature discrimination without incurring significant computational overhead. To enrich contextual representation, we introduce a Multi-Scale Context (MSC) module that aggregates dense local features across multiple receptive fields. The outputs from attention and context modules are adaptively fused to produce high-resolution segmentation maps. Extensive experiments on Cityscapes, ADE20K, and CamVid demonstrate that AASeg achieves a compelling trade-off between accuracy and efficiency, outperforming prior real-time methods.

machine learning, real time system, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2108.04349

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

ECMNet:Lightweight Semantic Segmentation with Efficient CNN-Mamba Network

Du, Feixiang, Wu, Shengkun

arXiv.org Artificial IntelligenceJun-11-2025

In the past decade, Convolutional Neural Networks (CNNs) and Transformers have achieved wide applicaiton in semantic segmentation tasks. Although CNNs with Transformer models greatly improve performance, the global context modeling remains inadequate. Recently, Mamba achieved great potential in vision tasks, showing its advantages in modeling long-range dependency. In this paper, we propose a lightweight Efficient CNN-Mamba Network for semantic segmentation, dubbed as ECMNet. ECMNet combines CNN with Mamba skillfully in a capsule-based framework to address their complementary weaknesses. Specifically, We design a Enhanced Dual-Attention Block (EDAB) for lightweight bottleneck. In order to improve the representations ability of feature, We devise a Multi-Scale Attention Unit (MSAU) to integrate multi-scale feature aggregation, spatial aggregation and channel aggregation. Moreover, a Mamba enhanced Feature Fusion Module (FFM) merges diverse level feature, significantly enhancing segmented accuracy. Extensive experiments on two representative datasets demonstrate that the proposed model excels in accuracy and efficiency balance, achieving 70.6% mIoU on Cityscapes and 73.6% mIoU on CamVid test datasets, with 0.87M parameters and 8.27G FLOPs on a single RTX 3090 GPU platform.

artificial intelligence, machine learning, segmentation, (19 more...)

arXiv.org Artificial Intelligence

2506.08629

Genre: Research Report (0.64)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.47)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.97)
Information Technology > Artificial Intelligence > Vision (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

A feature refinement module for light-weight semantic segmentation network

Wang, Zhiyan, Guo, Xin, Wang, Song, Zheng, Peixiao, Qi, Lin

arXiv.org Artificial IntelligenceDec-10-2024

Low computational complexity and high segmentation accuracy are both essential to the real-world semantic segmentation tasks. However, to speed up the model inference, most existing approaches tend to design light-weight networks with a very limited number of parameters, leading to a considerable degradation in accuracy due to the decrease of the representation ability of the networks. To solve the problem, this paper proposes a novel semantic segmentation method to improve the capacity of obtaining semantic information for the light-weight network. Specifically, a feature refinement module (FRM) is proposed to extract semantics from multi-stage feature maps generated by the backbone and capture non-local contextual information by utilizing a transformer block. On Cityscapes and Bdd100K datasets, the experimental results demonstrate that the proposed method achieves a promising trade-off between accuracy and computational cost, especially for Cityscapes test set where 80.4% mIoU is achieved and only 214.82 GFLOPs are required.

artificial intelligence, machine learning, segmentation, (17 more...)

arXiv.org Artificial Intelligence

2412.0867

Country: Asia > China > Henan Province > Zhengzhou (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer

Neural Information Processing SystemsOct-10-2024, 13:20:49 GMT

Recently, transformer-based networks have shown impressive results in semantic segmentation. Yet for real-time semantic segmentation, pure CNN-based approaches still dominate in this field, due to the time-consuming computation mechanism of transformer. We propose RTFormer, an efficient dual-resolution transformer for real-time semantic segmenation, which achieves better trade-off between performance and efficiency than CNN-based models. To achieve high inference efficiency on GPU-like devices, our RTFormer leverages GPU-Friendly Attention with linear complexity and discards the multi-head mechanism. Besides, we find that cross-resolution attention is more efficient to gather global context information for high-resolution branch by spreading the high level knowledge learned from low-resolution branch. Extensive experiments on mainstream benchmarks demonstrate the effectiveness of our proposed RTFormer, it achieves state-of-the-art on Cityscapes, CamVid and COCOStuff, and shows promising results on ADE20K.

efficient design, real-time semantic segmentation, rtformer, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Architecture > Real Time Systems (0.92)
Information Technology > Artificial Intelligence (0.71)

Add feedback

S$^2$-FPN: Scale-ware Strip Attention Guided Feature Pyramid Network for Real-time Semantic Segmentation

Elhassan, Mohammed A. M., Yang, Chenhui, Huang, Chenxi, Munea, Tewodros Legesse, Hong, Xin, Adam, Abuzar B. M., Benabid, Amina

arXiv.org Artificial IntelligenceMay-18-2024

Modern high-performance semantic segmentation methods employ a heavy backbone and dilated convolution to extract the relevant feature. Although extracting features with both contextual and semantic information is critical for the segmentation tasks, it brings a memory footprint and high computation cost for real-time applications. This paper presents a new model to achieve a trade-off between accuracy/speed for real-time road scene semantic segmentation. Specifically, we proposed a lightweight model named Scale-aware Strip Attention Guided Feature Pyramid Network (S$^2$-FPN). Our network consists of three main modules: Attention Pyramid Fusion (APF) module, Scale-aware Strip Attention Module (SSAM), and Global Feature Upsample (GFU) module. APF adopts an attention mechanisms to learn discriminative multi-scale features and help close the semantic gap between different levels. APF uses the scale-aware attention to encode global context with vertical stripping operation and models the long-range dependencies, which helps relate pixels with similar semantic label. In addition, APF employs channel-wise reweighting block (CRB) to emphasize the channel features. Finally, the decoder of S$^2$-FPN then adopts GFU, which is used to fuse features from APF and the encoder. Extensive experiments have been conducted on two challenging semantic segmentation benchmarks, which demonstrate that our approach achieves better accuracy/speed trade-off with different model settings. The proposed models have achieved a results of 76.2\%mIoU/87.3FPS, 77.4\%mIoU/67FPS, and 77.8\%mIoU/30.5FPS on Cityscapes dataset, and 69.6\%mIoU,71.0\% mIoU, and 74.2\% mIoU on Camvid dataset. The code for this work will be made available at \url{https://github.com/mohamedac29/S2-FPN

proceedings, segmentation, semantic segmentation, (10 more...)

arXiv.org Artificial Intelligence

2206.07298

Country:

Asia > China > Fujian Province > Xiamen (0.04)
Europe > Luxembourg > Luxembourg Canton > Luxembourg City (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Multi-Level Feature Aggregation and Recursive Alignment Network for Real-Time Semantic Segmentation

Zhang, Yanhua, Zhang, Ke, Wang, Jingyu, Wu, Yulin, Wang, Wuwei

arXiv.org Artificial IntelligenceFeb-3-2024

Real-time semantic segmentation is a crucial research for real-world applications. However, many methods lay particular emphasis on reducing the computational complexity and model size, while largely sacrificing the accuracy. In some scenarios, such as autonomous navigation and driver assistance system, accuracy and speed are equally important. To tackle this problem, we propose a novel Multi-level Feature Aggregation and Recursive Alignment Network (MFARANet), aiming to achieve high segmentation accuracy at real-time inference speed. We employ ResNet-18 as the backbone to ensure efficiency, and propose three core components to compensate for the reduced model capacity due to the shallow backbone. Specifically, we first design Multi-level Feature Aggregation Module (MFAM) to aggregate the hierarchical features in the encoder to each scale to benefit subsequent spatial alignment and multi-scale inference. Then, we build Recursive Alignment Module (RAM) by combining the flow-based alignment module with recursive upsampling architecture for accurate and efficient spatial alignment between multi-scale score maps. Finally, the Adaptive Scores Fusion Module (ASFM) is proposed to adaptively fuse multi-scale scores so that the final prediction can favor objects of multiple scales. Comprehensive experiments on three benchmark datasets including Cityscapes, CamVid and PASCAL-Context show the effectiveness and efficiency of our method. In particular, we achieve a better balance between speed and accuracy than state-of-the-art real-time methods on Cityscapes and CamVid datasets. Code is available at: https://github.com/Yanhua-Zhang/MFARANet.

accuracy, segmentation, semantic segmentation, (16 more...)

arXiv.org Artificial Intelligence

2402.02286

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

Add feedback

Technical Report on Subspace Pyramid Fusion Network for Semantic Segmentation

Elhassan, Mohammed A. M., Yang, Chenhui, Huang, Chenxi, Munea, Tewodros Legesse

arXiv.org Artificial IntelligenceDec-6-2023

The following is a technical report to test the validity of the proposed Subspace Pyramid Fusion Module (SPFM) to capture multi-scale feature representations, which is more useful for semantic segmentation. In this investigation, we have proposed the Efficient Shuffle Attention Module(ESAM) to reconstruct the skip-connections paths by fusing multi-level global context features. Experimental results on two well-known semantic segmentation datasets, including Camvid and Cityscapes, show the effectiveness of our proposed method.

proceedings, segmentation, semantic segmentation, (12 more...)

arXiv.org Artificial Intelligence

2204.01278

Country:

Asia > China > Fujian Province > Xiamen (0.04)
Asia > China > Zhejiang Province (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

Add feedback

Resource Constrained Semantic Segmentation for Waste Sorting

Cascina, Elisa, Pellegrino, Andrea, Tozzi, Lorenzo

arXiv.org Artificial IntelligenceOct-30-2023

This work addresses the need for efficient waste sorting strategies in Materials Recovery Facilities to minimize the environmental impact of rising waste. We propose resource-constrained semantic segmentation models for segmenting recyclable waste in industrial settings. Our goal is to develop models that fit within a 10MB memory constraint, suitable for edge applications with limited processing capacity. We perform the experiments on three networks: ICNet, BiSeNet (Xception39 backbone), and ENet. Given the aforementioned limitation, we implement quantization and pruning techniques on the broader nets, achieving positive results while marginally impacting the Mean IoU metric. Furthermore, we propose a combination of Focal and Lov\'asz loss that addresses the implicit class imbalance resulting in better performance compared with the Cross-entropy loss function.

loss function, segmentation, semantic segmentation, (14 more...)

arXiv.org Artificial Intelligence

2310.19407

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Filters

Collaborating Authors

real-time semantic segmentation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Real-Time Semantic Segmentation on FPGA for Autonomous Vehicles Using LMIINet with the CGRA4ML Framework

30e10e671c5e43edb67eb257abb6c3ea-Paper-Conference.pdf

AASeg: Attention Aware Network for Real Time Semantic Segmentation

ECMNet:Lightweight Semantic Segmentation with Efficient CNN-Mamba Network

A feature refinement module for light-weight semantic segmentation network

RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer

S$^2$-FPN: Scale-ware Strip Attention Guided Feature Pyramid Network for Real-time Semantic Segmentation

Multi-Level Feature Aggregation and Recursive Alignment Network for Real-Time Semantic Segmentation

Technical Report on Subspace Pyramid Fusion Network for Semantic Segmentation

Resource Constrained Semantic Segmentation for Waste Sorting