JOURNAL ARTICLE

SMFENet: Stage-Aware Multi-Scale Feature Extraction Network for Real-Time Semantic Segmentation of Street Scenes

Abstract

In recent years, many encoder-decoder segmentation methods have achieved real-time semantic segmentation by using an improved lightweight classification network as the encoder. However, for complex street scenes, the receptive field of such networks is too small. To alleviate this issue, several methods incorporate a multi-scale feature extraction module into the encoder to capture varied feature information while expanding the receptive field. We observe, however, that well-designed multi-scale feature extraction within the decoding stage can achieve better segmentation performance. We believe that different decoding stages have different demands for feature information, and that carefully designing multi-scale receptive fields for each decoding stage can not only enhance the semantic understanding of the network but also reduce the number of network parameters and the amount of redundant computation. Therefore, we propose a series of novel stage-aware multi-scale feature extraction (SMFE) modules. These modules extract multi-scale feature information across the decoding stages by employing distinct combinations of receptive fields. Leveraging the SMFE modules, we design a lightweight and efficient decoder; the network using this decoder is called SMFENet. Experiments demonstrate that SMFENet strikes an effective balance between speed and accuracy. Using ResNet-18 on a single NVIDIA GeForce 1080Ti GPU, SMFENet achieves 78.2% mIoU at 39 FPS on the Cityscapes dataset at a resolution of 1,024 × 2,048, and 74.8% mIoU at 114.2 FPS on the CamVid dataset at a resolution of 720 × 960.
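To make the idea of "distinct combinations of receptive fields" per decoding stage concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each stage uses parallel 3×3 dilated convolutions and computes the spatial extent each branch covers on its own feature map, using the standard relation extent = dilation × (kernel − 1) + 1. The stage names and dilation rates in `STAGE_DILATIONS` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the stage names and dilation rates below are
# ASSUMPTIONS made for this example, not values reported for SMFENet.


def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in feature-map units) covered by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def branch_receptive_fields(dilations, kernel_size=3):
    """Extent of each parallel branch, assuming a single stride-1 conv per branch."""
    return [effective_kernel(kernel_size, d) for d in dilations]


# Hypothetical per-stage dilation sets: a different combination for each stage.
STAGE_DILATIONS = {
    "stage_1_low_res": (1, 2, 4, 8),
    "stage_2_mid_res": (1, 2, 4),
    "stage_3_high_res": (1, 2),
}


def summarize(stage_dilations, kernel_size=3):
    """Map each stage name to the list of extents of its parallel branches."""
    return {
        stage: branch_receptive_fields(dilations, kernel_size)
        for stage, dilations in stage_dilations.items()
    }


if __name__ == "__main__":
    for stage, extents in summarize(STAGE_DILATIONS).items():
        print(stage, extents)
```

Under these assumed settings, fewer branches at a stage mean fewer convolution weights for that stage, which is one way a stage-specific design could reduce parameters. The extents are measured on each stage's own feature map; the footprint on the input image also depends on the downsampling applied before that stage.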

Keywords:
Computer science; Segmentation; Feature extraction; Artificial intelligence; Scale (ratio); Semantic feature; Stage (stratigraphy); Feature (linguistics); Image segmentation; Extraction (chemistry); Pattern recognition (psychology); Computer vision; Cartography; Geography; Geology

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 26
Citation Normalized Percentile: 0.30

Topics

Automated Road and Building Extraction
Physical Sciences → Engineering → Ocean Engineering
Video Surveillance and Tracking Methods
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition