SMFENet: Stage-Aware Multi-Scale Feature Extraction Network for Real-Time Semantic Segmentation of Street Scenes

Ziming Li; Ya Li

doi:10.1109/hdis60872.2023.10499568


JOURNAL ARTICLE


Year: 2023 Pages: 170-177



Abstract

In recent years, many encoder-decoder segmentation methods have achieved real-time semantic segmentation by using an improved lightweight classification network as the encoder. However, for complex street scenes, the receptive field of such networks is insufficient. To alleviate this issue, several methods incorporate a multi-scale feature extraction module into the encoder to capture varied feature information while expanding the receptive field. However, we observe that well-designed multi-scale feature extraction within the decoding stage can achieve better segmentation performance. We believe that different decoding stages have different demands for feature information, and that carefully designing multi-scale receptive fields for each decoding stage can not only enhance the semantic understanding of the network but also reduce the number of network parameters and redundant computation. Therefore, we propose a series of novel stage-aware multi-scale feature extraction (SMFE) modules. These modules extract multi-scale feature information across the decoding stages by employing distinct combinations of receptive fields. Leveraging the SMFE modules, we design a lightweight and efficient decoder; the network using this decoder is called SMFENet. Experiments demonstrate that SMFENet strikes an effective balance between speed and accuracy. Using ResNet-18 on a single NVIDIA GeForce 1080Ti GPU, SMFENet achieves 78.2% mIoU at 39 FPS on the Cityscapes dataset at a resolution of 1,024 × 2,048, and 74.8% mIoU at 114.2 FPS on the CamVid dataset at a resolution of 720 × 960.
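The accuracy figures above are reported as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed from a confusion matrix; it is not the authors' evaluation code. The function names `confusion_matrix` and `mean_iou` and the toy labels are illustrative assumptions, and real benchmarks also handle details such as ignored labels that are omitted here.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the labels nor the predictions are skipped.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # class c pixels that were missed
        fp = sum(matrix[r][c] for r in range(n)) - tp   # pixels wrongly assigned to class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy data: six "pixels" and three classes (not from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
score = mean_iou(cm)  # per-class IoU is 1/3, 2/3, and 1/2, so the mean is 0.5
```

In this toy case one class-0 pixel and one class-2 pixel are misclassified, which lowers the IoU of every class involved in the error, not only the class that was missed.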

Keywords:

Computer science; Segmentation; Feature extraction; Artificial intelligence; Scale (ratio); Semantic feature; Stage (stratigraphy); Feature (linguistics); Image segmentation; Extraction (chemistry); Pattern recognition (psychology); Computer vision; Cartography; Geography; Geology

Metrics

Cited By

0.00

FWCI (Field Weighted Citation Impact)

Refs

0.30

Citation Normalized Percentile

Topics

Automated Road and Building Extraction

Physical Sciences → Engineering → Ocean Engineering

Video Surveillance and Tracking Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Stage-Aware Feature Alignment Network for Real-Time Semantic Segmentation of Street Scenes

Multi‐directional feature refinement network for real‐time semantic segmentation in urban street scenes

Exploring Scale-Aware Features for Real-Time Semantic Segmentation of Street Scenes

MSF2Net: multi-stage feature fusion network for real-time semantic segmentation in road scenes

Deep Multi-Resolution Network for Real-Time Semantic Segmentation in Street Scenes