Semantic Scene Completion Through Context Transformer and Recurrent Convolution

Wenlong Yang; Hongfei Yu; 洋 大草

doi:10.1109/access.2024.3401481


JOURNAL ARTICLE


Year: 2024; Journal: IEEE Access; Vol: 12; Pages: 69700-69709; Publisher: Institute of Electrical and Electronics Engineers



Abstract

Monocular semantic scene completion aims to predict a detailed 3D scene with semantic information from a single image. To improve the image feature extraction of the classical network and achieve better semantic scene completion, we propose a monocular semantic scene completion method based on a context transformer and recurrent residual convolution. The context transformer module is added between the encoder and decoder of the image feature extraction network; it uses context information to guide the learning of the dynamic attention matrix and improve visual representation ability. We also introduce a recurrent residual convolution module into the decoder to accumulate features at different time steps, which helps to distinguish similar objects. Experimental results show that, on the indoor dataset NYUv2 and the outdoor traffic scene dataset SemanticKITTI, the mIoU of the semantic scene completion task improves by 5% and 8%, respectively, compared with the baseline method.
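The abstract reports its results as mIoU (mean intersection-over-union). As a rough illustration of how that metric is commonly computed over flattened voxel labels, here is a minimal sketch. It is not the authors' evaluation code: the function name `mean_iou`, the `ignore_label` value of 255, and the choice to average over every class that appears are assumptions made for this example, and benchmarks differ on which classes enter the average (for instance, whether an "empty" class is counted).

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_label: int = 255,  # assumed convention for unlabeled voxels
) -> float:
    """Average per-class IoU over classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class

    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # skip voxels without a ground-truth label
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class

    # Classes that never occur (union == 0) are left out of the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

For example, `mean_iou([0, 1, 1, 2], [0, 1, 2, 2], num_classes=3)` gives per-class IoUs of 1.0, 0.5, and 0.5, so the mean is 2/3.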

Keywords:

Computer science; Artificial intelligence; Residual; Transformer; Monocular; Computer vision; Encoder; Convolution (computer science); Feature extraction; Pattern recognition (psychology); Artificial neural network

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 0.00

Refs

Citation Normalized Percentile: 0.08

Is in top 1%

Is in top 10%

Topics

3D Shape Modeling and Analysis

Physical Sciences → Engineering → Computational Mechanics

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Vision and Imaging

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition


Related Documents

Context and Geometry Aware Voxel Transformer for Semantic Scene Completion

Efficient Semantic Scene Completion Network with Spatial Group Convolution

Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer

CVSformer: Cross-View Synthesis Transformer for Semantic Scene Completion

From Front to Rear: 3D Semantic Scene Completion Through Planar Convolution and Attention-Based Network