Multiscale Window Based Vision Transformer for Image Inpainting

Seong-Joo Kim; Jaeyoung Choi; Jae-Young Choi

doi:10.33097/jncta.2023.07.12.2050

JOURNAL ARTICLE

Year: 2023; Journal: The Journal of Next-generation Convergence Technology Association; Vol: 7 (12); Pages: 2050-2057

Abstract

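Effective image inpainting requires the model to grasp contextual information. Earlier work based on convolutional neural network (CNN) algorithms lacked long-range dependencies, which limited the model's ability to capture contextual information. To address this, the paper proposes a multi-scale window based vision transformer model for high-quality image inpainting. Introducing a multi-scale vision transformer lets the model reflect the effect of different window sizes and obtain the corresponding contextual information. A missing-region mask update module is also applied to make the computation efficient. According to the experimental results, the proposed model restores missing regions effectively, and on benchmark datasets its FID score is about 50% or more lower than the scores of other recent inpainting models. The authors take this as verification of the proposed model's superiority.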
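The abstract does not spell out how windows are formed or how the mask is updated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes non-overlapping square windows and a partial-convolution-style update rule (a window becomes fully valid once it contains at least one valid pixel). The names `partition_windows`, `update_mask`, and `windows_per_scale` are hypothetical.

```python
from typing import Dict, List

Grid = List[List[int]]


def partition_windows(grid: Grid, window: int) -> List[Grid]:
    """Split an H x W grid into non-overlapping window x window tiles (row-major order)."""
    h, w = len(grid), len(grid[0])
    if h % window or w % window:
        raise ValueError("grid size must be divisible by the window size")
    tiles = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            tiles.append([row[left:left + window] for row in grid[top:top + window]])
    return tiles


def update_mask(mask: Grid, window: int) -> Grid:
    """Assumed update rule: a window with at least one valid pixel (1) becomes fully valid."""
    h, w = len(mask), len(mask[0])
    new_mask = [row[:] for row in mask]
    for top in range(0, h, window):
        for left in range(0, w, window):
            rows = range(top, min(top + window, h))
            cols = range(left, min(left + window, w))
            if any(mask[r][c] for r in rows for c in cols):
                for r in rows:
                    for c in cols:
                        new_mask[r][c] = 1
    return new_mask


def windows_per_scale(grid: Grid, window_sizes: List[int]) -> Dict[int, int]:
    """Count how many windows each window size produces on the same grid."""
    return {ws: len(partition_windows(grid, ws)) for ws in window_sizes}


# Toy 8 x 8 mask: 1 = known pixel, 0 = missing pixel (a 4 x 4 hole in the centre).
mask = [[0 if 2 <= r < 6 and 2 <= c < 6 else 1 for c in range(8)] for r in range(8)]

small = update_mask(mask, 2)  # 2 x 2 windows inside the hole see no valid pixel
large = update_mask(mask, 4)  # every 4 x 4 window overlaps known pixels
counts = windows_per_scale(mask, [2, 4, 8])
```

In this toy case the small windows that fall entirely inside the hole have no context to draw on, while the larger windows all reach known pixels, which is one way to see why combining several window sizes could help.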

Keywords:

Inpainting; Window (computing); Artificial intelligence; Computer vision; Computer science; Transformer; Image (mathematics); Pattern recognition (psychology); Engineering; Electrical engineering

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.33

Topics

Industrial Vision Systems and Defect Detection

Physical Sciences → Engineering → Industrial and Manufacturing Engineering

Related Documents

Vision Transformer-Based Image Inpainting Method

A Transformer-Based Cross-Window Aggregated Attentional Image Inpainting Model

InViT: GAN Inversion-Based Vision Transformer for Blind Image Inpainting

Depth Inpainting via Vision Transformer

Image inpainting based on decoupled spatiotemporal transformer