JOURNAL ARTICLE

Improving Semantic Image Segmentation by Object Localization

Abstract

Semantic segmentation classifies every pixel in an image. In recent years, methods based on Fully Convolutional Networks (FCNs) have dominated the field in terms of segmentation accuracy. We are interested in tackling two challenges that these methods face. First, pixel-level labels for training the network are expensive to acquire. Second, FCNs often struggle with data in which positive and negative samples are imbalanced. This issue often arises in domains such as medical imaging and satellite imagery analysis, where the object of interest can be very small. The large number of negative samples can overwhelm the positive samples during training, leading the network to learn a biased representation. In this thesis, we investigate how an object localization mechanism can address these two challenges. We propose an end-to-end neural network that improves the segmentation accuracy of FCNs by incorporating an object localization unit. The network first performs object localization, and the result is then used as a cue to guide the training of the segmentation network. The two steps share convolutional features. This allows us to leverage object detection labels to help train the segmentation network, alleviating the need for large-scale pixel-level labels. To avoid applying max pooling to object proposals, which limits spatial accuracy, we introduce a new type of convolutional layer named ROI convolution. It applies convolution directly to the object proposals in one shot, without passing them individually through the downstream network. We show that this layer is differentiable, which allows the network to be trained end-to-end. To demonstrate the efficacy of our method, we apply it to the problem of medical image segmentation. With the object localization unit, our method performs well despite the high class imbalance, and it outperforms existing methods on small object segmentation.
To further understand the proposed method and the impact of ROI convolution, we also conducted ablation studies and experiments on an endoscopic image dataset with balanced data.
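
The class-imbalance problem described above can be made concrete with a small numerical sketch. The following Python example is not taken from the thesis: the image size, object size, box coordinates, and helper names (`positive_fraction`, `pixel_accuracy`, `iou`) are all illustrative assumptions. It shows that a trivial all-background prediction scores very high pixel accuracy on a small object while its intersection-over-union is zero, and that restricting attention to a localized region raises the share of positive pixels considerably.

```python
import numpy as np

def positive_fraction(mask):
    """Share of pixels labeled as object (positive) in a boolean mask."""
    return float(mask.mean())

def pixel_accuracy(pred, target):
    """Share of pixels where the prediction matches the ground truth."""
    return float((pred == target).mean())

def iou(pred, target):
    """Intersection-over-union of two boolean masks (1.0 if both are empty)."""
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)

# Hypothetical 256x256 ground-truth mask containing one 16x16 object.
mask = np.zeros((256, 256), dtype=bool)
mask[100:116, 120:136] = True

# Trivial prediction: every pixel is background.
all_background = np.zeros_like(mask)

# Hypothetical 32x32 localization box (y0, y1, x0, x1) around the object.
y0, y1, x0, x1 = 92, 124, 112, 144
roi_mask = mask[y0:y1, x0:x1]

full_fraction = positive_fraction(mask)       # positives over the whole image
roi_fraction = positive_fraction(roi_mask)    # positives inside the box
accuracy = pixel_accuracy(all_background, mask)
score = iou(all_background, mask)

print(f"positive fraction, full image: {full_fraction:.4%}")
print(f"positive fraction, inside box: {roi_fraction:.2%}")
print(f"all-background pixel accuracy: {accuracy:.4%}")
print(f"all-background IoU:            {score:.2f}")
```

Under these assumed sizes, positives make up well under 1% of the full image but a quarter of the pixels inside the box, which is one way to see why a localization cue may help with small objects. How the thesis actually uses the localization output during training is not specified in the abstract, so this sketch only illustrates the imbalance itself.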

Keywords:
Artificial intelligence; Computer vision; Object (grammar); Computer science; Segmentation; Image segmentation; Image (mathematics); Natural language processing

Topics

Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
