Fengli Shen, Zong‐Hui Wang, Zhe‐Ming Lu
As one of the most fundamental tasks in computer vision, semantic segmentation assigns an object category to every pixel of an image. Training a robust semantic segmentation model is challenging because pixel‐level annotations are expensive to obtain. To alleviate this annotation burden, we propose a weakly‐supervised framework for zero‐shot semantic segmentation that can segment images containing target classes without any pixel‐level labelled instances of those classes. We assume that access to image‐level annotations of the target classes does not violate the zero‐shot principle of using no pixel‐level labels, and we use these image‐level annotations to improve the model's ability to extract pixel‐level features. Furthermore, existing zero‐shot semantic segmentation methods use semantic embeddings as class embeddings to transfer knowledge from source classes to target classes. We instead use image‐level features as class embeddings, since the distribution of pixel‐level features is closer to that of image‐level features than to that of semantic embeddings. Experimental results on the PASCAL‐VOC data set under different data splits demonstrate that the proposed model achieves promising results.
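The idea of using image‐level features as class embeddings can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how class embeddings are built or how pixels are matched to them, so the per‐class averaging, the cosine similarity, and the names `class_embeddings` and `segment` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def class_embeddings(image_features, image_labels):
    """Build one embedding per class by averaging image-level features.

    ASSUMPTION: averaging is a stand-in; the source does not specify
    how image-level features are turned into class embeddings.
    """
    sums, counts = {}, {}
    for feat, labels in zip(image_features, image_labels):
        for cls in labels:
            if cls not in sums:
                sums[cls] = [0.0] * len(feat)
                counts[cls] = 0
            sums[cls] = [s + f for s, f in zip(sums[cls], feat)]
            counts[cls] += 1
    return {cls: [s / counts[cls] for s in sums[cls]] for cls in sums}


def segment(pixel_features, embeddings):
    """Label each pixel with the class whose embedding is most similar.

    pixel_features is an H x W nested list of feature vectors.
    ASSUMPTION: cosine similarity is used as the matching score.
    """
    return [
        [max(embeddings, key=lambda cls: cosine(pix, embeddings[cls])) for pix in row]
        for row in pixel_features
    ]


# Toy data: three images with image-level labels only, no pixel-level labels.
image_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
image_labels = [["cat"], ["cat"], ["dog"]]
embeddings = class_embeddings(image_features, image_labels)

# A 1 x 2 "image" of pixel-level features.
prediction = segment([[[1.0, 0.1], [0.1, 1.0]]], embeddings)
```

In a real system both feature types would come from a trained network; the sketch only shows why pixel‐level and image‐level features need to share a feature space for this kind of matching to work.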
Fengli Shen, Zhe‐Ming Lu, Ziqian Lu, Zonghui Wang
Hsiao-Cheng Lin, Jun Su, Jing-Ming Guo, Yi‐Chong Zeng
Srinivasa Rao Nandam, Sara Atito, Zhenhua Feng, Josef Kittler, Muhammad Awais
Baoxin Zhang, Xiaopeng Wang, Jinhan Cui, Juntao Wu, Xu Wang, Yan Li, Jinhang Li, Yunhua Tan, Xiaohong Chen, Wenliang Wu, Xinghua Yu