Solving tasks for autonomous road vehicles using computer vision is a dynamic and active research field. However, one aspect of autonomous transportation has received few contributions: the rail domain. In this paper, we introduce the first public dataset for semantic scene understanding for trains and trams: RailSem19. This dataset consists of 8500 annotated short sequences from the ego-perspective of trains, including over 1000 examples with railway crossings and 1200 tram scenes. Since it is the first image dataset targeting the rail domain, a novel label policy has been designed from scratch. It focuses on rail-specific labels not covered by any other dataset. In addition to manual annotations in the form of geometric shapes, we also supply dense pixel-wise semantic labeling. The dense labeling is a semantic-aware combination of (a) the geometric shapes and (b) weakly supervised annotations generated by existing semantic segmentation networks from the road domain. Finally, multiple experiments give a first impression of how the new dataset can be used to improve semantic scene understanding in the rail environment. We present prototypes for the image-based classification of trains, switches, switch states, platforms, buffer stops, rail traffic signs, and rail traffic lights. Applying transfer learning, we present an early prototype for pixel-wise semantic segmentation on rail scenes. The resulting predictions show that this new data also significantly improves scene understanding in situations where cars and trains interact.
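To illustrate the idea behind the dense labeling, the following is a minimal sketch of one possible way to combine rasterized manual annotations with predictions from a road-domain network. It is not the procedure used to build RailSem19: the function `fuse_labels`, the class names, the `road_to_rail` mapping, and the rule that manual labels always take priority are assumptions made for illustration only.

```python
# Hypothetical sketch: per-pixel fusion of manual geometry labels with
# weak labels predicted by a road-domain segmentation network.
# All names and the priority rule are illustrative assumptions.

UNLABELED = None  # marks pixels not covered by any manual shape


def fuse_labels(manual, predicted, road_to_rail, fallback="unknown"):
    """Return a dense label map of the same size as the inputs.

    manual       -- 2D list of rail-specific labels, UNLABELED where no shape was drawn
    predicted    -- 2D list of road-domain labels from an existing network
    road_to_rail -- dict translating road-domain labels into the target label set
    fallback     -- label used when a predicted class has no translation
    """
    if len(manual) != len(predicted):
        raise ValueError("label maps must have the same number of rows")

    fused = []
    for manual_row, predicted_row in zip(manual, predicted):
        if len(manual_row) != len(predicted_row):
            raise ValueError("label maps must have the same row length")
        fused_row = []
        for manual_label, predicted_label in zip(manual_row, predicted_row):
            if manual_label is not UNLABELED:
                # Assumed rule: manual annotations override network predictions.
                fused_row.append(manual_label)
            else:
                # Elsewhere, fall back to the translated weak label.
                fused_row.append(road_to_rail.get(predicted_label, fallback))
        fused.append(fused_row)
    return fused


# Toy 2x3 image: a manually annotated track runs down the middle column.
manual = [
    [UNLABELED, "rail-track", UNLABELED],
    [UNLABELED, "rail-track", UNLABELED],
]
predicted = [
    ["sky", "road", "vegetation"],
    ["road", "road", "sidewalk"],
]
road_to_rail = {"sky": "sky", "vegetation": "vegetation", "road": "road"}

fused = fuse_labels(manual, predicted, road_to_rail)
```

In this toy example, the network's "road" predictions under the annotated track are replaced by the manual label, while the untranslated "sidewalk" pixel falls back to "unknown". A semantic-aware combination would likely use class-dependent rules rather than a single global priority.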