A graph-based text extraction scheme using region similarities is proposed. In the proposed method, a region graph is generated from character element regions extracted from edge images. The edge weights of the graph are defined by the regions' representative colors, positions, area sizes, and the aspect ratios of their bounding boxes. A minimum spanning tree (MST) is then obtained from the graph, since sets of similar regions are the most significant factor in text extraction. Next, text blocks, that is, subsets of regions joined by small edge weights, are obtained from the MST. Finally, text regions that satisfy the text conditions are extracted from these subsets. The proposed method extracts text from images with complex shadows and from low-contrast images.
Ranjit Ghoshal, Anandarup Roy, Tapan Kumar Bhowmik, Swapan K. Parui
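
The graph construction, MST, and block-splitting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Region` fields, the particular distance terms and their normalization, the equal default weights, and the `max_weight` threshold are all assumptions made for illustration, since the abstract names only the features involved (color, position, area size, aspect ratio) and not how they are combined. Kruskal's algorithm is used here as one standard way to obtain an MST. The final filtering by "text conditions" is omitted because the abstract does not state what those conditions are.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass(frozen=True)
class Region:
    """Hypothetical descriptor of one character element region."""
    color: tuple   # representative (r, g, b), each in 0..255
    center: tuple  # bounding-box centre (x, y) in pixels
    area: float    # area size in pixels
    aspect: float  # bounding-box width / height


def edge_weight(a, b, w_color=1.0, w_pos=1.0, w_area=1.0, w_aspect=1.0):
    """Dissimilarity of two regions (assumed form: weighted sum of four terms)."""
    d_color = math.dist(a.color, b.color) / 255.0
    # Assumption: distance is scaled by region size so that it is resolution independent.
    d_pos = math.dist(a.center, b.center) / math.sqrt(max(a.area, b.area))
    d_area = abs(math.log(a.area / b.area))
    d_aspect = abs(math.log(a.aspect / b.aspect))
    return w_color * d_color + w_pos * d_pos + w_area * d_area + w_aspect * d_aspect


class DisjointSet:
    """Union-find structure used by Kruskal's algorithm and by the block split."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return False
        self.parent[ri] = rj
        return True


def minimum_spanning_tree(regions):
    """Kruskal's algorithm on the complete region graph; returns (weight, i, j) edges."""
    edges = sorted(
        (edge_weight(regions[i], regions[j]), i, j)
        for i, j in combinations(range(len(regions)), 2)
    )
    sets = DisjointSet(len(regions))
    return [(w, i, j) for w, i, j in edges if sets.union(i, j)]


def text_blocks(regions, max_weight):
    """Drop MST edges heavier than max_weight; return the remaining components."""
    sets = DisjointSet(len(regions))
    for w, i, j in minimum_spanning_tree(regions):
        if w <= max_weight:
            sets.union(i, j)
    groups = {}
    for i in range(len(regions)):
        groups.setdefault(sets.find(i), []).append(i)
    return sorted(groups.values())
```

Calling `text_blocks(regions, max_weight)` returns lists of region indices, one list per candidate text block. In this sketch the threshold would need tuning per data set, and a real pipeline would follow it with a check of each block against whatever text conditions the method defines.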