Natural scenes contain many texts and symbols, such as billboards and traffic signs, that relay information or offer guidance. With rapid advances in information technology, detecting and extracting text in images has become an increasingly important research topic. Here, we present an intelligent connected-component (CC) based text detection and extraction method consisting of three steps. First, candidate regions are located via image processing and Canny edge detection. Second, a fast CC algorithm filters out noise and yields the candidate text components and their features. Lastly, an AdaBoost classifier is trained on these features to build strong classifiers that label each candidate as text or non-text. This three-step process effectively filters out non-text CCs so that text components can be extracted efficiently. By integrating the CC and AdaBoost algorithms, the present research attains a precision rate of 94.65% for text extraction, which can help facilitate the application and development of text recognition techniques.
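The second and third steps can be illustrated with a minimal sketch in pure Python. It is not the implementation evaluated in this work. It assumes the edge map from the first step is already available as a binary grid, uses a plain breadth-first flood fill in place of the fast CC algorithm, and uses two hypothetical features (aspect ratio and fill density), since the abstract does not list the features actually used. The function names `label_components`, `features`, `train_adaboost`, and `predict` are illustrative only.

```python
import math
from collections import deque


def label_components(img):
    """Find 8-connected foreground regions (pixel value 1) by BFS flood fill.

    Returns one dict per component with its pixel area and bounding-box size.
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    comps = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            comps.append({"area": len(pixels),
                          "width": max(xs) - min(xs) + 1,
                          "height": max(ys) - min(ys) + 1})
    return comps


def features(comp):
    """Hypothetical features: bounding-box aspect ratio and fill density."""
    box = comp["width"] * comp["height"]
    return [comp["width"] / comp["height"], comp["area"] / box]


def train_adaboost(X, y, rounds=5):
    """Discrete AdaBoost over threshold stumps; labels are +1 (text) or -1."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for sign in (1, -1):
                    preds = [sign if x[f] <= thr else -sign for x in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, f, thr, sign, preds)
        err, f, thr, sign, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, sign))
        # Raise the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * t * p)
                   for w, t, p in zip(weights, y, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (s if x[f] <= t else -s) for a, f, t, s in model)
    return 1 if score >= 0 else -1
```

In use, each component returned by `label_components` would be mapped through `features` and passed to `predict`, and only the components labeled +1 would be kept as text. A real pipeline would likely rely on a larger feature set and an optimized labeling routine.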