Shilpa Vikas Shinde, V. S. Malemath, Joshi B. Vinayak, Anilkumar Chandrashekhar Korishetti, R. Gogulan
Script identification in multilingual documents remains a critical challenge in document analysis, especially in the context of Indian languages, where multiple scripts often coexist within the same document. This paper proposes MSAFF (Multi-Script Adaptive Feature Fusion), a novel framework designed to tackle this complexity by dynamically integrating multiple discriminative feature sets, namely Local Binary Patterns (LBP), Horizontal Projection Profiles (HPP), and Histogram of Oriented Gradients (HOG). MSAFF employs an adaptive fusion mechanism that intelligently adjusts feature weights according to the granularity of the input, enabling robust script recognition across various textual levels, including blocks, lines, words, numerals, and alphanumeric strings. To effectively classify scripts and manage transitions between them in mixed-script environments, MSAFF utilizes a hybrid classification strategy that combines Support Vector Machines (SVMs) for initial script identification with Hidden Markov Models (HMMs) to model sequential script transitions. Extensive evaluations on the MDIW-13 dataset demonstrate the effectiveness of MSAFF, which achieved an overall accuracy of 92%, with outstanding results in text block-level identification (96%) and mixed-script transition detection (>85%). The method also shows strong resilience to document degradations, maintaining high accuracy under noise (90%) and skew (88%) conditions. Additionally, MSAFF exhibits notable computational efficiency, outperforming state-of-the-art techniques in processing speed across varying input sizes.
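The two ideas in the abstract, granularity-dependent feature weighting and HMM-based modelling of script transitions, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than a detail taken from the paper: the weight table `FUSION_WEIGHTS`, the helper names `fuse_features` and `viterbi_smooth`, the `stay_prob` transition parameter, and the use of per-segment classifier probabilities (for example, calibrated SVM outputs) as HMM emission scores.

```python
import math

# Hypothetical per-granularity weights for (LBP, HPP, HOG).
# Illustrative values only; they are not taken from the paper.
FUSION_WEIGHTS = {
    "block": (0.4, 0.3, 0.3),
    "line": (0.3, 0.4, 0.3),
    "word": (0.3, 0.2, 0.5),
}


def fuse_features(lbp, hpp, hog, granularity):
    """Concatenate the three descriptors, scaling each by its weight."""
    w_lbp, w_hpp, w_hog = FUSION_WEIGHTS[granularity]
    return ([w_lbp * v for v in lbp]
            + [w_hpp * v for v in hpp]
            + [w_hog * v for v in hog])


def viterbi_smooth(scores, stay_prob=0.8):
    """Decode the most likely script sequence with the Viterbi algorithm.

    scores: one dict per segment, mapping script name -> classifier
            probability (assumed to come from a calibrated classifier).
    stay_prob: assumed probability that adjacent segments share a script;
               the remainder is split evenly among the other scripts.
    """
    eps = 1e-12  # guards against log(0)
    scripts = list(scores[0])
    n = len(scripts)
    switch_prob = (1.0 - stay_prob) / (n - 1) if n > 1 else 0.0

    def log_trans(prev_script, script):
        p = stay_prob if prev_script == script else switch_prob
        return math.log(max(p, eps))

    # Uniform prior over scripts for the first segment.
    best = {s: math.log(max(scores[0][s], eps)) - math.log(n) for s in scripts}
    back = []
    for obs in scores[1:]:
        prev, best, pointers = best, {}, {}
        for s in scripts:
            cand = max(scripts, key=lambda p: prev[p] + log_trans(p, s))
            best[s] = prev[cand] + log_trans(cand, s) + math.log(max(obs[s], eps))
            pointers[s] = cand
        back.append(pointers)

    # Trace the best path backwards from the final segment.
    path = [max(scripts, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Word-level scores for a line with one ambiguous word in the middle.
noisy = [
    {"Devanagari": 0.9, "Latin": 0.1},
    {"Devanagari": 0.9, "Latin": 0.1},
    {"Devanagari": 0.45, "Latin": 0.55},
    {"Devanagari": 0.9, "Latin": 0.1},
    {"Devanagari": 0.9, "Latin": 0.1},
]
smoothed = viterbi_smooth(noisy)
```

In this sketch the isolated, weakly Latin-leaning middle word is relabelled to match its neighbours, while a sustained run of confident Latin scores would still be decoded as a genuine script transition. Lowering `stay_prob` makes the decoder more willing to switch scripts.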
Fanchang Peng, Hui Ma, Li Liu, Yue Lu, Ching Y. Suen
Kurban Ubul, Gulazat Tursun, Alimjan Aysa, Donato Impedovo, Giuseppe Pirlo, Ibrahim Yibulayin
Suryakanth Baburao Ummapure, G. G. Rajput
Abdelillah Semma, Yaâcoub Hannad, Imran Siddiqi, Said Lazrak, Mohamed El Youssfi El Kettani