Mahdi Hamdani, Amr El-Desoky Mousa, Hermann Ney
Language Models (LMs) are an important component of large- and open-vocabulary recognition systems. This paper presents an open-vocabulary approach to Arabic handwriting recognition. The proposed approach decomposes Arabic words using morphological analysis, so that the vocabulary combines full words with the sub-words produced by the decomposition. Out-Of-Vocabulary (OOV) words can then be recognized by combining different elements of the lexicon. The recognition system is based on Hidden Markov Models (HMMs) with position- and context-dependent character models. An n-gram LM trained on the decomposed text is used along with the HMMs during the search. The approach is evaluated on two Arabic handwriting datasets, with two different types of experiments conducted for two recognition tasks. The open-vocabulary approach yields an absolute improvement of up to 1% in Word Error Rate (WER) on the constrained task and matches the performance of the baseline system on the unconstrained one.
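As a rough illustration of the two ideas the abstract relies on, the sketch below recombines decoded sub-word units into full words and scores the result with WER. It is not the authors' implementation: the trailing `+` marker on prefix units, the function names `recombine` and `word_error_rate`, and the transliterated example tokens are all assumptions made for this example, since the abstract does not state how sub-words are marked.

```python
def recombine(units, marker="+"):
    """Join decoded units into words.

    Assumption: a unit ending in `marker` is a prefix that attaches to
    the next unit. The paper's actual marking scheme is not given.
    """
    words, pending = [], ""
    for unit in units:
        if unit.endswith(marker):
            pending += unit[:-len(marker)]
        else:
            words.append(pending + unit)
            pending = ""
    if pending:  # dangling prefix at the end of the sequence
        words.append(pending)
    return words


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical decoder output: the full word need not be in the lexicon,
# only its parts.
decoded = ["wa+", "al+", "kitab", "jadid"]
words = recombine(decoded)
```

Under these assumptions, a word absent from the lexicon can still appear in the output as long as each of its parts is a lexicon entry, which is the mechanism the abstract describes for handling OOV words.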
Zouhaira Noubigh, Anis Mezghani, Monji Kherallah
Mahdi Hamdani, Patrick Doetsch, Michał Kozielski, Amr El-Desoky Mousa, Hermann Ney
Ibrahim Abdelaziz, Sherif Abdou, Hassanin M. Al-Barhamtoshy
Ihab Alkhoury, Adrià Giménez, Alfons Juan
Omar Abou Khaled, Aly Fahmy, Sherif Abdou