Suleiman H. Mustafa, Qasem A. Al‐Radaideh
Abstract: N‐grams have been widely investigated for a number of text processing and retrieval applications. This article examines the performance of the digram and trigram term conflation techniques in the context of Arabic free text retrieval. It reports the results of applying the N‐gram approach to a corpus of thousands of distinct textual words drawn from a number of sources representing various disciplines. The results indicate that the digram method performs better than the trigram method with respect to conflation precision and conflation recall ratios. In either case, the N‐gram approach does not appear to be an efficient conflation method, because the peculiarities of the Arabic infix structure reduce the rate of correct N‐gram matching.
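To make the comparison concrete, the following is a minimal sketch of digram and trigram matching between two word forms. The abstract does not state which similarity measure the authors used; the Dice coefficient over unique N‐gram sets is assumed here because it is a common choice for N‐gram conflation, and the function names `ngrams` and `dice_similarity` are hypothetical.

```python
def ngrams(word, n):
    """Return the set of distinct character n-grams of a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice_similarity(a, b, n=2):
    """Dice coefficient over the unique n-gram sets of two words.

    Assumption: Dice is used for illustration only; the source does not
    name the similarity measure applied in the study.
    """
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0  # a word shorter than n has no n-grams to compare
    shared = len(grams_a & grams_b)
    return 2 * shared / (len(grams_a) + len(grams_b))


# Two forms of the same root: the second has an infixed letter (alef).
base = "كتب"
infixed = "كاتب"

digram_score = dice_similarity(base, infixed, n=2)
trigram_score = dice_similarity(base, infixed, n=3)
```

In this pair the infixed letter splits the root, so only one digram is shared and no trigram is shared. This is one illustration of how infixation can lower the rate of N‐gram matches, and of why longer N‐grams may be affected more than shorter ones.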
Abdulmohsen Al-Thubaity, Muneera Alhoshan, Itisam Hazzaa
Matthias Schonlau, Nick Guenther
Abdelaziz Zitouni, Asma Damankesh, Foroogh Barakati, Maha Atari, Mohamed K. Watfa, Farhad Oroumchian
Malik Daler Ali Awan, Sikandar Ali, Ali Samad, Nadeem Iqbal, Malik Muhammad Saad Missen, Niamat Ullah
Stuart M. Harding, W. Bruce Croft, Carl Weir