JOURNAL ARTICLE

Using N‐grams for Arabic text searching

Suleiman H. Mustafa, Qasem A. Al‐Radaideh

Year: 2004   Journal: Journal of the American Society for Information Science and Technology   Vol: 55 (11)   Pages: 1002-1007   Publisher: Wiley

Abstract

N‐grams have been widely investigated for a number of text processing and retrieval applications. This article examines the performance of the digram and trigram term conflation techniques in the context of Arabic free text retrieval. It reports the results of applying the N‐gram approach to a corpus of thousands of distinct textual words drawn from a number of sources representing various disciplines. The results indicate that the digram method performs better than the trigram method with respect to conflation precision and conflation recall ratios. In either case, the N‐gram approach does not appear to provide an efficient conflation approach, because the peculiarities of the Arabic infix structure reduce the rate of correct N‐gram matching.
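
As a rough illustration of why infixes hurt N-gram matching, the sketch below scores word pairs by the overlap of their character digrams and trigrams. The abstract does not say which similarity measure the authors used. The Dice coefficient here is an assumption, chosen because it is a common choice for N-gram conflation, and the function names `char_ngrams` and `dice_similarity` are hypothetical. The three example words are ordinary derivatives of the root k-t-b and are not taken from the article's corpus.

```python
def char_ngrams(word, n):
    """Return the set of distinct contiguous character n-grams in `word`."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice_similarity(word_a, word_b, n):
    """Dice coefficient over the two words' n-gram sets (assumed measure)."""
    grams_a = char_ngrams(word_a, n)
    grams_b = char_ngrams(word_b, n)
    if not grams_a or not grams_b:
        return 0.0  # a word shorter than n has no n-grams to compare
    shared = len(grams_a & grams_b)
    return 2 * shared / (len(grams_a) + len(grams_b))


# Three words derived from the root k-t-b, written without diacritics.
words = ["كتب", "كتاب", "كاتب"]

for n, label in ((2, "digram"), (3, "trigram")):
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            score = dice_similarity(first, second, n)
            print(f"{label}: {first} / {second} = {score:.2f}")
```

With these three words, كتاب and كتب share one digram out of five distinct digrams (score 0.40) and no trigrams, while كتاب and كاتب share no digrams at all, even though all three come from the same root. A letter inserted inside the stem breaks every N-gram that spans the insertion point, and longer N-grams are affected more often than shorter ones.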

Keywords:
Conflation, Trigram, Computer science, Natural language processing, Context (archaeology), n-gram, Artificial intelligence, Arabic, Recall, Information retrieval, Matching (statistics), Linguistics, Language model, Mathematics, History, Statistics

Metrics

Cited By: 54
FWCI (Field Weighted Citation Impact): 3.48
Refs: 20
Citation Normalized Percentile: 0.93

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Algorithms and Data Compression
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

Using Word N-Grams as Features in Arabic Text Classification

Abdulmohsen Al-Thubaity, Muneera Alhoshan, Itisam Hazzaa

Studies in computational intelligence   Year: 2014   Pages: 35-43

JOURNAL ARTICLE

Text Mining Using N-Grams

Matthias Schonlau, Nick Guenther

Journal: SSRN Electronic Journal   Year: 2016

BOOK-CHAPTER

Probabilistic retrieval of OCR degraded text using N-grams

Stuart M. Harding, W. Bruce Croft, Carl Weir

Lecture notes in computer science Year: 1997 Pages: 345-359