JOURNAL ARTICLE

Development of a Bangla Sense Annotated Corpus for Word Sense Disambiguation

Abstract

A sense-annotated corpus is an essential resource for lexicon development, morphological processing, and evaluating the performance of word sense disambiguation (WSD) systems. In this paper, a Bangla sense-annotated corpus is generated from a raw collection of Bangla text by retrieving only those sentences that contain at least one ambiguous Bangla word. Every word form in the stored sentences is tagged with its root word form and part-of-speech (POS) type, and each detected ambiguous word is additionally tagged with its actual sense. The corpus initially contains 5028 annotated Bangla sentences, and the overall performance of the corpus creation system is 86.95%.
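The sentence-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lexicon, the sense labels, the whitespace tokenizer, and the names `AMBIGUOUS_LEXICON`, `TokenAnnotation`, `select_sentences`, and `make_annotation_slots` are all assumptions made for illustration. It also matches surface forms exactly, whereas a real system would presumably need root-form lookup to catch inflected words.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical toy lexicon (NOT from the paper): ambiguous word -> candidate senses.
AMBIGUOUS_LEXICON: Dict[str, List[str]] = {
    "কাল": ["time", "tomorrow_or_yesterday"],
}


@dataclass
class TokenAnnotation:
    """One annotation slot per word form, mirroring the fields named in the abstract."""
    surface: str
    root: Optional[str] = None      # root word form, to be filled by a stemmer/annotator
    pos: Optional[str] = None       # POS type, to be filled by a tagger/annotator
    sense: Optional[str] = None     # actual sense, only for ambiguous words
    is_ambiguous: bool = False


def tokenize(sentence: str) -> List[str]:
    # Naive tokenizer (assumption): drop the Bangla full stop and split on whitespace.
    return sentence.replace("।", " ").split()


def select_sentences(raw_sentences: List[str],
                     lexicon: Dict[str, List[str]]) -> List[List[str]]:
    # Keep only sentences containing at least one ambiguous word.
    selected = []
    for sentence in raw_sentences:
        tokens = tokenize(sentence)
        if any(token in lexicon for token in tokens):
            selected.append(tokens)
    return selected


def make_annotation_slots(tokens: List[str],
                          lexicon: Dict[str, List[str]]) -> List[TokenAnnotation]:
    # Create empty annotation records and flag the words that need a sense tag.
    return [TokenAnnotation(surface=t, is_ambiguous=t in lexicon) for t in tokens]


raw_corpus = ["আমি কাল যাব।", "সে ভাত খায়।"]
kept = select_sentences(raw_corpus, AMBIGUOUS_LEXICON)
slots = make_annotation_slots(kept[0], AMBIGUOUS_LEXICON)
```

Here only the first toy sentence is kept, because it contains the lexicon entry, and its middle token is flagged for sense tagging. The `root`, `pos`, and `sense` fields are left empty on purpose, since the abstract does not say how those tags are produced.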

Keywords:
Word-sense disambiguation; Bengali; Computer science; SemEval; Natural language processing; Sense (electronics); Artificial intelligence; Word (group theory); Linguistics; WordNet; Engineering

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.31
Refs: 9
Citation Normalized Percentile: 0.68

Topics

Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Speech and dialogue systems (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

A Sense Annotated Corpus for All-Words Urdu Word Sense Disambiguation

Ali Saeed, Rao Muhammad Adeel Nawab, Mark Stevenson, Paul Rayson

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing Year: 2019 Vol: 18 (4) Pages: 1-14
JOURNAL ARTICLE

Building Kashmiri Sense Annotated Corpus and its Usage in Supervised Word Sense Disambiguation

Tawseef Ahmad Mir, Aadil Ahmad Lawaye, Parveen Rana, Ghayas Ahmed

Journal: Indian Journal of Science and Technology Year: 2023 Vol: 16 (13) Pages: 1021-1029
JOURNAL ARTICLE

Word Sense Disambiguation Corpus Development for Romanian Language

Liviu Andrei Scutelnicu

Journal: Procedia Computer Science Year: 2023 Vol: 225 Pages: 822-831