Word Sense Disambiguation in Bangla Language Using Supervised Methodology with Necessary Modifications
Article Type
Research Article
Publication Title
Journal of The Institution of Engineers (India): Series B
Abstract
This paper reports how a supervised methodology has been adapted, with necessary modifications, for the task of word sense disambiguation in Bangla. At the initial stage, the Naïve Bayes probabilistic model, adopted as a baseline method for sense classification, yields a moderate result of 81% accuracy when applied to a database of the 19 (nineteen) most frequently used ambiguous Bangla words. On an experimental basis, the baseline method is modified with two extensions: (a) inclusion of a lemmatization process in the system, and (b) bootstrapping of the operational process. As a result, the accuracy of the method improves slightly to 84%, which is a positive signal for the whole process of disambiguation, as it opens scope for further modification of the existing method for better results. The data sets used for this experiment include the Bangla POS-tagged corpus obtained from the Indian Languages Corpora Initiative, and the Bangla WordNet, an online sense inventory developed at the Indian Statistical Institute, Kolkata. The paper also reports the challenges and pitfalls of the work that have been closely observed and addressed to achieve the expected level of accuracy.
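The pipeline the abstract describes (a Naïve Bayes sense classifier, extended with lemmatization and bootstrapping) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the English toy data for the word "bank", the sense labels, the dictionary-based `lemma_table` standing in for a real Bangla lemmatizer, the add-one smoothing, and the `margin` confidence threshold used for bootstrapping (the abstract does not specify how bootstrapping was carried out).

```python
import math
from collections import Counter, defaultdict


class NaiveBayesWSD:
    """Toy Naive Bayes sense classifier over bag-of-words contexts."""

    def __init__(self, lemma_table=None):
        # Hypothetical lookup table standing in for a real lemmatizer.
        self.lemma_table = lemma_table or {}
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def _features(self, context):
        # Lowercase each token and map it to its lemma when one is known.
        return [self.lemma_table.get(t.lower(), t.lower()) for t in context]

    def train(self, examples):
        # examples: iterable of (context_tokens, sense_label) pairs.
        for context, sense in examples:
            self.sense_counts[sense] += 1
            for word in self._features(context):
                self.word_counts[sense][word] += 1
                self.vocab.add(word)

    def log_scores(self, context):
        # log P(sense) + sum of log P(word | sense), with add-one smoothing.
        total = sum(self.sense_counts.values())
        scores = {}
        for sense, n in self.sense_counts.items():
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            score = math.log(n / total)
            for word in self._features(context):
                score += math.log((self.word_counts[sense][word] + 1) / denom)
            scores[sense] = score
        return scores

    def predict(self, context):
        scores = self.log_scores(context)
        return max(scores, key=scores.get)


def bootstrap(model, unlabeled, margin=0.5, rounds=3):
    """Self-training: add confidently labelled contexts to the training data.

    A context counts as confident when the best sense beats the runner-up
    by at least `margin` in log score (an assumed criterion).
    """
    pool = list(unlabeled)
    for _ in range(rounds):
        confident, rest = [], []
        for context in pool:
            scores = model.log_scores(context)
            ranked = sorted(scores.values(), reverse=True)
            if len(ranked) < 2 or ranked[0] - ranked[1] >= margin:
                confident.append((context, max(scores, key=scores.get)))
            else:
                rest.append(context)
        if not confident:
            break
        model.train(confident)
        pool = rest
    return model


# Hypothetical sense-tagged contexts for the ambiguous word "bank".
labeled = [
    (["river", "water", "boats"], "shore"),
    (["money", "loan", "deposit"], "finance"),
]
lemmas = {"boats": "boat", "loans": "loan", "deposits": "deposit"}

model = NaiveBayesWSD(lemma_table=lemmas)
model.train(labeled)
bootstrap(model, [["money", "interest"], ["river", "fish"]])
```

In this sketch, lemmatization lets inflected context words such as "loans" match the lemma seen in training, and bootstrapping lets the model learn context words such as "interest" that never appear in the labelled data. Whether these mechanisms account for the reported gain from 81% to 84% is not something the toy example can show.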
First Page
519
Last Page
526
DOI
10.1007/s40031-018-0337-5
Publication Date
10-1-2018
Recommended Citation
Pal, Alok Ranjan; Pal, Alok Ranjan; Saha, Diganta; and Dash, Niladri Sekhar, "Word Sense Disambiguation in Bangla Language Using Supervised Methodology with Necessary Modifications" (2018). Journal Articles. 1212.
https://digitalcommons.isical.ac.in/journal-articles/1212