Development of a Windows-Based Software for Building of Bengali Database from Scanned Documents (Soft Copy of Bengali Dictionary).
Date of Submission
December 2008
Date of Award
Winter 12-12-2009
Institute Name (Publisher)
Indian Statistical Institute
Document Type
Master's Dissertation
Degree Name
Master of Technology
Subject Name
Computer Science
Department
Applied Statistics Unit (ASU-Kolkata)
Supervisor
Sarkar, Palash (ASU-Kolkata; ISI)
Abstract (Summary of the Work)
To learn a language, we always need the help of a dictionary in that language. Even for writing a document in any language, it is advantageous to have a dictionary. In the 21st century we no longer rely on pen and paper for writing; we type on a computer to prepare any document. This is true not only for English but for any other language. All languages have their own dictionaries, but these are printed versions in the form of a book. Now there is a need for a soft copy of a dictionary in each subject. The soft copy of a dictionary can be linked to the editor of that language.

In this dissertation we try to build software that serves as a soft copy of a Bengali dictionary, like the soft copy of the Oxford Dictionary available in English. A small GUI is displayed on the screen, where the user types the word whose meaning he/she wants to know. When the user presses the 'search' button of the GUI, the meaning is displayed. There is an option in the GUI so that the user can hear the pronunciation of the word typed in the text area. The software must also provide cross-linking of words, i.e. if the user wants to see the meaning of another word that appears in the meaning of the current word, then simply by clicking on that word he/she can find its meaning. So there is no need to type the word again.

To the best of our knowledge, no soft copy of a Bengali dictionary is available to date. That is why we are trying to make such a dictionary.

1.1 Design of Work

To make a dictionary, we first have to build a Bengali word database. Typing all the words from a Bengali dictionary is cumbersome, so our idea is to use software that can automate the process. For this reason we take the help of Optical Character Recognition (OCR) software for recognition of Bengali words. We expected that preparing a Bengali dictionary database this way would be easy: giving a scanned page of the Bengali dictionary as input to the OCR software gives us the soft copy of that page. In this way we can have the whole dictionary as a soft copy on our machine. However, the OCR output is an unstructured TeX representation of the scanned dictionary pages. We have therefore developed a program which, depending on the pattern of the TeX output, puts that output into the database in a structured way (i.e. segmenting it into different parts). So, using OCR, we have been able to make a Bengali database.

Now comes the second phase of our project, i.e. making a GUI where the user can type a Bengali word to find its meaning, which is already in the database, and displaying that meaning in Bengali fonts. The user writes the word in romanized Bengali. We convert this to the corresponding LaTeX representation and then, by applying a search query to the database, extract the meaning of the word. The TeX representation of the meaning is then converted to the corresponding Bengali fonts and displayed on the screen. The user can also hear the pronunciation of the word from this dictionary, which is one of its advantages over a hard copy of a dictionary.
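The two phases described in the abstract (segmenting OCR output into a database, then looking up a typed word and cross-linking words in its meaning) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the record does not state the programming language, the database, or the pattern of the TeX output. The " :: " separator, the table layout, the function names, and the romanized sample entries below are all assumptions made for illustration.

```python
import sqlite3

# Made-up romanized placeholder entries standing in for the OCR's TeX output.
# The " :: " separator is an assumption; the real output pattern is not given.
RAW_LINES = [
    "jal :: pani",
    "pani :: jal, bari",
    "garbled line without separator",
]


def segment(line):
    """Split one raw line into (headword, meaning); return None if it does not match."""
    head, sep, meaning = line.partition(" :: ")
    if not sep or not head.strip() or not meaning.strip():
        return None
    return head.strip(), meaning.strip()


def build_database(lines):
    """Store every line that segments cleanly in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (headword TEXT PRIMARY KEY, meaning TEXT NOT NULL)"
    )
    rows = [entry for entry in map(segment, lines) if entry is not None]
    conn.executemany("INSERT OR REPLACE INTO entries VALUES (?, ?)", rows)
    conn.commit()
    return conn


def lookup(conn, word):
    """Return the stored meaning of a headword, or None if it is not in the database."""
    row = conn.execute(
        "SELECT meaning FROM entries WHERE headword = ?", (word,)
    ).fetchone()
    return row[0] if row else None


def cross_links(conn, meaning):
    """List the words of a meaning that are themselves headwords (clickable in a GUI)."""
    words = [w.strip(" ,.;") for w in meaning.split()]
    return [w for w in words if w and lookup(conn, w) is not None]


if __name__ == "__main__":
    conn = build_database(RAW_LINES)
    meaning = lookup(conn, "pani")
    print(meaning, cross_links(conn, meaning))
```

In this sketch, lines that do not fit the assumed pattern are skipped rather than stored, which is one simple way to cope with OCR errors; the conversion from romanized input to a TeX representation and the rendering in Bengali fonts are left out.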
Control Number
ISI-DISS-2008-155
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
DOI
http://dspace.isical.ac.in:8080/jspui/handle/10263/6324
Recommended Citation
Adhya, Tilak Kumar, "Development of a Windows-Based Software for Building of Bengali Database from Scanned Documents (Soft Copy of Bengali Dictionary)." (2009). Master’s Dissertations. 200.
https://digitalcommons.isical.ac.in/masters-dissertations/200
Comments
ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843222