Use of Gaussian Pyramid for MSER Based Text Extraction from Scene Image.
Date of Submission
December 2014
Date of Award
Winter 12-12-2015
Institute Name (Publisher)
Indian Statistical Institute
Document Type
Master's Dissertation
Degree Name
Master of Technology
Subject Name
Computer Science
Department
Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)
Supervisor
Bhattacharya, Ujjwal (CVPR-Kolkata; ISI)
Abstract (Summary of the Work)
The potential of automatic text extraction from scene images as an application keeps growing with the advancement of technology, especially since smartphones flooded the market. However, it is a difficult problem because of the enormous variations in lighting conditions, the presence of noise, and similar factors in such images. Researchers are now working extensively towards a robust strategy for this purpose, and a few standard databases of camera-captured scene images are publicly available for reporting the performance of each new strategy. Over the past year, we studied several strategies towards the development of a robust method for extracting scene text from such camera-captured outdoor scenes. In this study, we developed a novel scheme for scene text extraction that applies Gaussian pyramid decomposition to the input image and obtains Maximally Stable Extremal Regions (MSERs) at each level of the pyramid, so that information at different scales is used. We selected only a subset of the MSERs at each level based on a few commonly used rules. We carefully chose a set of weights for combining the selected MSERs from the different levels and formed a combined set of MSERs. These combined MSERs provide the initial guess of possible text regions in the input image. In the next phase, we computed three features (strong edge, stroke width, and edge gradient) for the individual MSERs of the initial guess and designed a rule to discard the non-text MSERs of the combined set. The proposed method is naturally scale-insensitive to a reasonable extent. Moreover, it is script-independent. Experimental results have been obtained on the ICDAR 2003 competition dataset. Additionally, we tested the approach on several locally captured outdoor scene images that contain Bangla and/or Devanagari text. Finally, we compared the performance of the proposed method with three other state-of-the-art approaches.
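As a rough illustration of the multi-scale idea in the abstract, the following minimal Python sketch builds a Gaussian pyramid and maps a region's bounding box from a pyramid level back to the coordinates of the input image. It is not the dissertation's implementation. The 5-tap binomial kernel, the factor-of-two downsampling, and the names `gaussian_pyramid` and `to_base_coords` are assumptions chosen for illustration. MSER detection itself, the selection rules, and the combination weights are not specified in the abstract and are not reproduced here.

```python
from typing import List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x, y, width, height)

# Assumed smoothing kernel: 5-tap binomial approximation of a Gaussian.
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)
OFFSETS = (-2, -1, 0, 1, 2)


def _clamp(i: int, n: int) -> int:
    """Clamp an index into [0, n - 1] (replicates border pixels)."""
    return max(0, min(n - 1, i))


def blur(img: Image) -> Image:
    """Separable blur: one horizontal pass, then one vertical pass."""
    h, w = len(img), len(img[0])
    tmp = [[sum(k * row[_clamp(x + d, w)] for d, k in zip(OFFSETS, KERNEL))
            for x in range(w)] for row in img]
    return [[sum(k * tmp[_clamp(y + d, h)][x] for d, k in zip(OFFSETS, KERNEL))
             for x in range(w)] for y in range(h)]


def downsample(img: Image) -> Image:
    """Blur, then keep every second row and column."""
    return [row[::2] for row in blur(img)[::2]]


def gaussian_pyramid(img: Image, levels: int) -> List[Image]:
    """Return [level 0 (input), level 1, ...] with at most `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        top = pyramid[-1]
        if min(len(top), len(top[0])) < 2:
            break  # too small to halve again
        pyramid.append(downsample(top))
    return pyramid


def to_base_coords(box: Box, level: int) -> Box:
    """Scale a box found at `level` back to level-0 pixel coordinates."""
    s = 2 ** level
    x, y, w, h = box
    return (x * s, y * s, w * s, h * s)


if __name__ == "__main__":
    image = [[10.0] * 8 for _ in range(8)]
    pyr = gaussian_pyramid(image, 3)
    print([(len(p), len(p[0])) for p in pyr])
    print(to_base_coords((3, 1, 2, 2), 2))
```

In a pipeline of the kind the abstract describes, a region detector would run on each entry of the pyramid, and `to_base_coords` would bring the detected regions into a common coordinate frame before they are combined.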
Control Number
ISI-DISS-2014-279
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
DOI
http://dspace.isical.ac.in:8080/jspui/handle/10263/6435
Recommended Citation
Bhattacharyya, Abhidip, "Use of Gaussian Pyramid for MSER Based Text Extraction from Scene Image." (2015). Master’s Dissertations. 260.
https://digitalcommons.isical.ac.in/masters-dissertations/260
Comments
ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843284