Offline cursive Bengali word recognition using CNNs with a recurrent model
Document Type
Conference Article
Publication Title
Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR
Abstract
This paper deals with offline handwritten word recognition for a major Indic script: Bengali. Because of the structure of this script, the characters (mostly ortho-syllables) frequently overlap and are hard to segment, especially when the writing is cursive. Recognizing characters individually and then combining the outputs can increase the likelihood of errors. A better approach can be to pass the whole word to a suitable recognizer. Here we use a Convolutional Neural Network (CNN) integrated with a recurrent model for this purpose. Long short-term memory (LSTM) blocks are used as hidden units, and the CNN-derived features are fed to the recurrent model with a Connectionist Temporal Classification (CTC) layer to produce the output. We have tested our method on three datasets: (a) a publicly available dataset, (b) a new dataset generated by our research group, and (c) an unconstrained dataset. Dataset (a) contains 17,091 words, our dataset (b) contains 107,550 words in total, and dataset (c) comprises 5,223 words. We have compared our results with those of some earlier work in the area and have found improved performance, which we attribute to the novel integration of CNNs with the recurrent model.
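To illustrate how a CTC output layer can yield a label sequence without segmenting the word into characters, the following is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is not taken from the paper: the decoding strategy, the function name `ctc_greedy_decode`, the choice of index 0 as the blank label, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: label index 0 is reserved for the CTC blank


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Collapse per-frame scores into a label sequence (best-path decoding).

    frame_scores[t][k] is the score of label k at time step t, for example
    the per-frame output of a recurrent layer over CNN-derived features.
    """
    # Step 1: pick the highest-scoring label at every time step.
    best_path = [max(range(len(scores)), key=lambda k: scores[k])
                 for scores in frame_scores]

    # Step 2: merge consecutive repeats, then drop blanks.
    decoded: List[int] = []
    previous = None
    for label in best_path:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Toy example with three labels (0 = blank, 1 and 2 = hypothetical characters).
toy_scores = [
    [0.1, 0.8, 0.1],    # label 1
    [0.1, 0.8, 0.1],    # label 1 again (repeat, merged)
    [0.9, 0.05, 0.05],  # blank separates two genuine occurrences of label 1
    [0.1, 0.8, 0.1],    # label 1
    [0.1, 0.1, 0.8],    # label 2
    [0.1, 0.1, 0.8],    # label 2 again (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
]
result = ctc_greedy_decode(toy_scores)
```

The blank label is what allows the same character to appear twice in a row in the output: repeats are merged only when no blank lies between them.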
First Page
429
Last Page
434
DOI
10.1109/ICFHR.2016.0086
Publication Date
7-2-2016
Recommended Citation
Adak, Chandranath; Chaudhuri, Bidyut B.; and Blumenstein, Michael, "Offline cursive Bengali word recognition using CNNs with a recurrent model" (2016). Conference Articles. 761.
https://digitalcommons.isical.ac.in/conf-articles/761