"Offline cursive Bengali word recognition using CNNs with a recurrent m" by Chandranath Adak, Bidyut B. Chaudhuri et al.
 

Offline cursive Bengali word recognition using CNNs with a recurrent model

Document Type

Conference Article

Publication Title

Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR

Abstract

This paper addresses offline handwritten word recognition for Bengali, a major Indic script. Because of the script's structure, characters (mostly ortho-syllables) frequently overlap and are hard to segment, especially in cursive writing. Recognizing characters individually and then combining the outputs can increase the likelihood of errors. A better approach is to send the whole word to a suitable recognizer. For this purpose we use a Convolutional Neural Network (CNN) integrated with a recurrent model whose hidden units are long short-term memory (LSTM) blocks. The CNN-derived features are fed to the recurrent model, and a Connectionist Temporal Classification (CTC) layer produces the output. We tested our method on three datasets: (a) a publicly available dataset containing 17,091 words, (b) a new dataset generated by our research group containing 107,550 words in total, and (c) an unconstrained dataset comprising 5,223 words. Compared with some earlier work in the area, our method shows improved performance, which we attribute to the novel integration of CNNs with the recurrent model.
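The abstract states that a CTC layer produces the final output but does not say how label sequences are read off the per-frame scores. As a minimal illustration, assuming the common best-path (greedy) decoding rule and not necessarily the decoder used in the paper, the sketch below picks the top label at each time step, merges consecutive repeats, and drops the blank symbol. The function name `ctc_greedy_decode`, the blank index `0`, and the toy score matrix are all hypothetical choices made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the best-scoring label index for every time step.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # Merge runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks; a blank between two equal labels keeps them distinct.
    return [label for label in collapsed if label != blank]


# Toy scores for 6 frames over 3 classes (0 = blank, 1 and 2 = characters).
scores = [
    [0.10, 0.80, 0.10],
    [0.10, 0.70, 0.20],
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.10, 0.20, 0.70],
    [0.60, 0.20, 0.20],
]
decoded = ctc_greedy_decode(scores)
print(decoded)
```

Because the blank in the third frame separates two runs of label 1, the repeated character survives decoding instead of being merged, which is what lets a CTC output represent doubled characters without explicit segmentation.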

First Page

429

Last Page

434

DOI

10.1109/ICFHR.2016.0086

Publication Date

7-2-2016
