DenseNet-CTC: An end-to-end RNN-free architecture for context-free string recognition
Computer Vision and Image Understanding
String recognition is a challenging task in document analysis and recognition. With the recent surge of interest in end-to-end, segmentation-free methods, CRNN (Convolutional Recurrent Neural Network), which combines a CNN (Convolutional Neural Network) with RNN-CTC (Recurrent Neural Network with Connectionist Temporal Classification), has been widely applied to string recognition. However, in context-free cases such as digit strings, where a character may be followed by any other character, the strings contain few or no contextual links. In this paper, we propose a new end-to-end RNN-free architecture designed for context-free string recognition and apply it to the Handwritten Digit String Recognition (HDSR) task. The proposed architecture is based on CNN and CTC but uses no RNN: column-wise fully connected layers connect the convolutional layers directly to CTC. Moreover, to compensate for the possible loss of modeling capability caused by the absence of an RNN, we use densely connected convolutional layers to extract effective features. We test this architecture on three public HDSR benchmarks (ORAND-CAR-A, ORAND-CAR-B and CVL HDS) and on three other datasets: PhPAIS, a handwritten telephone number and postcode dataset, and two non-Arabic digit datasets (C-Bangla and C-Hindi). Furthermore, we generate three handwritten digit string datasets to further analyze the influence of the RNN. The recognition results on all datasets demonstrate the superiority of the proposed model.
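As an illustration of the pipeline the abstract describes, the following is a minimal pure-Python sketch of how per-column features could be mapped to class scores and then decoded without an RNN. It is not the authors' implementation. The function names, the blank index, the assumption that one set of weights is shared across all columns, and the use of greedy (best-path) CTC decoding are all choices made for this example; the abstract does not specify them.

```python
from typing import List, Sequence

# Assumption for this sketch: digits 0-9 use class indices 0..9 and the
# CTC blank symbol uses index 10.
BLANK = 10


def columnwise_fc(
    features: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[List[float]]:
    """Apply one linear layer independently to every column feature vector.

    features: one feature vector per image column (time step).
    weights:  one row of weights per output class (assumed shared by all columns).
    bias:     one bias value per output class.
    """
    return [
        [sum(w * x for w, x in zip(row, column)) + b for row, b in zip(weights, bias)]
        for column in features
    ]


def ctc_greedy_decode(scores: Sequence[Sequence[float]], blank: int = BLANK) -> str:
    """Best-path CTC decoding: take the top class in each column,
    merge consecutive repeats, then drop blanks."""
    best_path = [max(range(len(col)), key=col.__getitem__) for col in scores]
    decoded = []
    previous = None
    for label in best_path:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return "".join(str(label) for label in decoded)
```

Because each column is scored independently, no step in this sketch passes information between neighbouring columns, which is the role an RNN would otherwise play. In the sketch, a blank between two identical labels is what allows a repeated digit such as "11" to be emitted.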
Zhan, Hongjian; Lyu, Shujing; Lu, Yue; and Pal, Umapada, "DenseNet-CTC: An end-to-end RNN-free architecture for context-free string recognition" (2021). Journal Articles. 2059.