Video scene text frames categorization for text detection and recognition
Document Type
Conference Article
Publication Title
Proceedings - International Conference on Pattern Recognition
Abstract
Developing a unified text detection and recognition method for different video types is hard because video characteristics vary widely. This paper proposes a new method for categorizing video text frames by type, namely advertisement, signboard, license plate, front page of a book or magazine, street view, and general items, to improve text detection and recognition rates. We propose symmetry features based on gradient vector flow, computed on the Canny and Sobel edge images of each input frame, to identify candidate edge components. For each candidate edge component image, we then extract both global and local features using colors from different channels in a new way. In addition, the proposed method extracts statistical and structural features from the spatial distribution of candidate pixels in a multi-scale environment. Finally, the extracted features are fed to a logistic classifier for categorization. The local and global features are tested both separately and together using confusion matrices. The performance of the proposed categorization method is evaluated through several text detection and recognition experiments before and after categorization. We observe that the proposed categorization method improves text detection and recognition performance.
First Page
3886
Last Page
3891
DOI
10.1109/ICPR.2016.7900241
Publication Date
1-1-2016
Recommended Citation
Qin, Longfei; Shivakumara, Palaiahnakote; Lu, Tong; Pal, Umapada; and Tan, Chew Lim, "Video scene text frames categorization for text detection and recognition" (2016). Conference Articles. 813.
https://digitalcommons.isical.ac.in/conf-articles/813