Frame selection for OCR from video stream of book flipping
Multimedia Tools and Applications
Optical Character Recognition (OCR) in video stream of flipping pages is a challenging task because flipping at random speed causes difficulties in identifying the frames that contain the open page image (OPI). Also, low resolution, blurring effect, shadow, etc., add significant noise in selection of proper frames for OCR. In this paper, we focus on identifying a set of representative frames from the video stream of flipping pages without using any explicit hardware and then perform OCR on these frames for recognition. Thus, an end-to-end solution is proposed for video stream of flipping pages. To select an OPI, we present an efficient algorithm that exploits cues from edge information during flipping event. These cues, extracted from the region of interest (ROI) of the frame, determine the flipping or open state of a page. The open state classification is performed by an SVM classifier following training of the edge cue information. After selecting a set of frames for each OPI, a representative frame from OPI set is chosen for OCR. Experiments are performed on videos captured using standard resolution camera. We have obtained 88.81 % accuracy on representative frame selection from the proposed method whereas when compared with GIST (Oliva and Torralba, Int J Comput Vis 42(3):145–175 (2001)), the accuracy was only 51.28 %. To the best of our knowledge this is the first work in this area. After frame selection, we have achieved 83.31 % character recognition accuracy and 78.11 % word recognition accuracy with traditional OCR in our dataset of flipping book.
Chakraborty, Dibyayan; Roy, Partha Pratim; Saini, Rajkumar; Alvarez, Jose M.; and Pal, Umapada, "Frame selection for OCR from video stream of book flipping" (2018). Journal Articles. 1629.