Text component reconstruction for tracking in video
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Text tracking is challenging due to unpredictable variations in orientation, shape, size, and color, as well as loss of information. This paper presents a new method for reconstructing text components from multiple views for tracking. The first step finds Text Candidates (TCs) from multiple views using deep learning. The text candidates are then verified using similarity and dissimilarity measures estimated from SIFT features to eliminate false candidates, which yields Potential Text Candidates (PTCs). The PTCs are then aligned to a standard format using an affine transform. Next, the proposed method applies a mosaicing approach to stitch PTCs from multiple views based on the overlapping regions between them, which produces reconstructed images. Experimental results on a large dataset of multi-view images show that the proposed method is effective and useful. Recognition experiments with several recognition methods show that their performance improves significantly on the reconstructed images compared to the results obtained before reconstruction.
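The alignment step relies on an affine transform estimated from point correspondences between views. As a minimal sketch (not the authors' implementation), an affine transform can be recovered from matched keypoint pairs, such as those produced by SIFT matching, with a simple least-squares fit; the points and transform below are hypothetical:

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares fit of a 2x3 affine matrix M mapping src points to dst points.

    Each correspondence contributes [x, y, 1] @ X ~= [x', y'],
    solved jointly for the 3x2 matrix X; M = X.T.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])   # N x 3 design matrix
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)    # 3 x 2 solution
    return X.T                                     # 2 x 3 affine matrix

# Hypothetical correspondences: points under a known rotation + translation.
theta = np.deg2rad(10)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([5.0, -2.0])
pts = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
warped = pts @ R.T + t

M = estimate_affine(pts, warped)
# M[:, :2] recovers the linear part R, M[:, 2] the translation t.
```

With at least three non-collinear correspondences the fit is exact; with noisy SIFT matches, a robust estimator (e.g. RANSAC) would typically wrap this least-squares step.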
Yuan, Minglei; Shivakumara, Palaiahnakote; Kong, Hao; Lu, Tong; and Pal, Umapada, "Text component reconstruction for tracking in video" (2018). Conference Articles. 133.