A New Transformer-Based Approach for Text Detection in Shaky and Non-shaky Day-Night Video
Document Type
Conference Article
Publication Title
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
Text detection in shaky and non-shaky videos is challenging because of the variations between day and night recording conditions. In addition, moving objects, vehicles, and humans in the video make text detection more challenging than in ordinary natural scene images. Motivated by the representational capacity of transformers, we propose a new transformer-based approach for detecting text in both shaky and non-shaky day-night videos. To reduce the effect of object movement, poor quality, and the other challenges mentioned above, the proposed work explores temporal frames to obtain activation frames based on similarity and dissimilarity measures. For estimating similarity and dissimilarity, our method extracts luminance, contrast, and structural features. The activation frames are fed to a transformer comprising an encoder, a decoder, and a feed-forward network for text detection in shaky and non-shaky day-night video. Since this is the first work on this problem, we create our own dataset for experimentation. To show the effectiveness of the proposed method, experiments are also conducted on the standard ICDAR-2015 video dataset. The results on our dataset and the standard dataset show that the proposed model is superior to state-of-the-art methods in terms of recall, precision, and F-measure.
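The abstract does not spell out the exact similarity formulation, but luminance, contrast, and structural comparisons are the three classical components of SSIM. The sketch below assumes an SSIM-style measure computed globally per frame pair to filter redundant frames; the function names, the threshold value, and the keep-if-dissimilar selection rule are illustrative assumptions, not the authors' published method.

    import numpy as np

    def ssim_components(x: np.ndarray, y: np.ndarray,
                        c1: float = 6.5025, c2: float = 58.5225):
        """Luminance, contrast, and structure comparisons between two
        grayscale frames (SSIM-style, computed over the whole frame
        rather than a sliding window, for brevity). c1 and c2 are the
        standard SSIM stabilizers for 8-bit images."""
        x = x.astype(np.float64)
        y = y.astype(np.float64)
        mu_x, mu_y = x.mean(), y.mean()
        sig_x, sig_y = x.std(), y.std()
        sig_xy = ((x - mu_x) * (y - mu_y)).mean()
        c3 = c2 / 2.0
        luminance = (2 * mu_x * mu_y + c1) / (mu_x**2 + mu_y**2 + c1)
        contrast = (2 * sig_x * sig_y + c2) / (sig_x**2 + sig_y**2 + c2)
        structure = (sig_xy + c3) / (sig_x * sig_y + c3)
        return luminance, contrast, structure

    def select_activation_frames(frames, threshold: float = 0.9):
        """Hypothetical selection rule: keep a frame only if it is
        sufficiently dissimilar from the last selected frame, so the
        retained 'activation frames' carry new information."""
        selected = [frames[0]]
        for frame in frames[1:]:
            l, c, s = ssim_components(selected[-1], frame)
            if l * c * s < threshold:  # dissimilar enough -> informative
                selected.append(frame)
        return selected

    # Toy usage on synthetic 64x64 grayscale frames.
    rng = np.random.default_rng(0)
    frames = [rng.integers(0, 256, (64, 64)) for _ in range(10)]
    print(len(select_activation_frames(frames)))

In this reading, the selected activation frames would then be the input sequence to the paper's transformer encoder-decoder for text detection; the threshold controlling how aggressively near-duplicate frames are dropped is a tunable assumption here.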
First Page
30
Last Page
44
DOI
10.1007/978-3-031-47637-2_3
Publication Date
1-1-2023
Recommended Citation
Halder, Arnab; Shivakumara, Palaiahnakote; Pal, Umapada; Lu, Tong; and Blumenstein, Michael, "A New Transformer-Based Approach for Text Detection in Shaky and Non-shaky Day-Night Video" (2023). Conference Articles. 554.
https://digitalcommons.isical.ac.in/conf-articles/554