Journal Articles

Zone-based keyword spotting in Bangla and Devanagari documents

Ayan Kumar Bhunia, Institute of Engineering and Management
Partha Pratim Roy, Indian Institute of Technology Roorkee
Aneeshan Sain, Institute of Engineering and Management
Umapada Pal, Indian Statistical Institute, Kolkata

Article Type

Research Article

Publication Title

Multimedia Tools and Applications

Abstract

In this paper, we present a word spotting system for text lines in offline Indic scripts such as Bangla (Bengali) and Devanagari. Recently, it was shown that zone-wise recognition improves word recognition performance over conventional full-word recognition in Indic scripts such as Bangla, Devanagari, and Gurumukhi (Roy et al. in Pattern Recogn 60: 1057–1075, 26; Bhunia et al. in Pattern Recogn 79: 12–31, 6). Inspired by this idea, we adopt the zone segmentation approach and use middle-zone information to improve traditional word spotting performance. To avoid the problems of heuristic zone segmentation, we propose a new HMM-based approach to segment the upper- and lower-zone components from text line images. Candidate keywords are searched for in a line without segmenting characters or words. We also propose a feature that combines foreground and background information of text line images for keyword spotting with character filler models. Using both foreground and background information yields a significant improvement in performance over using either one alone. The Pyramid Histogram of Oriented Gradient (PHOG) feature is used in our word spotting framework. Experiments show that the proposed zone-segmentation-based system outperforms traditional word spotting approaches.

First Page

27365

Last Page

27389

DOI

10.1007/s11042-019-08442-y

Publication Date

10-1-2020

Comments

Open Access, Green

Recommended Citation

Bhunia, Ayan Kumar; Roy, Partha Pratim; Sain, Aneeshan; and Pal, Umapada, "Zone-based keyword spotting in Bangla and Devanagari documents" (2020). Journal Articles. 124.
https://digitalcommons.isical.ac.in/journal-articles/124

