Hand-written and machine-printed text classification in architecture, engineering and construction documents
Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR
In the AEC (Architecture, Engineering & Construction) industry, drawing documents serve as blueprints that facilitate the construction process. They also act as a graphical language that communicates ideas and information from one mind to another. Text in AEC documents appears in both machine-printed and hand-written form. Because the recognition algorithms for machine-printed and hand-written text differ, the two types must be distinguished before a document is sent to the respective recognition system. In this paper we propose a novel approach for classifying machine-printed and hand-written text in AEC documents. Before classification, our system applies preprocessing that includes binarization, text-graphics separation and word segmentation. Words are segmented based on certain structural properties of the Isothetic Covers (IC) that tightly enclose the words in a document. The grid-size properties of the IC are selected by statistical analysis of the connected components of the document. Word-level Gabor filter based features are then extracted with spooling information for classification. A standard SVM-based classifier is used to classify the text. The task is performed at the word level of AEC documents, and we achieve an overall accuracy of 98.45%.
Das, Supriya; Banerjee, Purnendu; Seraogi, Bhagesh; Majumder, Himadri; Mukkamala, Srinivas; Roy, Rahul; and Chaudhuri, Bidyut Baran, "Hand-written and machine-printed text classification in architecture, engineering and construction documents" (2018). Conference Articles. 24.
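The word-level Gabor feature step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the kernel parameters (wavelength, sigma, gamma, kernel size), the choice of four orientations, and the use of the mean and standard deviation of the absolute response as features are all assumptions made for illustration. It uses only the Python standard library, and the SVM stage is omitted; the resulting vector would be passed to whatever SVM library is in use.

```python
import math

def gabor_kernel(theta, wavelength=4.0, sigma=2.0, gamma=0.5, half=3):
    """Real part of a Gabor kernel, as a (2*half+1) x (2*half+1) nested list.
    All parameter defaults are illustrative assumptions."""
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    # Subtract the mean so that flat image regions give a zero response.
    mean = sum(map(sum, kernel)) / (len(kernel) ** 2)
    return [[v - mean for v in row] for row in kernel]

def filter_response(image, kernel):
    """Valid-mode 2-D correlation of a grayscale image with a square kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            s = 0.0
            for a in range(k):
                for b in range(k):
                    s += image[i + a][j + b] * kernel[a][b]
            row.append(s)
        out.append(row)
    return out

def gabor_features(image, n_orientations=4):
    """Mean and standard deviation of |response| for each orientation."""
    feats = []
    for n in range(n_orientations):
        theta = n * math.pi / n_orientations
        resp = filter_response(image, gabor_kernel(theta))
        vals = [abs(v) for row in resp for v in row]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        feats.extend([mean, math.sqrt(var)])
    return feats

# Toy "word image": 16x16 vertical stripes with a period of 4 pixels.
stripes = [[1.0 if j % 4 < 2 else 0.0 for j in range(16)] for _ in range(16)]
feats = gabor_features(stripes)
print(len(feats), feats[0] > feats[4])
```

On the toy stripe image, the filter whose sinusoid varies horizontally (theta = 0) responds far more strongly than the one that varies vertically (theta = pi/2). This kind of orientation-dependent response is what lets Gabor features capture stroke regularity, which tends to differ between printed and hand-written words.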