Granulated RCNN and Multi-Class Deep SORT for Multi-Object Detection and Tracking

Article Type

Research Article

Publication Title

IEEE Transactions on Emerging Topics in Computational Intelligence


In this article, two new models, granulated RCNN (G-RCNN) and multi-class deep SORT (MCD-SORT), are developed for object detection and tracking, respectively, from videos. Object detection has two stages: object localization (extraction of regions of interest, RoIs) and classification. G-RCNN improves on the well-known Fast RCNN and Faster RCNN in extracting RoIs by incorporating the concept of granulation in a deep convolutional neural network. Granulation with spatio-temporal information enables more accurate extraction of RoIs (object regions) in unsupervised mode. Compared to Fast and Faster RCNNs, G-RCNN uses (i) granules (clusters) formed over the pooling feature map, instead of all its feature values, in defining RoIs, (ii) only the positive RoIs during training, instead of the whole RoI-map, (iii) videos directly as input, rather than static images, and (iv) only the objects in RoIs, instead of the entire feature map, for performing object classification. Together, these lead to improved real-time detection speed and accuracy. MCD-SORT is an advanced form of the popular Deep SORT. In MCD-SORT, the search for associations between objects and trajectories is restricted to the same category, which improves performance in multi-class tracking. These characteristic features are demonstrated over 37 videos containing single-class, two-class, and multi-class objects. The superiority of the models over several state-of-the-art methodologies is also established extensively, both qualitatively and quantitatively.
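
The class-restricted association described for MCD-SORT can be illustrated with a minimal sketch. This is not the authors' implementation: the tuple layout `(class_label, box)`, the function names `iou` and `associate_by_class`, and the `min_iou` threshold are all hypothetical, and greedy IoU matching is used here as a simple stand-in for the appearance and motion costs that Deep SORT-style trackers combine. Only the idea taken from the abstract is shown: a detection is compared solely against trajectories of its own category.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate_by_class(tracks, detections, min_iou=0.3):
    """Match detections to tracks, searching only within the same class.

    tracks, detections: lists of (class_label, box) tuples (assumed layout).
    Returns (matches, unmatched_track_indices, unmatched_detection_indices),
    where matches is a sorted list of (track_index, detection_index) pairs.
    """
    # Group indices by class so cross-class pairs are never even scored.
    tracks_by_class = defaultdict(list)
    for t_idx, (label, _) in enumerate(tracks):
        tracks_by_class[label].append(t_idx)
    dets_by_class = defaultdict(list)
    for d_idx, (label, _) in enumerate(detections):
        dets_by_class[label].append(d_idx)

    matches = []
    for label, d_indices in dets_by_class.items():
        t_indices = tracks_by_class.get(label, [])
        # Score every same-class (track, detection) pair.
        pairs = [
            (iou(tracks[t][1], detections[d][1]), t, d)
            for t in t_indices
            for d in d_indices
        ]
        # Greedy stand-in for an optimal assignment: best overlap first.
        pairs.sort(key=lambda p: p[0], reverse=True)
        used_t, used_d = set(), set()
        for score, t, d in pairs:
            if score < min_iou:
                break  # remaining pairs overlap even less
            if t in used_t or d in used_d:
                continue
            matches.append((t, d))
            used_t.add(t)
            used_d.add(d)

    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_dets = [i for i in range(len(detections)) if i not in matched_d]
    return sorted(matches), unmatched_tracks, unmatched_dets


if __name__ == "__main__":
    tracks = [("car", (0, 0, 10, 10)), ("person", (0, 0, 10, 10))]
    detections = [
        ("person", (1, 1, 11, 11)),
        ("car", (1, 0, 11, 10)),
        ("dog", (0, 0, 10, 10)),  # overlaps both tracks, but class differs
    ]
    print(associate_by_class(tracks, detections))
    # prints ([(0, 1), (1, 0)], [], [2])
```

In this toy input the "dog" detection coincides exactly with both existing boxes, yet it stays unmatched because no trajectory shares its category. A fuller tracker would likely replace the greedy loop with an optimal assignment (for example `scipy.optimize.linear_sum_assignment`) solved separately per class.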
