Not too deep CNN for face detection in real life scenario

Document Type

Conference Article

Publication Title

Communications in Computer and Information Science


This article presents our recent study of a moderately deep neural network architecture for detecting faces of widely varying sizes and orientations. One goal of this work is to achieve sufficiently low latency and acceptable true detection rates on low-resolution video or still image data. Several attempts have been made over the years to design a robust and generic face detection system. However, owing to the inherent complexity of the problem, localization of faces in complex and low-quality images remains an open problem. Typical challenges with such data include visual variations due to lighting conditions, facial expressions, occlusion, etc. Moreover, existing state-of-the-art systems usually involve very large network architectures that require substantial computational resources for training. In the present work, we have designed a moderately deep Convolutional Neural Network (CNN) architecture suitable for use on commonly available computing devices. We have also proposed some simple strategies for calibrating the bounding box, which is trained to localize a face even in poor lighting conditions and in various typical occlusion scenarios. The CNN of the proposed framework receives an input image at three different resolutions in order to detect faces of various sizes. Simulation results of the proposed approach on the publicly available “WIDER FACE” database and on another database of 27,576 images/video frames collected by us establish its effectiveness in certain real-life scenarios.
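As a rough illustration of the three-resolution input described in the abstract, the following minimal sketch builds three versions of a grayscale image and maps a detected box back to original coordinates. It is not the authors' implementation: the scale factors (1, 2, 4), the average-pooling downsampler, and the names `downsample`, `three_scale_inputs`, and `box_to_original` are all assumptions made for illustration, since the abstract does not specify them.

```python
def downsample(image, factor):
    """Average-pool a grayscale image (list of rows) by an integer factor.

    Hypothetical helper: the abstract does not say how the lower
    resolutions are produced.
    """
    rows = len(image) // factor
    cols = len(image[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Collect the factor x factor block that maps to output pixel (r, c).
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def three_scale_inputs(image, factors=(1, 2, 4)):
    """Return (factor, image) pairs at three resolutions.

    The factors are assumed values, not taken from the paper.
    """
    return [(f, downsample(image, f)) for f in factors]


def box_to_original(box, factor):
    """Map an (x, y, width, height) box found at a reduced scale
    back to the coordinates of the full-resolution image."""
    x, y, w, h = box
    return (x * factor, y * factor, w * factor, h * factor)


if __name__ == "__main__":
    # Toy 4 x 8 "image" with a simple intensity gradient.
    image = [[float(x + y) for x in range(8)] for y in range(4)]
    for factor, scaled in three_scale_inputs(image):
        print(factor, len(scaled), "x", len(scaled[0]))
```

Under these assumptions, a detector with a fixed receptive field would respond to larger faces in the coarser inputs, and `box_to_original` rescales such detections so that boxes from all three inputs share one coordinate frame.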


