Detecting faces with deep learning is easy; detecting faces at very different scales with the same neural network is hard. By combining deep learning with a classical learning method in a novel way, the authors detect facial regions at multiple scales. This multi-scale method sets a new state of the art on the "hard" partition of the WIDER FACE dataset.
New State of the Art for Face Detection with Significant Scale Differences
Post · Mar 28, 2017 20:17
Large-scale variations still pose a challenge in unconstrained face detection. To the best of our knowledge, no current face detection algorithm can detect a face as large as 800 x 800 pixels while simultaneously detecting another one as small as 8 x 8 pixels within a single image with equally high accuracy. We propose a two-stage cascaded face detection framework, Multi-Path Region-based Convolutional Neural Network (MP-RCNN), that seamlessly combines a deep neural network with a classic learning strategy, to tackle this challenge. The first stage is a Multi-Path Region Proposal Network (MP-RPN) that proposes faces at three different scales. It simultaneously utilizes three parallel outputs of the convolutional feature maps to predict multi-scale candidate face regions. The "atrous" convolution trick (convolution with up-sampled filters) and a newly proposed sampling layer for "hard" examples are embedded in MP-RPN to further boost its performance. The second stage is a Boosted Forests classifier, which utilizes deep facial features pooled from inside the candidate face regions as well as deep contextual features pooled from a larger region surrounding the candidate face regions. This step is included to further remove hard negative samples. Experiments show that this approach achieves state-of-the-art face detection performance on the WIDER FACE dataset "hard" partition, outperforming the former best result by 9.6% for the Average Precision.
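To make the "atrous" convolution trick mentioned in the abstract more concrete, here is a minimal pure-Python sketch of the general technique: the filter is up-sampled by inserting zeros between its taps, then applied as an ordinary convolution. This is only an illustration of the idea, not the authors' MP-RPN code. The function names (dilate_filter, correlate2d_valid, atrous_conv2d), the toy 5 x 5 image, and the dilation rate of 2 are all assumptions made for this example.

```python
def dilate_filter(kernel, rate):
    """Up-sample a 2D filter by inserting (rate - 1) zeros between its taps."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (k_h - 1) * rate + 1
    out_w = (k_w - 1) * rate + 1
    dilated = [[0.0] * out_w for _ in range(out_h)]
    for i in range(k_h):
        for j in range(k_w):
            dilated[i * rate][j * rate] = kernel[i][j]
    return dilated


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation (what CNN layers call convolution)."""
    i_h, i_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out = []
    for y in range(i_h - k_h + 1):
        row = []
        for x in range(i_w - k_w + 1):
            acc = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[y + i][x + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def atrous_conv2d(image, kernel, rate):
    """Atrous convolution: ordinary convolution with the up-sampled filter."""
    return correlate2d_valid(image, dilate_filter(kernel, rate))


# Toy input (hypothetical): a 5 x 5 "image" whose pixel at (y, x) is 5 * y + x.
image = [[float(5 * y + x) for x in range(5)] for y in range(5)]
ones = [[1.0] * 3 for _ in range(3)]

# With rate 2, the 3 x 3 filter covers a 5 x 5 window but still has 9 weights.
print(atrous_conv2d(image, ones, rate=2))  # [[108.0]]
```

The point of the trick is visible in the toy example: the same nine weights see a 5 x 5 window instead of a 3 x 3 one, so the receptive field grows without adding parameters. Real frameworks usually skip the explicit zero-filled filter and just sample the input at strided offsets, which gives the same result more cheaply.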