Count Objects in an Image with a Mixture of Experts

Mar 29, 2017 17:13

CNN counting MoE stacking

Counting objects in an image is difficult because both density and scale vary. To address this, the authors use an ensembling method known as stacking to create a mixture of experts (MoE). An individual CNN is trained for each sub-problem, and the results are “gated” by a final CNN, which selects the best expert for the current sample.
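To make the gating idea concrete, here is a minimal NumPy sketch of how per-expert count predictions could be combined using per-sample gate scores. It is not the authors' implementation: the function names (`softmax`, `gated_count`), the array shapes, and the toy numbers are all assumptions for illustration, and the CNNs themselves are replaced by precomputed outputs.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gated_count(expert_counts, gate_logits, hard=False):
    """Combine expert predictions with per-sample gate weights.

    expert_counts: (n_samples, n_experts) count predicted by each expert
    gate_logits:   (n_samples, n_experts) raw scores from the gating network
    hard=True picks the single highest-weighted expert per sample;
    hard=False returns the gate-weighted sum of all expert predictions.
    """
    expert_counts = np.asarray(expert_counts, dtype=float)
    weights = softmax(np.asarray(gate_logits, dtype=float))
    if hard:
        best = np.argmax(weights, axis=-1)
        return expert_counts[np.arange(len(best)), best]
    return (weights * expert_counts).sum(axis=-1)

# Hypothetical outputs for two images and three experts
# (stand-ins for, say, sparse, medium, and dense crowd specialists).
counts = np.array([[12.0, 30.0, 55.0],
                   [10.0, 28.0, 60.0]])
logits = np.array([[4.0, 0.0, 0.0],   # gate favors expert 0 for image 0
                   [0.0, 0.0, 4.0]])  # gate favors expert 2 for image 1

print(gated_count(counts, logits))             # soft, weighted combination
print(gated_count(counts, logits, hard=True))  # one expert per image
```

The key point is that the weights depend on the input sample. An ensemble with fixed weights would apply the same weight vector to every image, regardless of its appearance.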

arXiv Abstract

  • Shohei Kumagai
  • Kazuhiro Hotta
  • Takio Kurita

This paper proposes a crowd counting method. Crowd counting is difficult because of large appearance changes of a target, which are caused by density and scale changes. Conventional crowd counting methods generally utilize one predictor (e.g., regression and multi-class classifier). However, such a single predictor cannot count targets with large appearance changes well. In this paper, we propose to predict the number of targets using multiple CNNs specialized to a specific appearance, and those CNNs are adaptively selected according to the appearance of a test image. By integrating the selected CNNs, the proposed method has the robustness to large appearance changes. In experiments, we confirm that the proposed method can count crowd with lower counting error than a CNN and integration of CNNs with fixed weights. Moreover, we confirm that each predictor is automatically specialized to a specific appearance.

