Parallel Feature Pyramid Network for Object Detection

Seung Wook Kim, Hyong Keun Kook, Jee Young Sun, Mun Cheon Kang, Sung-Jea Ko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Recently developed object detectors employ a convolutional neural network (CNN) by gradually increasing the number of feature layers with a pyramidal shape instead of using a featurized image pyramid. However, the different abstraction levels of CNN feature layers often limit the detection performance, especially on small objects. To overcome this limitation, we propose a CNN-based object detection architecture, referred to as a parallel feature pyramid (FP) network (PFPNet), where the FP is constructed by widening the network width instead of increasing the network depth. First, we adopt spatial pyramid pooling and some additional feature transformations to generate a pool of feature maps with different sizes. In PFPNet, the additional feature transformation is performed in parallel, which yields feature maps with similar levels of semantic abstraction across the scales. We then resize the elements of the feature pool to a uniform size and aggregate their contextual information to generate each level of the final FP. The experimental results confirm that PFPNet improves the performance of the latest version of the single-shot multi-box detector (SSD) by 6.4% in mAP and, in particular, by 7.8% in AP for small objects (AP small) on the MS-COCO dataset.
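The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`avg_pool`, `resize_nearest`, `build_parallel_pyramid`), the use of average pooling with window sizes 1, 2, 4, the nearest-neighbour resizing, and channel concatenation as the aggregation step are all assumptions made for illustration. The per-scale feature transformations that PFPNet applies in parallel are left out, and feature maps are assumed to be square.

```python
# Hypothetical sketch of the pooling -> resize -> aggregate flow from the abstract.
# A feature map is a list of channels; each channel is a list of rows (C x H x W).

def avg_pool(fmap, k):
    """Average-pool every channel with a k x k window and stride k."""
    pooled = []
    for ch in fmap:
        h, w = len(ch) // k, len(ch[0]) // k
        pooled.append([
            [sum(ch[i * k + di][j * k + dj]
                 for di in range(k) for dj in range(k)) / (k * k)
             for j in range(w)]
            for i in range(h)
        ])
    return pooled


def resize_nearest(fmap, size):
    """Resize every channel to size x size by nearest-neighbour sampling."""
    resized = []
    for ch in fmap:
        h, w = len(ch), len(ch[0])
        resized.append([
            [ch[i * h // size][j * w // size] for j in range(size)]
            for i in range(size)
        ])
    return resized


def build_parallel_pyramid(base, num_levels=3):
    """Return one aggregated feature map per pyramid level."""
    # Step 1: spatial pyramid pooling gives a pool of maps at sizes H, H/2, H/4, ...
    pool = [avg_pool(base, 2 ** n) for n in range(num_levels)]
    # Step 2 (omitted): per-scale transformations would run in parallel here.
    # Step 3: for each level, bring every pool element to that level's size
    # and aggregate by concatenating channels (an assumed aggregation rule).
    pyramid = []
    for level in pool:
        size = len(level[0])
        merged = []
        for fmap in pool:
            merged.extend(resize_nearest(fmap, size))
        pyramid.append(merged)
    return pyramid
```

With a 2-channel 8 x 8 input and three levels, this sketch returns maps of size 8, 4, and 2, each holding 6 channels, so every level sees information from all scales of the pool.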

Original language: English
Title of host publication: Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
Editors: Vittorio Ferrari, Cristian Sminchisescu, Martial Hebert, Yair Weiss
Publisher: Springer Verlag
Pages: 239-256
Number of pages: 18
ISBN (Print): 9783030012274
DOIs: https://doi.org/10.1007/978-3-030-01228-1_15
Publication status: Published - 2018 Jan 1
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: 2018 Sep 8 to 2018 Sep 14

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11209 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th European Conference on Computer Vision, ECCV 2018
Country: Germany
City: Munich
Period: 2018 Sep 8 to 2018 Sep 14

Fingerprint

Object Detection
Pyramid
Neural networks
Detectors
Semantics
Detector
Pooling
Experimental Results

Keywords

  • Feature pyramid
  • Real-time object detection

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Kim, S. W., Kook, H. K., Sun, J. Y., Kang, M. C., & Ko, S-J. (2018). Parallel Feature Pyramid Network for Object Detection. In V. Ferrari, C. Sminchisescu, M. Hebert, & Y. Weiss (Eds.), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 239-256). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11209 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01228-1_15

@inproceedings{0080cc05602a4abd9e3c60314fab9e2e,
title = "Parallel Feature Pyramid Network for Object Detection",
abstract = "Recently developed object detectors employ a convolutional neural network (CNN) by gradually increasing the number of feature layers with a pyramidal shape instead of using a featurized image pyramid. However, the different abstraction levels of CNN feature layers often limit the detection performance, especially on small objects. To overcome this limitation, we propose a CNN-based object detection architecture, referred to as a parallel feature pyramid (FP) network (PFPNet), where the FP is constructed by widening the network width instead of increasing the network depth. First, we adopt spatial pyramid pooling and some additional feature transformations to generate a pool of feature maps with different sizes. In PFPNet, the additional feature transformation is performed in parallel, which yields feature maps with similar levels of semantic abstraction across the scales. We then resize the elements of the feature pool to a uniform size and aggregate their contextual information to generate each level of the final FP. The experimental results confirm that PFPNet improves the performance of the latest version of the single-shot multi-box detector (SSD) by 6.4{\%} in mAP and, in particular, by 7.8{\%} in AP for small objects (AP small) on the MS-COCO dataset.",
keywords = "Feature pyramid, Real-time object detection",
author = "Kim, {Seung Wook} and Kook, {Hyong Keun} and Sun, {Jee Young} and Kang, {Mun Cheon} and Ko, Sung-Jea",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-01228-1_15",
language = "English",
isbn = "9783030012274",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
volume = "11209",
publisher = "Springer Verlag",
pages = "239--256",
editor = "Vittorio Ferrari and Cristian Sminchisescu and Martial Hebert and Yair Weiss",
booktitle = "Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings",
}

TY  - GEN
T1  - Parallel Feature Pyramid Network for Object Detection
AU  - Kim, Seung Wook
AU  - Kook, Hyong Keun
AU  - Sun, Jee Young
AU  - Kang, Mun Cheon
AU  - Ko, Sung-Jea
PY  - 2018/1/1
Y1  - 2018/1/1
AB  - Recently developed object detectors employ a convolutional neural network (CNN) by gradually increasing the number of feature layers with a pyramidal shape instead of using a featurized image pyramid. However, the different abstraction levels of CNN feature layers often limit the detection performance, especially on small objects. To overcome this limitation, we propose a CNN-based object detection architecture, referred to as a parallel feature pyramid (FP) network (PFPNet), where the FP is constructed by widening the network width instead of increasing the network depth. First, we adopt spatial pyramid pooling and some additional feature transformations to generate a pool of feature maps with different sizes. In PFPNet, the additional feature transformation is performed in parallel, which yields feature maps with similar levels of semantic abstraction across the scales. We then resize the elements of the feature pool to a uniform size and aggregate their contextual information to generate each level of the final FP. The experimental results confirm that PFPNet improves the performance of the latest version of the single-shot multi-box detector (SSD) by 6.4% in mAP and, in particular, by 7.8% in AP for small objects (AP small) on the MS-COCO dataset.
KW  - Feature pyramid
KW  - Real-time object detection
UR  - http://www.scopus.com/inward/record.url?scp=85055106425&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85055106425&partnerID=8YFLogxK
U2  - 10.1007/978-3-030-01228-1_15
DO  - 10.1007/978-3-030-01228-1_15
M3  - Conference contribution
AN  - SCOPUS:85055106425
SN  - 9783030012274
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 239
EP  - 256
BT  - Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
A2  - Ferrari, Vittorio
A2  - Sminchisescu, Cristian
A2  - Hebert, Martial
A2  - Weiss, Yair
PB  - Springer Verlag
ER  - 