Foreground Fisher vector

Encoding class-relevant foreground to improve image classification

Yongsheng Pan, Yong Xia, Dinggang Shen

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Image classification is an essential and challenging task in computer vision. Despite its prevalence, the combination of the deep convolutional neural network (DCNN) and the Fisher vector (FV) encoding method has limited performance, since the class-irrelevant background used in traditional FV encoding may result in less discriminative image features. In this paper, we propose the foreground FV (fgFV) encoding algorithm and its fast approximation for image classification. We implicitly separate the class-relevant foreground from the class-irrelevant background during the encoding process by tuning the weights of the partial gradients corresponding to each Gaussian component under the supervision of image labels, and then use only the local descriptors extracted from the class-relevant foreground to estimate FVs. We have evaluated our fgFV against the widely used FV and improved FV (iFV) under the combined DCNN-FV framework, and compared them to several state-of-the-art image classification approaches on ten benchmark image datasets covering the recognition of fine-grained natural species and man-made objects, the categorization of coarse objects, and the classification of scenes. Our results indicate that the proposed fgFV encoding algorithm constructs more discriminative image representations from local descriptors than FV and iFV, and that the combined DCNN-fgFV algorithm improves the performance of image classification.
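For readers unfamiliar with Fisher vector encoding, the sketch below (not the authors' implementation) illustrates the general mechanism the abstract refers to: local descriptors are softly assigned to the components of a Gaussian mixture model, gradients with respect to each component's mean and standard deviation are accumulated, and each component's gradient block is scaled by a per-component weight before normalization. The fg_weights argument is a hypothetical stand-in for the class-relevance weights that the paper learns under image-label supervision; setting all of them to 1 recovers the standard (improved) FV baseline.

import numpy as np

def fisher_vector(descriptors, gmm_weights, means, sigmas, fg_weights=None):
    # descriptors: (N, D) local descriptors extracted from one image
    # gmm_weights: (K,) mixing weights; means, sigmas: (K, D) diagonal-covariance GMM parameters
    # fg_weights: (K,) optional per-component weights (hypothetical class-relevance weights)
    N, D = descriptors.shape
    K = gmm_weights.shape[0]
    if fg_weights is None:
        fg_weights = np.ones(K)  # all ones -> plain Fisher vector

    # Soft assignment gamma[n, k] of descriptor n to Gaussian k
    diff = descriptors[:, None, :] - means[None, :, :]                  # (N, K, D)
    log_prob = -0.5 * np.sum((diff / sigmas) ** 2
                             + np.log(2 * np.pi * sigmas ** 2), axis=2)  # (N, K)
    log_prob += np.log(gmm_weights)
    gamma = np.exp(log_prob - log_prob.max(axis=1, keepdims=True))
    gamma /= gamma.sum(axis=1, keepdims=True)

    # Per-component gradients w.r.t. means and standard deviations
    blocks = []
    for k in range(K):
        g = gamma[:, k:k + 1]                                            # (N, 1)
        u = diff[:, k, :] / sigmas[k]                                    # normalized residuals
        grad_mu = (g * u).sum(axis=0) / (N * np.sqrt(gmm_weights[k]))
        grad_sigma = (g * (u ** 2 - 1)).sum(axis=0) / (N * np.sqrt(2 * gmm_weights[k]))
        # Scale this component's block; small weights suppress background-dominated Gaussians
        blocks.append(fg_weights[k] * np.concatenate([grad_mu, grad_sigma]))
    fv = np.concatenate(blocks)

    # Improved-FV style power and L2 normalization
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)

In a typical FV-based pipeline, the GMM would be fitted on training descriptors (e.g. with sklearn.mixture.GaussianMixture) and the resulting vectors fed to a linear classifier; how the per-component weights are learned from image labels is the contribution of the paper itself.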

Original language: English
Article number: 8678832
Pages (from-to): 4716-4729
Number of pages: 14
Journal: IEEE Transactions on Image Processing
Volume: 28
Issue number: 10
DOI: 10.1109/TIP.2019.2908795
Publication status: Published - 1 October 2019

Keywords

  • convolutional neural networks
  • feature encoding
  • foreground Fisher vector
  • image classification

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design

Cite this

Foreground Fisher vector: Encoding class-relevant foreground to improve image classification. / Pan, Yongsheng; Xia, Yong; Shen, Dinggang.

In: IEEE Transactions on Image Processing, Vol. 28, No. 10, 8678832, 01.10.2019, p. 4716-4729.

Research output: Contribution to journal › Article

@article{fa5e2aafe897402b87b079c122c6b176,
title = "Foreground fisher vector: Encoding class-relevant foreground to improve image classification",
abstract = "Image classification is an essential and challenging task in computer vision. Despite its prevalence, the combination of the deep convolutional neural network (DCNN) and the Fisher vector (FV) encoding method has limited performance since the class-irrelevant background used in the traditional FV encoding may result in less discriminative image features. In this paper, we propose the foreground FV (fgFV) encoding algorithm and its fast approximation for image classification. We try to separate implicitly the class-relevant foreground from the class-irrelevant background during the encoding process via tuning the weights of the partial gradients corresponding to each Gaussian component under the supervision of image labels and, then, use only those local descriptors extracted from the class-relevant foreground to estimate FVs. We have evaluated our fgFV against the widely used FV and improved FV (iFV) under the combined DCNN-FV framework and also compared them to several state-of-the-art image classification approaches on ten benchmark image datasets for the recognition of fine-grained natural species and artificial manufactures, categorization of course objects, and classification of scenes. Our results indicate that the proposed fgFV encoding algorithm can construct more discriminative image presentations from local descriptors than FV and iFV, and the combined DCNN-fgFV algorithm can improve the performance of image classification.",
keywords = "convolutional neural networks, feature encoding, foreground Fisher vector, Image classification",
author = "Yongsheng Pan and Yong Xia and Dinggang Shen",
year = "2019",
month = "10",
day = "1",
doi = "10.1109/TIP.2019.2908795",
language = "English",
volume = "28",
pages = "4716--4729",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "10",

}

TY - JOUR

T1 - Foreground Fisher vector

T2 - Encoding class-relevant foreground to improve image classification

AU - Pan, Yongsheng

AU - Xia, Yong

AU - Shen, Dinggang

PY - 2019/10/1

Y1 - 2019/10/1

N2 - Image classification is an essential and challenging task in computer vision. Despite its prevalence, the combination of the deep convolutional neural network (DCNN) and the Fisher vector (FV) encoding method has limited performance, since the class-irrelevant background used in traditional FV encoding may result in less discriminative image features. In this paper, we propose the foreground FV (fgFV) encoding algorithm and its fast approximation for image classification. We implicitly separate the class-relevant foreground from the class-irrelevant background during the encoding process by tuning the weights of the partial gradients corresponding to each Gaussian component under the supervision of image labels, and then use only the local descriptors extracted from the class-relevant foreground to estimate FVs. We have evaluated our fgFV against the widely used FV and improved FV (iFV) under the combined DCNN-FV framework, and compared them to several state-of-the-art image classification approaches on ten benchmark image datasets covering the recognition of fine-grained natural species and man-made objects, the categorization of coarse objects, and the classification of scenes. Our results indicate that the proposed fgFV encoding algorithm constructs more discriminative image representations from local descriptors than FV and iFV, and that the combined DCNN-fgFV algorithm improves the performance of image classification.

AB - Image classification is an essential and challenging task in computer vision. Despite its prevalence, the combination of the deep convolutional neural network (DCNN) and the Fisher vector (FV) encoding method has limited performance, since the class-irrelevant background used in traditional FV encoding may result in less discriminative image features. In this paper, we propose the foreground FV (fgFV) encoding algorithm and its fast approximation for image classification. We implicitly separate the class-relevant foreground from the class-irrelevant background during the encoding process by tuning the weights of the partial gradients corresponding to each Gaussian component under the supervision of image labels, and then use only the local descriptors extracted from the class-relevant foreground to estimate FVs. We have evaluated our fgFV against the widely used FV and improved FV (iFV) under the combined DCNN-FV framework, and compared them to several state-of-the-art image classification approaches on ten benchmark image datasets covering the recognition of fine-grained natural species and man-made objects, the categorization of coarse objects, and the classification of scenes. Our results indicate that the proposed fgFV encoding algorithm constructs more discriminative image representations from local descriptors than FV and iFV, and that the combined DCNN-fgFV algorithm improves the performance of image classification.

KW - convolutional neural networks

KW - feature encoding

KW - foreground Fisher vector

KW - image classification

UR - http://www.scopus.com/inward/record.url?scp=85070456147&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070456147&partnerID=8YFLogxK

U2 - 10.1109/TIP.2019.2908795

DO - 10.1109/TIP.2019.2908795

M3 - Article

VL - 28

SP - 4716

EP - 4729

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

SN - 1057-7149

IS - 10

M1 - 8678832

ER -