TY - JOUR
T1 - Foreground Fisher vector
T2 - Encoding class-relevant foreground to improve image classification
AU - Pan, Yongsheng
AU - Xia, Yong
AU - Shen, Dinggang
N1 - Funding Information:
Manuscript received December 21, 2018; accepted March 18, 2019. Date of publication April 1, 2019; date of current version August 1, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61771397 and Grant 61471297 and in part by the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University under Grant CX201835. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Emanuele Salerno. (Corresponding authors: Yong Xia; Dinggang Shen.) Y. Pan is with the National Engineering Laboratory for Integrated AeroSpace-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi’an 710072, China, and also with the Department of Radiology and Biomedical Research Imaging Center, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA (e-mail: yspan@mail.nwpu.edu.cn).
Publisher Copyright:
© 1992-2012 IEEE.
PY - 2019/10
Y1 - 2019/10
AB - Image classification is an essential and challenging task in computer vision. Despite its prevalence, the combination of the deep convolutional neural network (DCNN) and the Fisher vector (FV) encoding method has limited performance, since the class-irrelevant background used in traditional FV encoding may result in less discriminative image features. In this paper, we propose the foreground FV (fgFV) encoding algorithm and its fast approximation for image classification. We try to implicitly separate the class-relevant foreground from the class-irrelevant background during the encoding process by tuning the weights of the partial gradients corresponding to each Gaussian component under the supervision of image labels, and then use only those local descriptors extracted from the class-relevant foreground to estimate FVs. We have evaluated our fgFV against the widely used FV and improved FV (iFV) under the combined DCNN-FV framework and also compared them to several state-of-the-art image classification approaches on ten benchmark image datasets for the recognition of fine-grained natural species and artificial manufactures, categorization of coarse objects, and classification of scenes. Our results indicate that the proposed fgFV encoding algorithm can construct more discriminative image representations from local descriptors than FV and iFV, and the combined DCNN-fgFV algorithm can improve the performance of image classification.
KW - Image classification
KW - convolutional neural networks
KW - feature encoding
KW - foreground Fisher vector
UR - http://www.scopus.com/inward/record.url?scp=85070456147&partnerID=8YFLogxK
U2 - 10.1109/TIP.2019.2908795
DO - 10.1109/TIP.2019.2908795
M3 - Article
C2 - 30946666
AN - SCOPUS:85070456147
SN - 1057-7149
VL - 28
SP - 4716
EP - 4729
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 10
M1 - 8678832
ER -