Tensorize, Factorize and Regularize: Robust Visual Relationship Learning

Seong Jae Hwang, Hyun Woo Kim, Sathya N. Ravi, Maxwell D. Collins, Zirui Tao, Vikas Singh

Research output: Chapter in Book/Report/Conference proceedingConference contribution

10 Citations (Scopus)

Abstract

Visual relationships provide higher-level information of objects and their relations in an image - this enables a semantic understanding of the scene and helps downstream applications. Given a set of localized objects in some training data, visual relationship detection seeks to detect the most likely 'relationship' between objects in a given image. While the specific objects may be well represented in training data, their relationships may still be infrequent. The empirical distribution obtained from seeing these relationships in a dataset does not model the underlying distribution well - a serious issue for most learning methods. In this work, we start from a simple multi-relational learning model, which in principle, offers a rich formalization for deriving a strong prior for learning visual relationships. While the inference problem for deriving the regularizer is challenging, our main technical contribution is to show how adapting recent results in numerical linear algebra lead to efficient algorithms for a factorization scheme that yields highly informative priors. The factorization provides sample size bounds for inference (under mild conditions) for the underlying [object, predicate, object] relationship learning task on its own and surprisingly outperforms (in some cases) existing methods even without utilizing visual features. Then, when integrated with an end-to-end architecture for visual relationship detection leveraging image data, we substantially improve the state-of-the-art.

Original languageEnglish
Title of host publicationProceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
PublisherIEEE Computer Society
Pages1014-1023
Number of pages10
ISBN (Electronic)9781538664209
DOIs
Publication statusPublished - 2018 Dec 14
Externally publishedYes
Event31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 - Salt Lake City, United States
Duration: 2018 Jun 182018 Jun 22

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print)1063-6919

Conference

Conference31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
CountryUnited States
CitySalt Lake City
Period18/6/1818/6/22

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Fingerprint Dive into the research topics of 'Tensorize, Factorize and Regularize: Robust Visual Relationship Learning'. Together they form a unique fingerprint.

  • Cite this

    Hwang, S. J., Kim, H. W., Ravi, S. N., Collins, M. D., Tao, Z., & Singh, V. (2018). Tensorize, Factorize and Regularize: Robust Visual Relationship Learning. In Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018 (pp. 1014-1023). [8578210] (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.1109/CVPR.2018.00112