Abstract
Depth estimation from a single RGB image is a challenging task. It is ill-posed since a single 2D image may correspond to various 3D scenes at different scales. On the other hand, estimating the relative depth relationship between two objects in a scene is easier and may yield more reliable results. Thus, in this paper, we propose a novel algorithm for monocular depth estimation using relative depths. First, using a convolutional neural network, we estimate two types of depths at multiple spatial resolutions: ordinary depth maps and relative depth tensors. Second, we restore a relative depth map from each relative depth tensor. A relative depth map is equivalent to an ordinary depth map with global scale information removed. For the restoration, sparse pairwise comparison matrices are constructed from available relative depths, and missing entries are filled in using the alternating least squares (ALS) algorithm. Third, we decompose the ordinary and relative depth maps into components and recombine them to yield a final depth map. To reduce the computational complexity, relative depths at fine spatial resolutions are directly used to refine the final depth map. Extensive experimental results on the NYUv2 dataset demonstrate that the proposed algorithm provides state-of-the-art performance.
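The restoration step can be illustrated with a minimal sketch. It assumes, for illustration only, that a pairwise comparison matrix stores depth ratios `P[i, j] = d_i / d_j`, so that the complete matrix has rank 1 and can be factored as an outer product `p q^T`. The function name `restore_relative_depth`, the binary `mask` of observed entries, and the geometric-mean normalization used to remove the global scale are all hypothetical choices, not details taken from the paper.

```python
import numpy as np

def restore_relative_depth(P, mask, n_iters=100, eps=1e-12):
    """Rank-1 ALS completion of a sparse pairwise comparison matrix.

    P    : (n, n) array; P[i, j] approximates d_i / d_j where observed (assumed model).
    mask : (n, n) array of 0/1; 1 marks an observed entry.
    Returns a scale-free depth vector whose geometric mean is 1.
    """
    n = P.shape[0]
    observed = mask * P          # zero out the missing entries
    p = np.ones(n)               # left factor, proportional to d
    q = np.ones(n)               # right factor, proportional to 1 / d
    for _ in range(n_iters):
        # Fix q, solve the least-squares problem for each p_i in closed form.
        p = (observed @ q) / np.maximum(mask @ (q ** 2), eps)
        # Fix p, solve the least-squares problem for each q_j in closed form.
        q = (observed.T @ p) / np.maximum(mask.T @ (p ** 2), eps)
    # Remove the arbitrary global scale (illustrative normalization).
    return p / np.exp(np.mean(np.log(p)))

if __name__ == "__main__":
    d = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # toy ground-truth depths
    P = d[:, None] / d[None, :]                # full ratio matrix
    mask = np.ones_like(P)
    for i, j in [(0, 4), (4, 0), (1, 3), (3, 1)]:
        mask[i, j] = 0.0                       # pretend these comparisons are missing
    print(restore_relative_depth(P, mask, n_iters=200))
```

Each half-step has a closed-form solution because, with one factor held fixed, the objective separates into independent one-dimensional least-squares problems over the observed entries only. The recovered vector matches the true depths only up to a global scale, which is consistent with the description of a relative depth map above.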
Original language | English |
---|---|
Article number | 103459 |
Journal | Journal of Visual Communication and Image Representation |
Volume | 84 |
Publication status | Published - 2022 Apr |
Keywords
- 3D analysis
- Monocular depth estimation
- Relative depth
ASJC Scopus subject areas
- Signal Processing
- Media Technology
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering