ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
Abstract
This paper tackles the problem of depth estimation from a single image. Existing work either focuses on generalization performance while disregarding metric scale, or on state-of-the-art results on specific datasets. We propose the first approach that combines both worlds, leading to a model with excellent generalization performance while maintaining metric scale. Our flagship model, ZoeD-M12-NK, is pre-trained on 12 datasets using relative depth and fine-tuned on two datasets using metric depth. Even without pre-training, we significantly improve the state of the art on NYU Depth v2. Pre-training on the 12 datasets improves it further, for a total gain of 21% in relative absolute error. ZoeD-M12-NK is the first model that can train jointly on multiple datasets without a significant drop in performance and that achieves unprecedented zero-shot generalization to eight unseen datasets from both indoor and outdoor domains.
Resources
arXiv: 2302.12288
Citation
@article{bhat2023zoedepth,
title = {{ZoeDepth}: Zero-shot Transfer by Combining Relative and Metric Depth},
author = {Bhat, Shariq Farooq and Birkl, Reiner and Wofk, Diana and Wonka, Peter and M{\"{u}}ller, Matthias},
journal = {arXiv preprint arXiv:2302.12288},
year = {2023}
}