Archive (RGB-MS Dataset Labeled)
License: Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) Download (511MB)

Archive (RGB-MS Dataset Unlabeled)
License: Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) Download (81GB)

Text document (RTF) (README)
License: Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) Download (121kB)
Abstract
We address the problem of registering synchronized color (RGB) and multi-spectral (MS) images with very different resolutions by solving for stereo matching correspondences. To this end, we introduce a novel RGB-MS dataset framing 13 different scenes in indoor environments and providing a total of 34 image pairs annotated with semi-dense, high-resolution ground-truth labels in the form of disparity maps. To tackle the task, we propose a deep learning architecture trained in a self-supervised manner by exploiting an additional RGB camera, which is required only while acquiring the training data. In this setup, we can learn cross-modal matching in the absence of ground-truth labels by distilling knowledge from an easier RGB-RGB matching task, using a collection of about 11K unlabeled image triplets. Experiments show that the proposed pipeline sets a good performance bar (1.16 pixels average registration error) for future research on this novel, challenging task.
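Because the ground-truth disparity maps are semi-dense, an average registration error can only be computed over the pixels that actually carry a label. The following is a minimal sketch of such a masked error, not the authors' evaluation code. It assumes that the error is the mean absolute difference between predicted and ground-truth disparity, and that unlabeled pixels are marked with the value 0; neither detail is stated on this page, so check the README for the actual conventions. The name `mean_disparity_error` and its `invalid_value` parameter are hypothetical.

```python
import numpy as np


def mean_disparity_error(pred, gt, invalid_value=0.0):
    """Mean absolute disparity error (in pixels) over labeled pixels only."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape:
        raise ValueError("prediction and ground truth must have the same shape")

    # Assumption: pixels without a ground-truth label hold `invalid_value`.
    valid = gt != invalid_value
    if not valid.any():
        raise ValueError("ground truth contains no labeled pixels")

    return float(np.abs(pred[valid] - gt[valid]).mean())


# Toy example: three labeled pixels, three unlabeled (0) pixels.
gt = np.array([[0.0, 10.0, 12.0],
               [0.0, 0.0, 20.0]])
pred = np.array([[5.0, 11.0, 12.5],
                 [7.0, 3.0, 18.0]])

error = mean_disparity_error(pred, gt)
print(f"average error over labeled pixels: {error:.3f} px")
```

Predictions at unlabeled positions are ignored entirely, so a method is neither rewarded nor penalized for regions where no ground truth exists.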