CRD-Fusion


In this work, we introduce a self-supervised stereo matching pipeline that can run in real time. The pipeline consists of two stages: a confidence generation step and a self-supervised deep neural network.

The pipeline assumes that an initial (raw) disparity map is available, either from a traditional stereo matching algorithm or from a commercial stereo camera. The confidence generation step first computes a confidence map that quantifies the correctness of the raw disparity map. The stereo images, raw disparity map, and confidence map are then processed by a deep neural network trained in a self-supervised manner. The final outputs are a high-quality predicted disparity map and a corresponding occlusion mask that marks the occluded regions of that disparity map.
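The data flow described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the method from the paper: the confidence measure here is a generic left-right consistency check used as a stand-in, it assumes a right-view raw disparity map is also available, and the names `left_right_confidence`, `build_network_input`, and `tol` are hypothetical. The network itself is omitted; the sketch stops at assembling its input.

```python
import numpy as np


def left_right_confidence(disp_left, disp_right, tol=1.0):
    """Confidence in [0, 1] from a left-right consistency check.

    Stand-in for the pipeline's confidence generation step (an assumption,
    not the paper's method). Both inputs are (H, W) disparity maps in pixels.
    """
    h, w = disp_left.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # A left pixel at column x with disparity d corresponds to right column x - d.
    target = np.rint(cols - disp_left).astype(int)
    valid = (target >= 0) & (target < w)
    target = np.clip(target, 0, w - 1)
    # Consistent pixels have matching disparities in both views.
    diff = np.abs(disp_left - disp_right[rows, target])
    conf = np.exp(-diff / tol)
    conf[~valid] = 0.0  # no correspondence inside the right image
    return conf


def build_network_input(left_img, right_img, raw_disp, conf):
    """Stack stereo images (3, H, W each), raw disparity, and confidence
    into a single (8, H, W) array, as one possible network input layout."""
    return np.concatenate(
        [left_img, right_img, raw_disp[None], conf[None]], axis=0
    )


if __name__ == "__main__":
    h, w = 4, 8
    disp_l = np.full((h, w), 2.0)
    disp_r = np.full((h, w), 2.0)
    conf = left_right_confidence(disp_l, disp_r)
    x = build_network_input(np.zeros((3, h, w)), np.zeros((3, h, w)), disp_l, conf)
    print(conf.shape, x.shape)
```

In this toy case the two leftmost columns receive zero confidence because their matches would fall outside the right image, which is the kind of region the final occlusion mask is meant to flag.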

A demo video of this pipeline is shown below. More details can be found in our paper, published at the Conference on Robots and Vision 2022, and in our code.