In this paper, the procedure for generating point clouds of urban scenes from nadir RGB camera images is described in detail. Producing dense point clouds requires three main steps: generation of disparity maps, creation of depth maps, and calculation of world coordinates (X, Y, and Z).
To create the disparity maps, each pair of adjacent images (stereopair) was rectified. The PatchMatch Stereo (PMS) algorithm was then executed for 3D reconstruction, since it is straightforward to implement and performs well on the Middlebury stereo benchmark. Some of its steps were parallelized to speed up execution. Because depth is inversely proportional to disparity, depth maps were computed from the disparity maps, and the height Z of each scene element was obtained by subtracting its depth from the camera height.
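The depth and height computation above can be sketched as follows. This is an illustrative implementation under standard rectified-stereo assumptions (pinhole camera, known baseline); the function and parameter names are ours, not the paper's notation.

```python
import numpy as np

def depth_and_height(disparity, focal_px, baseline_m, camera_height_m):
    """Convert a disparity map to depth, then to scene-element height.

    Sketch only: assumes a rectified stereopair with known baseline
    and a nadir-looking camera at a known height above the scene.
    """
    # Depth is inversely proportional to disparity: depth = f * B / d.
    # Pixels with zero disparity (no match / infinite depth) become NaN.
    with np.errstate(divide="ignore"):
        depth = np.where(disparity > 0,
                         focal_px * baseline_m / disparity,
                         np.nan)
    # Height of a scene element: camera height minus its depth.
    height = camera_height_m - depth
    return depth, height
```

For example, with a 100 px focal length, 0.5 m baseline, and a camera 30 m above the ground, a disparity of 2 px maps to a depth of 25 m and therefore a scene-element height of 5 m.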
To calculate the remaining world coordinates X and Y, the back-projection equation was applied using the camera's intrinsic and extrinsic parameters. To validate the PMS algorithm, its resulting point cloud was compared against both a LiDAR point cloud and a point cloud generated with PhotoScan; the root mean square errors of the two comparisons showed similar values.
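The back-projection step can be sketched as below, under standard pinhole-camera conventions: K is the 3×3 intrinsic matrix and [R | t] the extrinsic pose mapping world to camera coordinates. The names and conventions here are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def back_project(u, v, depth, K, R, t):
    """Back-project pixel (u, v) with known depth to world coordinates.

    Sketch only: K is the intrinsic matrix, R and t the extrinsics
    (world-to-camera), depth the distance along the optical axis.
    """
    # Ray through the pixel in camera coordinates, scaled by depth:
    # X_cam = depth * K^-1 [u, v, 1]^T
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    X_cam = depth * ray
    # Invert the extrinsics to recover world coordinates:
    # X_world = R^T (X_cam - t)
    return R.T @ (X_cam - t)
```

For instance, with principal point (50, 50), the principal-point pixel at depth 10 and identity pose back-projects to (0, 0, 10) in world coordinates.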