Technical Specs
5 The rc_visard in a nutshell
The rc_visard is a self-registering 3D camera. It provides rectified camera, disparity, confidence, and error images,
which enable the viewed scene’s depth values along with their uncertainties to be computed. Furthermore, the
motion of visual features in the images is combined with acceleration and turn-rate measurements at a high rate,
which enables the sensor to provide real-time estimates of its current pose, velocity, and acceleration.
5.1 Stereo vision
The rc_visard is based on stereo vision using the SGM (Semi-Global Matching) method. In stereo vision, 3D
information about a scene can be extracted by comparing two images taken from different viewpoints. The main
idea behind using a camera pair for measuring depth is the fact that object points appear at different positions in
the two camera images depending on their distance from the camera pair. Very distant object points appear at
approximately the same position in both images, whereas very close object points occupy different positions in
the left and right camera image. The object points’ displacement in the two images is called disparity. The larger
the disparity, the closer the object is to the camera. The principle is illustrated in Fig. 5.1.1.
[Figure: left and right cameras with their image planes (left image, right image), showing the disparities 𝑑₁ and 𝑑₂]
Fig. 5.1.1: Sketch of the stereo-vision principle: The more distant object (black) exhibits a smaller disparity, 𝑑₂,
than that of the close object (gray), 𝑑₁.
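The inverse relationship between disparity and distance can be made concrete with a minimal sketch. It assumes the standard pinhole model for a rectified camera pair, in which depth is focal length times baseline divided by disparity, and a first-order error propagation for the depth uncertainty. The function names and the focal length and baseline values are hypothetical, chosen only for illustration. They are not rc_visard specifications, and the sketch does not claim to show how the sensor computes its own error image.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * t / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinite distance.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def depth_uncertainty(disparity_px, disparity_error_px, focal_length_px, baseline_m):
    """First-order propagation of a disparity error: dZ = Z^2 / (f * t) * dd."""
    depth = depth_from_disparity(disparity_px, focal_length_px, baseline_m)
    return depth * depth / (focal_length_px * baseline_m) * disparity_error_px


# Hypothetical camera parameters (assumptions, not device specifications).
FOCAL_LENGTH_PX = 1000.0  # focal length expressed in pixels
BASELINE_M = 0.1          # distance between the two cameras in meters

# A close object has a large disparity, a distant object a small one.
near_depth = depth_from_disparity(50.0, FOCAL_LENGTH_PX, BASELINE_M)
far_depth = depth_from_disparity(5.0, FOCAL_LENGTH_PX, BASELINE_M)

# The same half-pixel disparity error costs far more depth accuracy at range.
near_error = depth_uncertainty(50.0, 0.5, FOCAL_LENGTH_PX, BASELINE_M)
far_error = depth_uncertainty(5.0, 0.5, FOCAL_LENGTH_PX, BASELINE_M)

print(f"near: {near_depth:.2f} m +/- {near_error:.2f} m")
print(f"far:  {far_depth:.2f} m +/- {far_error:.2f} m")
```

Under these assumptions the depth error grows with the square of the distance, which is why a disparity image is only useful together with an estimate of its uncertainty.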
Stereo vision is a form of passive sensing, meaning that it emits neither light nor other signals to measure distances,
but uses only light that the environment emits or reflects. The rc_visard can thus work indoors and outdoors, and
multiple rc_visard devices can work together without interfering with each other.
To compute the 3D information, the stereo matching algorithm must be able to find corresponding object points
in the left and right camera images. For this, the algorithm requires texture in the images, meaning changes in
image intensity values due to patterns or the objects’ surface structure. Stereo matching is not possible for
completely untextured regions, such as a flat white wall without any visible surface structure. The SGM stereo
matching method used provides the best trade-off between runtime and accuracy, even for fine structures.
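The need for texture can be illustrated with a small sketch. It deliberately uses plain window matching with a sum-of-absolute-differences cost along one image row, which is much simpler than SGM and is not the method the rc_visard implements. All function names and pixel values are hypothetical. For a textured row the cost has a single minimum at the true disparity, whereas for a uniform row every candidate disparity fits equally well.

```python
def shift_right(row, shift, fill=0):
    """Simulate the left image row: scene content appears `shift` pixels further right."""
    return [fill] * shift + row[:len(row) - shift]


def sad_costs(left_row, right_row, x, half_window, max_disparity):
    """Sum-of-absolute-differences cost for each candidate disparity at column x."""
    costs = []
    for d in range(max_disparity + 1):
        cost = 0
        for offset in range(-half_window, half_window + 1):
            # A point at column x in the left row is expected at x - d in the right row.
            cost += abs(left_row[x + offset] - right_row[x + offset - d])
        costs.append(cost)
    return costs


def best_disparities(costs):
    """Return every disparity that attains the minimum cost."""
    lowest = min(costs)
    return [d for d, cost in enumerate(costs) if cost == lowest]


TRUE_DISPARITY = 3

# Textured row: intensity changes from pixel to pixel.
textured_right = [10, 80, 30, 90, 20, 70, 40, 60, 50, 15, 85, 25]
textured_left = shift_right(textured_right, TRUE_DISPARITY)

# Untextured row: a flat white wall, identical intensity everywhere.
flat_right = [200] * 12
flat_left = [200] * 12

textured_match = best_disparities(sad_costs(textured_left, textured_right, 7, 1, 4))
flat_match = best_disparities(sad_costs(flat_left, flat_right, 7, 1, 4))

print("textured:", textured_match)  # one unambiguous candidate
print("flat:    ", flat_match)      # every candidate ties
```

In the flat case the matcher has no basis for preferring one disparity over another, so no depth value can be assigned to that pixel.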