4.2 Object Identification and Tracking
The output of the object identification stage is an object track list. Each object that has been
identified across multiple frames is assigned a unique object track. Each object track is thus a list
of references to the relevant object entries in the object detection list.
In order to link together objects across frames, each object in each frame of the recorded digital
video is compared to each object in the previous frame. Those objects that are less than the
threshold horizontal distance δ apart in space and have a difference in their area less than the
threshold area difference α are considered candidate matches. For each object in the current
frame, the single best match is determined by taking the closest candidate object from the
previous frame. A reference to the object in the current frame is then added to the object track
that refers to the matched object in the previous frame.
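The per-frame linking rule above can be sketched as follows. This is a minimal illustration, not the report's implementation: the data structures, the concrete δ and α values, and all names are assumptions introduced here.

```python
from dataclasses import dataclass, field

# Hypothetical structures; field names are illustrative, not from the report.
@dataclass
class Detection:
    frame: int
    x: float        # horizontal position of the bounding-box center
    area: float     # bounding-box area in pixels

@dataclass
class Track:
    detections: list = field(default_factory=list)  # references into the detection list

DELTA = 25.0   # threshold horizontal distance (delta); assumed value
ALPHA = 400.0  # threshold area difference (alpha); assumed value

def link_frame(tracks, prev_dets, curr_dets):
    """Link each current-frame detection to its best previous-frame candidate."""
    # Map each previous-frame detection to the track that currently ends with it.
    tail = {id(t.detections[-1]): t for t in tracks if t.detections}
    for det in curr_dets:
        # Candidate matches: within DELTA horizontally and within ALPHA in area.
        candidates = [p for p in prev_dets
                      if abs(det.x - p.x) < DELTA and abs(det.area - p.area) < ALPHA]
        if candidates:
            # Single best match: the spatially closest candidate.
            best = min(candidates, key=lambda p: abs(det.x - p.x))
            tail[id(best)].detections.append(det)
        else:
            tracks.append(Track([det]))  # no candidate: start a new object track
    return tracks
```

Note that a detection with no candidate simply opens a new track, which is exactly the behavior that produces multiple tracks for an object that is occluded and later reappears.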
In the case that no candidate matches are found, a new object track is added to the object track
list. All objects in the first frame of the video generate new object tracks. Objects entering the
video frame also generate new object tracks. Objects that are temporarily occluded can likewise
generate new object tracks when they reappear in the frame. Thus, the final object track list may
contain multiple object tracks for a single object moving through the scene, depending on how
often the object was occluded by other objects or scene elements; in other words, several object
tracks may refer to the same physical object.
4.3 Object Track Filtering
The goal of the object track filtering stage is to use the object tracks to improve the estimates of
the object position, shape, and size. In cases where an object is occluded at the edge of the scene
or by another vehicle, the estimates of the object properties are poor. These estimates can be
improved by making the assumption that the size of the object being tracked does not change
during tracking. The object size (i.e., width and height in the image) in each frame in which the
object is detected can be treated as a sample. The best guess at the true object size is taken as the
mode of all of the samples. If the observed size of the object in any individual frame is greater
than σ standard deviations from the modal size, the size in that frame is reset to the modal size.
The object bounding box is then adjusted around the observed center of the object. If, however,
one edge of the object is near the edge of the image, it is assumed that the object is partially
occluded. In this case, the observed object center is a poor estimate of the true object center.
Therefore, the new bounding box is set based on the visible object boundary instead of the object
center. The object center is then adjusted according to the object boundary. This correction
allows the vehicle to be correctly tracked even when it is only partially in the frame.
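The filtering rule above can be sketched for a single dimension (width) as follows. This is an illustrative sketch under stated assumptions: the σ threshold, the image width, and the function and variable names are all hypothetical, and the same logic would apply to the height.

```python
import statistics

SIGMA = 2.0    # allowed deviation in standard deviations (sigma); assumed value
IMG_W = 640    # image width in pixels; assumed value

def filter_track_widths(widths, centers_x):
    """Reset outlier widths to the modal width, then re-center each box,
    unless the box touches an image edge (assumed partial occlusion)."""
    mode_w = statistics.mode(widths)          # best guess at the true width
    std_w = statistics.pstdev(widths) or 1.0  # guard against zero spread
    boxes = []
    for w, cx in zip(widths, centers_x):
        # Observations more than SIGMA standard deviations from the mode
        # are reset to the modal width.
        w_fixed = mode_w if abs(w - mode_w) / std_w > SIGMA else w
        # Default: adjust the box around the observed center.
        left, right = cx - w_fixed / 2, cx + w_fixed / 2
        # Near an image edge the observed center is unreliable, so anchor
        # the corrected box on the visible boundary and re-derive the extent.
        if cx - w / 2 <= 0:                    # partially occluded at left edge
            right = cx + w / 2                 # visible boundary
            left = right - w_fixed
        elif cx + w / 2 >= IMG_W:              # partially occluded at right edge
            left = cx - w / 2                  # visible boundary
            right = left + w_fixed
        boxes.append((left, right))
    return boxes
```

A corrected box anchored at the edge may extend beyond the frame, which is intended: it represents the true extent of a vehicle that is only partially visible.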
4.4 Object Occlusion Detection and Correction
The goal of the object occlusion detection and correction stage is to produce an object track list
in which each track uniquely corresponds to a single object in the scene and covers every
detected instance of that object in the video. This requires correctly identifying the object across
occlusions.