4.2 Object Identification and Tracking
The output of the object identification stage is an object track list. Each object that has been
identified across multiple frames is assigned a unique object track. Each object track is thus a list
of references to the relevant object entries in the object detection list.
In order to link together objects across frames, each object in each frame of the recorded digital
video is compared to each object in the previous frame. Those objects that are less than the
threshold horizontal distance δ apart in space and have a difference in their area less than the
threshold area difference α are considered candidate matches. For each object in the current
frame, the single best match is determined by taking the closest candidate object from the
previous frame. A reference to the object in the current frame is then added to the object track
that refers to the matched object in the previous frame.
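The per-frame linking rule above can be sketched as follows. This is a minimal illustration, not the report's implementation: the data structures, the concrete δ and α values, and all names are assumptions introduced here.

```python
from dataclasses import dataclass, field

# Hypothetical structures; field names are illustrative, not from the report.
@dataclass
class Detection:
    frame: int
    x: float        # horizontal position of the bounding-box center
    area: float     # bounding-box area in pixels

@dataclass
class Track:
    detections: list = field(default_factory=list)  # references into the detection list

DELTA = 25.0   # threshold horizontal distance (delta); assumed value
ALPHA = 400.0  # threshold area difference (alpha); assumed value

def link_frame(tracks, prev_dets, curr_dets):
    """Link each current-frame detection to its best previous-frame candidate."""
    # Map each previous-frame detection to the track that currently ends with it.
    tail = {id(t.detections[-1]): t for t in tracks if t.detections}
    for det in curr_dets:
        # Candidate matches: within DELTA horizontally and within ALPHA in area.
        candidates = [p for p in prev_dets
                      if abs(det.x - p.x) < DELTA and abs(det.area - p.area) < ALPHA]
        if candidates:
            # Single best match: the spatially closest candidate.
            best = min(candidates, key=lambda p: abs(det.x - p.x))
            tail[id(best)].detections.append(det)
        else:
            tracks.append(Track([det]))  # no candidate: start a new object track
    return tracks
```

Note that a detection with no candidate simply opens a new track, which is exactly the behavior that produces multiple tracks for an object that is occluded and later reappears.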
In the case that no candidate matches are found, a new object track is added to the object track
list. All objects in the first frame of the video generate new object tracks. Objects entering the
video frame also generate new object tracks. Objects that are temporarily occluded can likewise
generate new object tracks when they reappear in the frame. Thus, the final object track list may
contain multiple object tracks for a single object moving through the scene, depending on how
often the object was occluded by other objects or scene elements; in other words, several object
tracks may refer to the same physical object.
4.3 Object Track Filtering
The goal of the object track filtering stage is to use the object tracks to improve the estimates of
the object position, shape, and size. In cases where an object is occluded at the edge of the scene
or by another vehicle, the estimates of the object properties are poor. These estimates can be
improved by making the assumption that the size of the object being tracked does not change
during tracking. The object size (i.e., width and height in the image) in each frame in which the
object is detected can be treated as a sample. The best guess at the true object size is taken as the
mode of all of the samples. If the observed size of the object in any individual frame is greater
than σ standard deviations from the modal size, the size in that frame is reset to the modal size.
The object bounding box is then adjusted around the observed center of the object. If, however,
one edge of the object is near the edge of the image, it is assumed that the object is partially
occluded. In this case, the observed object center is a poor estimate of the true object center.
Therefore, the new bounding box is set based on the visible object boundary instead of the object
center. The object center is then adjusted according to the object boundary. This correction
allows the vehicle to be correctly tracked even when it is only partially in the frame.
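The filtering rule above can be sketched for a single dimension (width) as follows. This is an illustrative sketch under stated assumptions: the σ threshold, the image width, and the function and variable names are all hypothetical, and the same logic would apply to the height.

```python
import statistics

SIGMA = 2.0    # allowed deviation in standard deviations (sigma); assumed value
IMG_W = 640    # image width in pixels; assumed value

def filter_track_widths(widths, centers_x):
    """Reset outlier widths to the modal width, then re-center each box,
    unless the box touches an image edge (assumed partial occlusion)."""
    mode_w = statistics.mode(widths)          # best guess at the true width
    std_w = statistics.pstdev(widths) or 1.0  # guard against zero spread
    boxes = []
    for w, cx in zip(widths, centers_x):
        # Observations more than SIGMA standard deviations from the mode
        # are reset to the modal width.
        w_fixed = mode_w if abs(w - mode_w) / std_w > SIGMA else w
        # Default: adjust the box around the observed center.
        left, right = cx - w_fixed / 2, cx + w_fixed / 2
        # Near an image edge the observed center is unreliable, so anchor
        # the corrected box on the visible boundary and re-derive the extent.
        if cx - w / 2 <= 0:                    # partially occluded at left edge
            right = cx + w / 2                 # visible boundary
            left = right - w_fixed
        elif cx + w / 2 >= IMG_W:              # partially occluded at right edge
            left = cx - w / 2                  # visible boundary
            right = left + w_fixed
        boxes.append((left, right))
    return boxes
```

A corrected box anchored at the edge may extend beyond the frame, which is intended: it represents the true extent of a vehicle that is only partially visible.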
4.4 Object Occlusion Detection and Correction
The goal of the object occlusion detection and correction stage is to produce an object track list
in which each track uniquely corresponds to a single object in the scene and covers every
detected instance of that object in the video. This requires correctly identifying the object across
occlusions.