How to Ensure Accurate Results from Video Analytics

By Marzieh (Niloofar) Aliakbarzadeh

The accuracy of the analytics provided by AMAG depends on the quality of the input data. Data may come from video cameras or lidar, but most commonly the source is video footage collected from either permanently or temporarily installed cameras. Several factors that influence the quality of the data extractable from video need to be considered.

Low Resolution Video

The object classification and tracking algorithms rely on the recorded pixel images and are more accurate with high-quality video. Resolutions lower than 1280 x 720 will decrease the accuracy of the algorithms and the derived data: at lower resolutions, road users may be misclassified or missed entirely, and their trajectories will be less accurate.
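As a practical pre-check, footage can be validated against the resolution minimum before it is submitted for analysis. A minimal sketch in Python; the 1280 x 720 threshold comes from the text above, while the function name and example values are illustrative:

```python
# Minimum resolution recommended for reliable classification and tracking.
MIN_WIDTH, MIN_HEIGHT = 1280, 720

def meets_min_resolution(width: int, height: int) -> bool:
    """Return True if a video's frame size meets the 1280 x 720 minimum."""
    return width >= MIN_WIDTH and height >= MIN_HEIGHT

# Example: 720p footage passes, 480p footage does not.
print(meets_min_resolution(1280, 720))  # True
print(meets_min_resolution(854, 480))   # False
```

In practice the width and height would be read from the video file's metadata before running this check.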

[Figure: difference in quality between resolutions]

Furthermore, deficient video resolution becomes increasingly detrimental as the distance from the camera grows. As with the human eye, objects farther away are harder to detect and classify than closer objects. At a typical signalized intersection, road users on the far side of the intersection cannot be detected effectively in video below the minimum resolution.

Highly Compressed Video

When video is recorded, it can be compressed to reduce the stored file size, but heavy compression causes significant problems for detection and classification. Video should be collected at the camera's 'high quality' settings rather than at highly compressed rates, which introduce distortion into the images.
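One rough way to screen for over-compression is the bits-per-pixel ratio: the bitrate divided by the number of pixels encoded per second. The sketch below is illustrative only; the 0.03 bits-per-pixel figure in the example is computed arithmetic, and what counts as "too compressed" depends on codec and content, not on any AMAG specification:

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average bits spent on each pixel per frame (a rough compression gauge)."""
    return bitrate_bps / (width * height * fps)

# A 1080p, 30 fps stream encoded at 2 Mbps spends only ~0.032 bits
# per pixel, which suggests heavy compression.
bpp = bits_per_pixel(2_000_000, 1920, 1080, 30)
print(f"{bpp:.3f}")  # 0.032
```

Very low values of this ratio are a hint, not proof, that the encoder is discarding detail the analytics would need.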

Low Camera Mounting Position

Cameras mounted below 6.5 meters are typically too low to differentiate road users effectively. This is a line-of-sight problem: if an object cannot be seen in the video, it cannot be detected or classified. Low mounting heights leave many road users obscured by the vehicles in front of them in the camera view. Larger areas of interest require higher camera mounts than smaller locations; while 6.5 meters is the minimum height, 10 to 12 meters is ideal.
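The effect of mounting height on occlusion can be estimated with similar triangles: a camera at height h looking over a vehicle of height H standing d metres from the camera base cannot see a strip of ground of length d * H / (h - H) behind that vehicle. A minimal sketch of this geometry (the 4 m vehicle height and 20 m distance are illustrative assumptions, not figures from the text):

```python
def occluded_length(cam_height_m: float, veh_height_m: float, veh_dist_m: float) -> float:
    """Length of ground hidden behind a vehicle, by similar triangles.

    A camera at height h looks over a vehicle of height H standing d metres
    from the camera base; the shadowed strip behind it is d * H / (h - H).
    """
    if cam_height_m <= veh_height_m:
        return float("inf")  # camera cannot see over the vehicle at all
    return veh_dist_m * veh_height_m / (cam_height_m - veh_height_m)

# A 4 m tall truck 20 m from the camera base:
print(round(occluded_length(6.5, 4.0, 20.0), 1))   # 32.0 m hidden at the 6.5 m minimum
print(round(occluded_length(12.0, 4.0, 20.0), 1))  # 10.0 m hidden at the preferred 12 m
```

Under these assumed dimensions, raising the camera from 6.5 m to 12 m shrinks the hidden zone behind a single truck from roughly 32 m to 10 m, which is why taller mounts matter for large areas of interest.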

Low Video Frame Rate

Capturing video at less than 10 frames per second decreases the accuracy with which road user positions can be measured in time and space. Without accurate trajectory measurements, the identification of near-miss events suffers.
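The frame-rate requirement can be made concrete by looking at how far a road user travels between consecutive frames, since that gap bounds the spatial precision of any single detection. A small sketch; the speeds in the example are illustrative:

```python
def metres_per_frame(speed_kmh: float, fps: float) -> float:
    """Distance a road user travels between consecutive video frames."""
    return (speed_kmh / 3.6) / fps

# At 50 km/h, 10 fps still leaves ~1.4 m between position samples,
# while 5 fps doubles that gap to ~2.8 m.
print(round(metres_per_frame(50, 10), 2))  # 1.39
print(round(metres_per_frame(50, 5), 2))   # 2.78
```

Gaps of several metres between samples make it hard to time the close approaches that define a near-miss, which is why 10 fps is treated as a floor rather than a target.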

View Obstruction

If the camera view is partially obstructed by a large pole, sign posts, vegetation, etc., the object detection algorithm will fail to capture the obstructed road users. A simple rule: if the human eye cannot see it in the footage, computer vision will not see it either.

About the Author

Marzieh (Niloofar) Aliakbarzadeh is a computer vision/artificial intelligence engineer and Manager of the computer vision team at AMAG. She specializes in AI-based techniques for faster and more efficient video and data analytics.