Real-Time Multiple Object Visual Tracking for Embedded GPU Systems

Title: Real-Time Multiple Object Visual Tracking for Embedded GPU Systems
Authors: M. Fernández-Sanjurjo, M. Mucientes, V.M. Brea
Type: Journal article
Source: IEEE Internet of Things Journal, IEEE, Vol. 8, pp. 9177-9188, 2021.
Rank: Provisionally ranked Q1 in Hardware and Architecture by CiteScore 2020
DOI: 10.1109/JIOT.2021.3056239
Abstract: Real-time visual object tracking provides every object of interest with a unique identity and a trajectory across video frames. This is a fundamental task in many video analytics applications, such as traffic monitoring and video surveillance in general. Developing real-time multiple object tracking systems on low-power edge devices such as IoT nodes, without compromising accuracy, is a challenge due to the limited computing capacity of these devices. This might rule out the best-in-class computer vision solutions, which nowadays are based on deep learning and are therefore very hardware-demanding. This paper meets this challenge with a multiple object detection and tracking system that employs cutting-edge deep learning architectures on an embedded GPU while operating in real time. For this purpose, a system has been designed that extends a joint tracking and detection architecture by adding a module composed of appearance-based and movement-based trackers, which maintain the identity of the objects of interest for longer periods of time while alleviating the burden on the detector. Our system is mapped onto an embedded GPU platform, cutting down power consumption significantly with respect to a server GPU. The system achieves a Multiple Object Tracking Accuracy (MOTA) of 51.1% on the MOT16 dataset. This, in conjunction with a real-time processing speed of 25.2 FPS for up to 45 simultaneous objects and a low power consumption of 15 W, makes our system an ideal solution for a wide range of video analytics applications.
Keywords: Real-time systems, Feature extraction, Detectors, Deep learning, Hardware, Computer vision, Computer architecture