STDnet-ST: Spatio-Temporal ConvNet for Small Object Detection

Title: STDnet-ST: Spatio-Temporal ConvNet for Small Object Detection
Authors: B. Bosquet, M. Mucientes and V. Brea
Type: Journal article
Source: Pattern Recognition, ELSEVIER SCI LTD, Vol. 116, No. Early access, pp. 107929, 2021.
Rank: Provisionally ranked Q1 in Software by CiteScore 2020
DOI: https://doi.org/10.1016/j.patcog.2021.107929
Abstract: Object detection through convolutional neural networks is reaching unprecedented levels of precision. However, a detailed analysis of the results shows that the accuracy in the detection of small objects is still far from satisfactory. A recent trend that will likely improve overall object detection success is to combine spatial information with the temporal information available in video. This paper introduces STDnet-ST, an end-to-end spatio-temporal convolutional neural network for small object detection in video. We define small objects as those under 16 × 16 px, where the features become less distinctive. STDnet-ST is an architecture that detects small objects over time and correlates pairs of the top-ranked regions with the highest likelihood of containing those small objects. This allows the small objects to be linked across time as tubelets. Furthermore, we propose a procedure to dismiss unprofitable object links in order to provide high-quality tubelets, which increases accuracy. STDnet-ST is evaluated on the publicly accessible USC-GRAD-STDdb, UAVDT and VisDrone2019-VID video datasets, where it achieves state-of-the-art results for small objects.
Keywords: small object detection, spatio-temporal convolutional network, object linking
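
The linking idea in the abstract can be illustrated with a toy sketch. This is not the STDnet-ST method: the paper correlates top-ranked regions inside the network, whereas the code below substitutes a plain greedy nearest-center match between consecutive frames. The names `Detection`, `is_small` and `link_tubelets`, the `max_dist` and `min_len` parameters, and the reading of "under 16 × 16 px" as both sides being shorter than 16 px are all assumptions made for illustration. Dropping tubelets shorter than `min_len` is only a crude stand-in for the paper's procedure for dismissing unprofitable links.

```python
from dataclasses import dataclass

SMALL_SIDE_PX = 16  # abstract: "small" means under 16 x 16 px (assumed: both sides)


@dataclass
class Detection:
    """Hypothetical per-frame detection: top-left corner, size, confidence."""
    frame: int
    x: float
    y: float
    w: float
    h: float
    score: float

    @property
    def center(self):
        return (self.x + self.w / 2.0, self.y + self.h / 2.0)


def is_small(det, side=SMALL_SIDE_PX):
    """True when both box sides are strictly under `side` pixels."""
    return det.w < side and det.h < side


def link_tubelets(frames, max_dist=8.0, min_len=2):
    """Greedily link small detections across consecutive frames.

    `frames` is a list of per-frame detection lists, in temporal order.
    A detection extends the closest open tubelet whose last box center lies
    within `max_dist` pixels; otherwise it starts a new tubelet. Tubelets
    shorter than `min_len` are discarded at the end.
    """
    finished, active = [], []
    for dets in frames:
        # Highest-scoring small detections get first pick of open tubelets.
        candidates = sorted((d for d in dets if is_small(d)),
                            key=lambda d: d.score, reverse=True)
        next_active, used = [], set()
        for det in candidates:
            best, best_dist = None, max_dist
            for i, tube in enumerate(active):
                if i in used:
                    continue
                cx, cy = tube[-1].center
                dx, dy = det.center
                dist = ((cx - dx) ** 2 + (cy - dy) ** 2) ** 0.5
                if dist <= best_dist:
                    best, best_dist = i, dist
            if best is None:
                next_active.append([det])
            else:
                used.add(best)
                next_active.append(active[best] + [det])
        # Tubelets that found no match in this frame are closed.
        finished.extend(t for i, t in enumerate(active) if i not in used)
        active = next_active
    finished.extend(active)
    return [t for t in finished if len(t) >= min_len]
```

Center distance is used here instead of box overlap because, for boxes only a few pixels wide, a shift of a few pixels can drive the overlap to zero even when the object is clearly the same one.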