Title | STDnet-ST: Spatio-Temporal ConvNet for Small Object Detection |
---|---|
Authors | B. Bosquet, M. Mucientes and V. Brea |
Type | Journal article |
Source | Pattern Recognition, ELSEVIER SCI LTD, Vol. Early access, No. Early access, pp. 39, 2021. |
Rank | Provisionally ranked Q1 in Software by SJR 2019 |
ISSN | 0031-3203 |
DOI | https://doi.org/10.1016/j.patcog.2021.107929 |
Abstract | Object detection through convolutional neural networks is reaching unprecedented levels of precision. However, a detailed analysis of the results shows that the accuracy in the detection of small objects is still far from satisfactory. A recent trend that will likely improve overall object detection success is to use spatial information alongside temporal video information. This paper introduces STDnet-ST, an end-to-end spatio-temporal convolutional neural network for small object detection in video. We define small as those objects under 16 × 16 px, where the features become less distinctive. STDnet-ST is an architecture that detects small objects over time and correlates pairs of the top-ranked regions with the highest likelihood of containing those small objects. This makes it possible to link the small objects across time as tubelets. Furthermore, we propose a procedure to dismiss unprofitable object links in order to provide high-quality tubelets, which increases accuracy. STDnet-ST is evaluated on the publicly accessible USC-GRAD-STDdb, UAVDT and VisDrone2019-VID video datasets, where it achieves state-of-the-art results for small objects. |
Keywords | small object detection, spatio-temporal convolutional network, object linking |