|Title||STDnet-ST: Spatio-Temporal ConvNet for Small Object Detection|
|Authors||B. Bosquet, M. Mucientes and V. Brea|
|Type||Journal article|
ELSEVIER SCI LTD,
No. Early access,
|Rank||Provisionally ranked Q1 in Software by CiteScore 2020|
|Abstract||Object detection through convolutional neural networks is reaching unprecedented levels of precision. However, a detailed analysis of the results shows that
the accuracy in the detection of small objects is still far from being satisfactory.
A recent trend that will likely improve the overall object detection success is
to use the spatial information operating alongside temporal video information.
This paper introduces STDnet-ST, an end-to-end spatio-temporal convolutional
neural network for small object detection in video. We define small objects as those under 16 × 16 px, where the features become less distinctive. STDnet-ST
is an architecture that detects small objects over time and correlates pairs of the
top-ranked regions with the highest likelihood of containing those small objects.
This makes it possible to link the small objects across time as tubelets. Furthermore,
we propose a procedure to discard unprofitable object links in order to provide
high-quality tubelets, increasing the accuracy. STDnet-ST is evaluated on the
publicly accessible USC-GRAD-STDdb, UAVDT and VisDrone2019-VID video
datasets, where it achieves state-of-the-art results for small objects.|
|Keywords||small object detection, spatio-temporal convolutional network, object linking|