USC-GRAD-STDdb: Small Target Detection database
By Brais Bosquet, Manuel Mucientes, and Víctor Manuel Brea at CiTIUS, and David de la Iglesia, Raquel Dosil, and Daniel González at Gradiant.
Introduction
The Small Target Detection database (USC-GRAD-STDdb) is a set of annotated video segments retrieved from YouTube. USC-GRAD-STDdb is maintained by the Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS) of the University of Santiago de Compostela (USC), Spain, and by the Centro Tecnológico de Telecomunicaciones de Galicia (Gradiant), Spain.
To the best of our knowledge, USC-GRAD-STDdb is the first database that contains enough small objects (objects smaller than ≈16x16 pixels) to train and test small object detection frameworks.
Citing USC-GRAD-STDdb
If you find USC-GRAD-STDdb useful in your research, please consider citing:
@InProceedings{Bosquet18_bmvc,
  author    = {B. Bosquet and M. Mucientes and V. Brea},
  title     = {{STDnet}: A {ConvNet} for Small Target Detection},
  booktitle = {Proceedings of the 29th British Machine Vision Conference},
  year      = {2018},
  address   = {Newcastle ({UK})}
}
Contents
- Request the database
- Database description
- Annotation format
- Requirements
- Usage
- Info
Request the database
Because the annotations in the database cannot be used for commercial purposes, we do not make them publicly available on our site. If you are a researcher who wishes to obtain a copy of the annotations for educational or non-commercial use, we may provide you with access to the data.
Please fill out the End User License Agreement (EULA) and send it, filled in and signed, to the contact email addresses. You will receive an answer as soon as possible.
Database description
USC-GRAD-STDdb comprises 115 video segments containing more than 25,000 annotated frames at HD 720p resolution (≈1280x720), with small objects of interest whose pixel area ranges from 16 (≈4x4) to 256 (≈16x16). The length of the videos ranges from 150 to 500 frames. The size of every object is determined by its bounding box, so accurate annotation is of utmost importance for reliable performance metrics; naturally, the smaller the object, the harder the annotation. The annotation has been carried out with the ViTBAT tool, fitting the boxes as tightly as possible to the objects of interest in each video frame. In total, more than 56,000 ground truth labels have been generated.
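For illustration, the pixel-area bounds above translate into a simple bounding-box check. The following minimal sketch is our own (the function name and constants are not part of the database tooling):

# Keep only objects within the small-object pixel-area range of
# USC-GRAD-STDdb: from 16 (~4x4) to 256 (~16x16) pixels.
MIN_AREA = 16
MAX_AREA = 256

def is_small_object(width, height):
    """Return True if a bounding box falls in the small-object range."""
    return MIN_AREA <= width * height <= MAX_AREA

# Example: a 10x12 box (area 120) qualifies; a 20x20 box (area 400) does not.
assert is_small_object(10, 12)
assert not is_small_object(20, 20)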
The figure below shows some samples of USC-GRAD-STDdb video annotations:
As there are many potential small object candidates, we restrict the annotation to targets that can potentially move, even if they remain still in a given frame or set of frames. Finally, the majority of the videos in USC-GRAD-STDdb are recorded by drones or from a bird's-eye view, covering three main landscapes and five object categories, namely:
- Air: drone, bird (57 video segments, 12,139 frames)
- Sea: boat (28 video segments, 7,099 frames)
- Land: vehicle, person (30 video segments, 6,619 frames)
Annotation format
The annotations are stored in .txt files under std_database/(video_name)/ folders. There is an annotation file for each video frame that contains at least one object. Each line of the file corresponds to a different small object with the following format:
id,x,y,width,height,category
where id is the id of the object in this video; x and y are the coordinates of the upper-left corner of the bounding box; width and height are the bounding box extent in x and y, respectively; and category is the numeric category of the object (a parsing sketch follows the list below):
- 1: drone
- 2: boat
- 3: vehicle
- 4: person
- 5: bird
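For illustration, a minimal parser for one of these per-frame files might look as follows; the file path in the usage comment is hypothetical, and only the comma-separated format and category codes above come from the database description:

CATEGORIES = {1: "drone", 2: "boat", 3: "vehicle", 4: "person", 5: "bird"}

def parse_annotation_file(path):
    """Parse one per-frame file with lines: id,x,y,width,height,category."""
    objects = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            obj_id, x, y, width, height, category = [int(v) for v in line.split(",")]
            objects.append({
                "id": obj_id,                      # object id within this video
                "bbox": (x, y, width, height),     # upper-left corner plus extent
                "category": CATEGORIES[category],  # numeric code to name
            })
    return objects

# Hypothetical usage (path for illustration only):
# objects = parse_annotation_file("std_database/video_001/000001.txt")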
Additionally, when the database is generated and split into training and testing sets, the data is exported to the COCO dataset format for the object detection task, which is compatible with the COCO API for loading, parsing, and visualizing annotations.
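As a rough sketch of what that export entails, one annotation line can be mapped to a COCO-style annotations entry as below; the id bookkeeping is an assumption for illustration, not the repository's actual logic:

def to_coco_annotation(ann_id, image_id, x, y, width, height, category_id):
    """Map one annotation line to an entry of the COCO 'annotations' list."""
    return {
        "id": ann_id,                   # unique annotation id across the split
        "image_id": image_id,           # id of the frame in the COCO 'images' list
        "category_id": category_id,     # 1=drone ... 5=bird, as listed above
        "bbox": [x, y, width, height],  # COCO boxes are [x, y, width, height]
        "area": width * height,         # pixel area of the bounding box
        "iscrowd": 0,                   # every target is an individual object
    }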
Requirements
Python 2.7 with OpenCV (> 2.9) and youtube_dl.
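If needed, youtube_dl can typically be installed from PyPI:

pip install youtube_dl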
Usage
Run the dl-dataset.py script to download the YouTube segments for each annotation:

cd $STDdb_ROOT
python dl-dataset.py
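For reference, the core of that download step could be sketched with the youtube_dl Python API as follows; the format selector, output template, and URL are illustrative assumptions, not the actual values used by dl-dataset.py:

import youtube_dl

# Illustrative only: fetch one annotated YouTube video at up to 720p and
# name the file after its video id; dl-dataset.py drives this per segment.
opts = {
    "format": "best[height<=720]",                    # match the HD 720p frames
    "outtmpl": "std_database/%(id)s/%(id)s.%(ext)s",  # hypothetical layout
}
with youtube_dl.YoutubeDL(opts) as ydl:
    ydl.download(["https://www.youtube.com/watch?v=VIDEO_ID"])  # placeholder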
Run the createDB_train-test.py script to split the annotations into train and test sets:

cd $STDdb_ROOT
python createDB_train-test.py
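Conceptually, the split operates at the level of video segments. A minimal sketch of such a split (the directory layout, 80/20 ratio, and fixed seed are assumptions for illustration) could be:

import os
import random

def split_videos(root="std_database", train_ratio=0.8, seed=0):
    """Randomly split the video folders into train and test sets."""
    videos = sorted(d for d in os.listdir(root)
                    if os.path.isdir(os.path.join(root, d)))
    random.Random(seed).shuffle(videos)  # fixed seed keeps the split reproducible
    n_train = int(len(videos) * train_ratio)
    return videos[:n_train], videos[n_train:]

# Hypothetical usage:
# train_videos, test_videos = split_videos()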
Done
Info
- Researchers:
  - Manuel Mucientes Molina
  - Víctor Manuel Brea Sánchez
  - Brais Bosquet Mera