USC-GRAD-STDdb: Small Target Detection database
By Brais Bosquet, Manuel Mucientes, and Víctor Manuel Brea at CiTIUS, and David de la Iglesia, Raquel Dosil, and Daniel González at Gradiant.
Introduction
The Small Target Detection database (USC-GRAD-STDdb) is a set of annotated video segments retrieved from YouTube. USC-GRAD-STDdb is held by the Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS) of the University of Santiago de Compostela (USC), Spain, and by the Centro Tecnológico de Telecomunicaciones de Galicia (Gradiant), Spain.
To the best of our knowledge, USC-GRAD-STDdb is the first database with a large enough number of small objects (objects smaller than ≈16x16 pixels) to train and test small object detection frameworks.
Citing USC-GRAD-STDdb
If you find USC-GRAD-STDdb useful in your research, please consider citing:
@InProceedings{Bosquet18_bmvc,
author = {B. Bosquet and M. Mucientes and V. Brea},
title = {{STDnet}: A {ConvNet} for Small Target Detection},
booktitle = {Proceedings of the 29th British Machine Vision Conference},
year = {2018},
address = {Newcastle ({UK})}
}
Contents
Request the database
Because the annotations in the database cannot be used for commercial purposes, we do not make them publicly available on our site. If you are a researcher who wishes to obtain a copy of the annotations for educational or non-commercial use, we may grant you access to the data.
Please fill out and sign the End User License Agreement (EULA) and send it to these email addresses. You will be answered as soon as possible.
Database description
USC-GRAD-STDdb comprises 115 video segments containing more than 25,000 annotated frames at HD 720p resolution (≈1280x720), with small objects of interest whose pixel area ranges from 16 (≈4x4) to 256 (≈16x16). The length of the videos ranges from 150 to 500 frames. The size of every object is determined by its bounding box, so accurate annotation is of utmost importance for reliable performance metrics; naturally, the smaller the object, the harder the annotation. The annotation was carried out with the ViTBAT tool, fitting the boxes as tightly as possible to the objects of interest in each video frame. In total, more than 56,000 ground truth labels have been generated.
The figure below shows some samples of USC-GRAD-STDdb video annotations:
As there are many potential small object candidates, we restrict the annotation to targets that can potentially move, even if they remain still in a given frame or set of frames. Most of the videos in USC-GRAD-STDdb are recorded by drones or with a bird's-eye view over three main landscapes and five object categories, namely:
- Air: drone, bird (57 video segments, 12,139 frames)
- Sea: boat (28 video segments, 7,099 frames)
- Land: vehicle, person (30 video segments, 6,619 frames)
Annotations format
The annotations are stored in `.txt` files under `std_database/(video_name)/` folders. There is an annotation file for each video frame that contains at least one object. Each line of the file corresponds to a different small object, with the following format (a parsing sketch follows the category list below):

`id,x,y,width,height,category`

where `id` is the identifier of the object within the video; `x` and `y` are the coordinates of the upper-left corner of the bounding box; `width` and `height` are the bounding box extents along `x` and `y`, respectively; and `category` is the category of the object:
- 1: drone
- 2: boat
- 3: vehicle
- 4: person
- 5: bird
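For clarity, here is a minimal parsing sketch for a single annotation file. The function name and the example path are ours, and we assume integer pixel values; this is not part of the database scripts.

```python
# Illustrative sketch (not part of the database tools): parse one annotation
# .txt file into a list of Python dicts, assuming integer pixel values.
def parse_annotation_file(path):
    objects = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            obj_id, x, y, width, height, category = line.split(",")
            objects.append({
                "id": int(obj_id),
                "bbox": (int(x), int(y), int(width), int(height)),  # upper-left x, y, width, height
                "category": int(category),  # 1: drone, 2: boat, 3: vehicle, 4: person, 5: bird
            })
    return objects

# Hypothetical example path; actual frame file names depend on the video.
# annotations = parse_annotation_file("std_database/some_video/000123.txt")
```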
Additionally, when the database is generated and split into training and test sets, the data is exported to the COCO object detection format, which is compatible with the COCO API for loading, parsing, and visualizing annotations.
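Since the export follows the COCO object detection format, it can be read with the standard COCO API (pycocotools). Below is a minimal sketch; the JSON path is an assumption, so use whatever file the split script produces on your setup.

```python
# Minimal sketch of reading the exported COCO-format annotations with the
# COCO API (pycocotools). The JSON path below is an assumption; use the file
# produced by createDB_train-test.py on your machine.
from pycocotools.coco import COCO

coco = COCO("std_database/annotations/instances_train.json")  # hypothetical path
img_ids = coco.getImgIds()
ann_ids = coco.getAnnIds(imgIds=img_ids[:1])
for ann in coco.loadAnns(ann_ids):
    # COCO bounding boxes are [x, y, width, height] in pixels
    print(ann["category_id"], ann["bbox"])
```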
Requirements
Python 2.7 with OpenCV (> 2.9) and youtube_dl.
Usage
Run the `dl-dataset.py` script to download the YouTube segments for each annotation:

```
cd $STDdb_ROOT
python dl-dataset.py
```
Run the `createDB_train-test.py` script to split the annotations into train and test sets:

```
cd $STDdb_ROOT
python createDB_train-test.py
```
Done
Information
- Researchers
  - Manuel Mucientes Molina
  - Víctor Manuel Brea Sánchez
  - Brais Bosquet Mera