COVID-SC database
By Daniel Cores, Nicolás Vila, Manuel Mucientes, María José Carreira at CiTIUS and María Pérez Alarcón, Anxo Martínez de Alegría Alonso, Ana Castiñeira Estévez, Ana Ecenarro Montiel, Paula Sucasas Hermida, Diogo Miguel Machado Pereira at CHUS.
Introduction
The COVID-SC database is a set of 1,084 chest X-ray (CXR) images intended for research on COVID-19 detection. It is held by the Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS) of the University of Santiago de Compostela (USC) and by the Complexo Hospitalario Universitario de Santiago (CHUS).
To the best of our knowledge, this database represents the most realistic clinical scenario for several reasons: it contains images acquired with the same type of portable device in different care units; it includes not only the RT-PCR diagnosis, but also the lung severity according to the RALE score, which allows for a fine-grained analysis; and it provides negative images from patients affected by other observable diseases, classified into those related to COVID-19 and those that are not.
Request the database
If you are a researcher who wishes to have a copy of both the images and the associated metadata for educational or non-commercial use, we may provide you access to the data.
Please fill out the End User License Agreement (EULA) and send the completed, signed copy to this email address. We will reply as soon as possible.
Database description
The COVID-SC database comprises 1,084 CXR images organized into categories based on the RT-PCR result (obtained within 24 hours of the image acquisition) and the visual diagnosis labeled by a group of six trained thoracic radiologists. Every image was acquired using the same type of portable device, and only the anteroposterior (AP) view was considered.
The images from patients who tested negative were distributed into three classes according to the experts' findings: images of normal lungs, images where the lungs are affected by COVID-related conditions, such as pneumonia and interstitial lung disease, and images where the lungs are affected by other diseases.
The images from patients who tested positive were categorized by severity using the RALE system, which yields a score from 0 to 8. The score is then translated into a category as follows:
RALE score | Severity |
---|---|
0 | Normal |
1 | Mild |
2 | Mild |
3 | Moderate |
4 | Moderate |
5 | Moderate |
6 | Moderate |
7 | Severe |
8 | Severe |
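The mapping in the table above can be expressed as a small lookup function. This is only a minimal sketch: the function name `rale_to_severity` is hypothetical and is not part of the database or of any official tooling, and it assumes the score is supplied as an integer.

```python
def rale_to_severity(score: int) -> str:
    """Translate a RALE score (0 to 8) into the severity category of the table."""
    # Reject booleans explicitly, since bool is a subclass of int in Python.
    if isinstance(score, bool) or not isinstance(score, int):
        raise TypeError("RALE score must be an integer")
    if not 0 <= score <= 8:
        raise ValueError("RALE score must be between 0 and 8")
    if score == 0:
        return "Normal"
    if score <= 2:
        return "Mild"
    if score <= 6:
        return "Moderate"
    return "Severe"
```

For instance, `rale_to_severity(4)` returns `"Moderate"`, matching the row for score 4.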
Distribution of images
RT-PCR result | Images (result total) | Visual diagnosis | Images (class) | Label |
---|---|---|---|---|
Negative | 436 | Normal lungs | 227 | N_NORMAL |
Negative | 436 | Covid-related diseases | 76 | N_RELATED |
Negative | 436 | Other diseases | 133 | N_OTHER |
Positive | 648 | Normal lungs | 115 | P_NORMAL |
Positive | 648 | Mild condition | 141 | P_MILD |
Positive | 648 | Moderate condition | 290 | P_MODERATE |
Positive | 648 | Severe condition | 102 | P_SEVERE |
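As a sanity check after obtaining the data, the per-class counts above can be compared against the labels actually found in a local copy. The sketch below is illustrative only: the counts are copied from the table, while `EXPECTED_COUNTS` and `check_distribution` are hypothetical names introduced here, not something shipped with the database.

```python
from collections import Counter

# Per-class image counts copied from the distribution table.
EXPECTED_COUNTS = {
    "N_NORMAL": 227,
    "N_RELATED": 76,
    "N_OTHER": 133,
    "P_NORMAL": 115,
    "P_MILD": 141,
    "P_MODERATE": 290,
    "P_SEVERE": 102,
}


def check_distribution(labels):
    """Return {label: (found, expected)} for every label whose count differs."""
    found = Counter(labels)
    mismatches = {}
    for label in sorted(set(EXPECTED_COUNTS) | set(found)):
        n_found = found.get(label, 0)
        n_expected = EXPECTED_COUNTS.get(label, 0)
        if n_found != n_expected:
            mismatches[label] = (n_found, n_expected)
    return mismatches
```

An empty dictionary means the local labels agree with the published distribution. The seven counts add up to the 1,084 images stated in the description (436 negative and 648 positive).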
Examples
Negative categories
N_NORMAL | N_RELATED (Bilateral pneumonia) | N_OTHER (Interstitial disease) |
Positive categories
P_NORMAL | P_MILD | P_MODERATE | P_SEVERE |
Annotations file
A CSV metadata file is provided along with the folder of images. It contains one row per image, with the following fields:
- filename (string) → The name of the file stored in the images folder
- rt-pcr (string) → The result of the RT-PCR test. It can take two values: NEGATIVE or POSITIVE
- label (string) → The label of the corresponding image, according to the table in the previous section. It can take seven values: N_NORMAL, N_RELATED, N_OTHER, P_NORMAL, P_MILD, P_MODERATE or P_SEVERE
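The fields listed above could be read and validated along the following lines. This is a hedged sketch, not an official loader. It assumes the file is comma-separated, that its header row uses exactly the field names `filename`, `rt-pcr` and `label`, and that the `N_`/`P_` label prefix always agrees with the RT-PCR result, as the distribution table suggests. The function name `read_annotations` and the sample file names are made up for illustration.

```python
import csv
import io

VALID_RT_PCR = {"NEGATIVE", "POSITIVE"}
VALID_LABELS = {
    "N_NORMAL", "N_RELATED", "N_OTHER",
    "P_NORMAL", "P_MILD", "P_MODERATE", "P_SEVERE",
}


def read_annotations(handle):
    """Read annotation rows from an open text file and validate each one."""
    rows = []
    # Data rows start on line 2, since line 1 is assumed to be the header.
    for line_no, row in enumerate(csv.DictReader(handle), start=2):
        rt_pcr, label = row["rt-pcr"], row["label"]
        if rt_pcr not in VALID_RT_PCR:
            raise ValueError(f"line {line_no}: unexpected rt-pcr value {rt_pcr!r}")
        if label not in VALID_LABELS:
            raise ValueError(f"line {line_no}: unexpected label {label!r}")
        # Assumption: N_* labels belong to NEGATIVE rows and P_* to POSITIVE rows.
        if label[0] != rt_pcr[0]:
            raise ValueError(f"line {line_no}: label {label} does not match {rt_pcr}")
        rows.append(row)
    return rows


# In-memory sample with invented file names, standing in for the real CSV.
sample = io.StringIO(
    "filename,rt-pcr,label\n"
    "example_0001.png,NEGATIVE,N_NORMAL\n"
    "example_0002.png,POSITIVE,P_MILD\n"
)
for row in read_annotations(sample):
    print(row["filename"], row["rt-pcr"], row["label"])
```

With the real file, the same function can be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the metadata file was saved.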
Info
- Researchers
- Daniel Cores Costa
- Nicolás Vila Blanco
- Manuel Mucientes Molina
- María José Carreira Nouche
- Anxo Martínez de Alegría Alonso
- Ana Castiñeira Estévez
- Ana Ecenarro Montiel
- Paula Sucasas Hermida
- Diogo Miguel Machado Pereira
- María Pérez Alarcón