Fast SVC for large-scale classification problems

Title: Fast SVC for large-scale classification problems
Authors: Ziad Akram Ali Hammouri, Manuel Fernández Delgado, Eva Cernadas, Senén Barro
Type: Journal article
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society, pp. 12, 2021.
Rank: Provisionally ranked Q1 in Applied Mathematics by CiteScore 2020
Abstract: The Support Vector Machine (SVM) is a state-of-the-art classifier that, for large datasets, is very slow and requires much memory. To overcome this deficiency, we propose the Fast Support Vector Classifier (FSVC), which includes: 1) an efficient closed-form training without numerical procedures; 2) a small collection of class prototypes instead of support vectors; and 3) a fast method that selects the spread of the radial basis function kernel directly from the data. Its storage requirements are very low and can be adjusted to the available memory, so it can classify datasets of arbitrarily large size (31 million patterns, 30,000 inputs and 131 classes in less than 1.5 hours). The FSVC uses 12 times less memory than Liblinear, which fails on the 4 largest datasets due to lack of memory, and is one and two orders of magnitude faster than Liblinear and Libsvm, respectively. Comparing performance, FSVC is 4.1 points above Liblinear and only 6.7 points below Libsvm. The time spent by FSVC depends only on the dataset size (6·10^-7 sec. per pattern, input and class) and can be accurately estimated for new datasets, while the time spent by Libsvm and Liblinear depends on the dataset difficulty. Code is provided.
Keywords: Classification, large-scale datasets, SVM, closed-form training, model selection
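The abstract describes classifying with a small collection of class prototypes and an RBF kernel rather than support vectors. The following is a minimal sketch of that general idea only: the functions, the choice of one per-class mean as prototype, and the argmax decision rule are illustrative assumptions, not the paper's actual FSVC training or model-selection procedure.

```python
import numpy as np

def rbf(x, p, gamma):
    """RBF kernel value between pattern x and prototype p."""
    return np.exp(-gamma * np.sum((x - p) ** 2))

def fit_prototypes(X, y, n_classes):
    """Hypothetical prototype construction: one per-class mean (assumption,
    not the paper's closed-form training)."""
    return np.array([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(X, prototypes, gamma=1.0):
    """Assign each pattern to the class whose prototype yields the
    highest kernel similarity."""
    scores = np.array([[rbf(x, p, gamma) for p in prototypes] for x in X])
    return scores.argmax(axis=1)

# Toy example: two well-separated 2-D classes.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y = np.array([0, 0, 1, 1])
protos = fit_prototypes(X, y, 2)
print(predict(X, protos))  # → [0 0 1 1]
```

Because the model stores only a fixed number of prototypes per class instead of a data-dependent set of support vectors, its memory footprint is decoupled from the dataset size, which is consistent with the storage behaviour the abstract reports.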