Sparse Matrix Classification on Imbalanced Datasets using Convolutional Neural Networks

This paper addresses the class imbalance problem in the automatic selection of the best storage format for a sparse matrix, with the aim of maximizing the performance of Sparse Matrix Vector Multiplication (SpMV) on GPUs. Our classification method is based on Convolutional Neural Networks (CNNs), and we propose several solutions to mitigate the bias towards the majority classes when the data are not balanced. First, the CNNs are trained on images that represent the sparsity pattern of the matrices, with pixels colored according to different matrix features. In addition, we introduce a new network called SpNet, which achieves better prediction accuracy than a standard network such as AlexNet despite having a simpler architecture. Finally, we study sampling techniques and cost-sensitive methods to place more emphasis on the minority classes. Our experiments show that the classifiers select the best performing format 92.8% of the time, obtaining 98.3% of the maximum attainable SpMV performance. A comparison with other state-of-the-art classification methods is also provided, demonstrating the benefits of our proposal.

keywords: Sparse matrix, Classification, Imbalance, Deep Learning, CNN, Performance
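
As an illustration of two ideas summarized in the abstract, the following minimal sketch shows one possible way to turn a sparsity pattern into a fixed-size image and one common form of cost-sensitive class weighting. It is not the implementation used in this work: the function names `sparsity_image` and `inverse_frequency_weights`, the choice of nonzero density as the only pixel feature, the image size, and the inverse-frequency weighting rule are all assumptions made for the example.

```python
import numpy as np


def sparsity_image(rows, cols, shape, size=8):
    """Downsample a sparsity pattern to a size x size grayscale image.

    rows, cols: coordinates of the nonzero entries (COO-style).
    shape: (n_rows, n_cols) of the matrix.
    Pixel intensity is the nonzero count of the matching block, scaled
    to [0, 1]. Density is an assumed feature chosen for illustration.
    """
    n_rows, n_cols = shape
    rows = np.asarray(rows, dtype=np.int64)
    cols = np.asarray(cols, dtype=np.int64)
    # Map each nonzero coordinate to the pixel (block) that contains it.
    pixel_r = rows * size // n_rows
    pixel_c = cols * size // n_cols
    counts = np.zeros((size, size), dtype=np.float64)
    # np.add.at accumulates correctly when several nonzeros hit one pixel.
    np.add.at(counts, (pixel_r, pixel_c), 1.0)
    peak = counts.max()
    return counts / peak if peak > 0 else counts


def inverse_frequency_weights(labels, n_classes):
    """Per-class weights inversely proportional to class frequency.

    Classes that never occur get weight 0. This is one common
    cost-sensitive scheme, assumed here for illustration.
    """
    counts = np.bincount(np.asarray(labels), minlength=n_classes).astype(np.float64)
    weights = np.zeros(n_classes, dtype=np.float64)
    present = counts > 0
    weights[present] = counts.sum() / (present.sum() * counts[present])
    return weights


# Usage: a 16 x 16 diagonal matrix and an imbalanced label set.
idx = np.arange(16)
image = sparsity_image(idx, idx, (16, 16), size=4)
weights = inverse_frequency_weights([0, 0, 0, 1], n_classes=2)
```

In a training loop, weights of this kind would typically be passed to a weighted loss function so that errors on minority classes cost more than errors on majority classes.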