On the road to a unified Big Data and HPC framework

Title: On the road to a unified Big Data and HPC framework
Authors: 1- César Alfredo Piñeiro Pomar
Type: Conference paper
Source: 23rd IEEE International Parallel and Distributed Processing Symposium, Rome, Italy, 2021.
Abstract: The de facto standards for parallel processing of Big Data are the Apache Hadoop and Apache Spark engines, which require applications to be implemented in Python, Java or Scala using programming paradigms such as MapReduce. HPC applications, however, have traditionally been implemented in Fortran and C/C++ in order to exploit the multithreading (OpenMP) and multiprocessing (MPI) capabilities of clusters. As a consequence, this divergence between HPC and Big Data languages and programming models hinders the creation of applications that bring together the advantages of the HPC and Big Data worlds [2]. To deal with this issue we introduce Ignis, a new Big Data-HPC framework that allows the execution of applications that combine multiple programming languages without additional overhead. Our framework uses a multi-language RPC approach to create a native executor for each language, together with a modular design pattern that facilitates the inclusion of new languages. All Ignis communications are internally implemented using MPI collective operations, which allows users to execute native MPI applications on an Ignis cluster. Unlike previous works, our proposal is a step forward in the convergence of HPC and Big Data, since applications belonging to both worlds can be executed efficiently in the same framework.
Keywords: Big Data, Multi-language, Performance, Scalability, Container