OPTIMIZA: Optimization of irregular applications on high performance CPU/GPU emerging architectures

The industry's transition to multicore processors has been one of the major milestones in the history of computing. Nevertheless, improvements in processor power have not been matched by improvements in application performance. Moreover, the architectural configurations of current multicore systems are so diverse that each architecture needs its own optimization techniques.

One of the main reasons for this diversity is the difficulty of balancing memory and processor performance. Processor performance doubles roughly every year and a half, whereas memory needs about ten years to achieve a similar improvement. In multicore architectures, the industry trend is to increase the number of cores rather than the memory bandwidth, for cost and efficiency reasons. The memory hierarchy will therefore remain the key factor in the performance of future multicore applications.

Irregular applications are a particularly difficult case. They do not satisfy the principle of locality on which the memory hierarchy relies, so their performance is much lower than that of standard applications (about 10% of peak performance). Irregular applications are among the most in demand in the scientific community; they include microelectronic device simulations, fluid mechanics and n-body problems (astrophysics, molecular dynamics, etc.).
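The loss of locality can be made concrete with a small sketch. It compares the average distance between consecutive memory positions touched by a regular loop and by an indirect access through an index array. The function name `mean_stride` and the index values are hypothetical, chosen only for illustration; they do not come from the project.

```python
def mean_stride(indices):
    """Average absolute distance between consecutive accessed positions."""
    jumps = [abs(b - a) for a, b in zip(indices, indices[1:])]
    return sum(jumps) / len(jumps)

# Regular access: a dense loop touches x[0], x[1], x[2], ...
regular = list(range(8))

# Irregular access: positions come from an index array (hypothetical values),
# as in x[col_idx[k]] for a sparse matrix or a neighbour list.
col_idx = [5, 0, 7, 2, 6, 1, 4, 3]

regular_stride = mean_stride(regular)      # consecutive elements, stride 1
irregular_stride = mean_stride(col_idx)    # jumps scattered across the array

print(regular_stride, irregular_stride)
```

A larger average stride means that consecutive accesses are less likely to fall in the same cache line, which is one simple way to see why caches help irregular codes less.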

Objectives

The general goal of this project is to investigate the parallelization and optimization of irregular applications in the context of what is expected to become the market standard in the coming years: hybrid multicore CPU/GPU systems. The project has the following specific objectives:

To evaluate GPUs and their programming models for the development and optimization of irregular applications on high performance computing systems.
To extend the memory hierarchy models previously developed by the research team so that they cover hybrid multicore/GPU architectures.
To develop tools that exploit the memory hierarchy in irregular applications and so make these systems easier to program. In particular:
- In the case of multicore CPUs, to investigate the development of techniques for automatic page migration.
- In the case of GPUs, to develop a mathematical library for sparse algebra codes.
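As an illustration of the kind of kernel a sparse algebra library typically provides, the following is a minimal sketch of a sparse matrix-vector product. It assumes the compressed sparse row (CSR) format, which the project description does not specify, and it is written in plain Python for readability rather than as GPU code. The function name `spmv_csr` and the sample matrix are hypothetical.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format (assumed format)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect access x[col_idx[k]]: the source of irregularity.
            y[i] += values[k] * x[col_idx[k]]
    return y

# Hypothetical 3x3 matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [7.0, 6.0, 17.0]
```

The rows are independent of one another, so the outer loop is the natural place to parallelize; the reads of `x` through `col_idx` are the part whose memory behaviour depends on the sparsity pattern.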

- Juan Carlos Pichel Campos
- José Carlos Cabaleiro Domínguez, Tomás Fernández Pena, Dora Blanco Heras, Francisco Fernández Rivera