Long gone are the days when simple plots of parallel efficiency or parallel scaling were sufficient to understand performance issues in parallel software. It is widely accepted that analysing parallel performance is difficult, in part because of the complexity of modern parallel software, which often includes a mix of thread and process parallelism as well as GPU use, but also because of the complexity of the data generated by modern performance analysis tools.
To address this challenge, the POP project is developing a methodology that helps code developers analyse trace data more easily and gain meaningful insight into the sources of inefficiency.
This methodology is based on a small set of performance metrics that can be generated easily from trace data and that point to specific code performance issues, such as process or thread load imbalance, MPI data transfer, various types of serialisation, and so on.
This webinar describes two complementary sets of metrics which can be used to identify performance bottlenecks in hybrid MPI + OpenMP software.
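
As an illustration of the kind of metric described above, the following minimal sketch computes three process-level efficiencies from per-process timings. The formulas follow the commonly used POP-style definitions (load balance as average over maximum useful computation time, communication efficiency as maximum useful computation time over runtime, and parallel efficiency as their product), but they are an assumption here because this abstract does not spell them out. The function name `pop_style_metrics` and the sample timings are hypothetical, and the sketch does not cover the hybrid MPI + OpenMP metric sets presented in the webinar.

```python
from statistics import mean


def pop_style_metrics(useful_time, runtime):
    """Compute simple efficiency metrics from per-process timings.

    useful_time: seconds each process spent in useful computation
                 (time outside the parallel runtime), one entry per process.
    runtime:     wall-clock time of the traced region, in seconds.

    The definitions below are assumed, not taken from the abstract.
    """
    if not useful_time:
        raise ValueError("useful_time must contain at least one process")
    if runtime <= 0:
        raise ValueError("runtime must be positive")

    avg_useful = mean(useful_time)
    max_useful = max(useful_time)
    if max_useful <= 0:
        raise ValueError("at least one process must do useful work")

    # 1.0 means every process did the same amount of useful work.
    load_balance = avg_useful / max_useful
    # 1.0 means the busiest process never waited on communication.
    communication_efficiency = max_useful / runtime
    # Overall fraction of the runtime spent in useful computation.
    parallel_efficiency = load_balance * communication_efficiency

    return {
        "load_balance": load_balance,
        "communication_efficiency": communication_efficiency,
        "parallel_efficiency": parallel_efficiency,
    }


# Hypothetical timings for four processes (seconds).
metrics = pop_style_metrics([6.0, 8.0, 10.0, 8.0], runtime=12.5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up numbers, load balance and communication efficiency both come out at 0.8, giving a parallel efficiency of about 0.64. Splitting the overall figure this way shows whether the loss comes from uneven work distribution or from time spent in communication.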