Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications
Afzal, Ayesha, Hager, Georg, Wellein, Gerhard, Markidis, Stefano
arXiv.org Artificial Intelligence
This paper studies the utility of data analytics and machine learning techniques for identifying, classifying, and characterizing the dynamics of large-scale parallel (MPI) programs. To this end, we run microbenchmarks and realistic proxy applications with a regular compute-communicate structure on two different supercomputing platforms and choose the per-process performance and the MPI time per time step as the relevant observables. Using principal component analysis, clustering techniques, correlation functions, and a new "phase space plot," we show how desynchronization patterns (or the lack thereof) can be readily identified from a data set that is much smaller than a full MPI trace. Our methods also lead the way towards a more general classification of parallel program dynamics.
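As a rough illustration of the kind of analysis the abstract describes, the following minimal sketch applies principal component analysis to per-process MPI time per time step. Everything in it is an assumption made for illustration: the data are synthetic, the drift model for the desynchronizing ranks is hypothetical, and the helper `pca_scores` and the sign-based grouping are not taken from the paper.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto their leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                       # center each time step across processes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)               # fraction of variance per component
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
n_procs, n_steps = 16, 200                        # hypothetical run size
steps = np.arange(n_steps)

# Synthetic observable: MPI time per time step (seconds), one row per process.
mpi_time = 0.010 + 0.0005 * rng.standard_normal((n_procs, n_steps))
# Assumed desynchronization: the second half of the ranks accumulates waiting time.
mpi_time[n_procs // 2:] += 0.00005 * steps

scores, explained = pca_scores(mpi_time)

# Crude two-group split along the first principal component.
# The sign of a principal component is arbitrary, so only the grouping matters.
labels = (scores[:, 0] > 0).astype(int)
print("variance explained by PC1:", round(float(explained[0]), 3))
print("group per process:", labels)
```

In this toy setting the two groups of ranks separate along the first principal component, which is the sort of pattern that a clustering step could then pick up. Real measurements would need a proper clustering method and more care than a sign threshold.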
May-27-2022
- Country:
  - Europe
    - Germany > Bavaria > Middle Franconia > Nuremberg (0.04)
    - Sweden > Stockholm > Stockholm (0.04)
    - United Kingdom > England > Cambridgeshire > Cambridge (0.04)
  - North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Genre:
  - Research Report (1.00)
- Technology: