This paper was accepted to the 23rd International Workshop on High-Level Parallel Programming Models and Supportive Environments at the 32nd IEEE International Parallel and Distributed Processing Symposium.
Today's high-performance computing systems provide a wide range of storage technologies, such as HDDs, SSDs, and network devices. With the introduction of NVRAM, these systems become even more heterogeneous and expose a complex I/O stack that is challenging for applications to use. Parallel programs nevertheless have to utilize the available I/O resources efficiently to overcome scalability limitations. Performance analysis tools typically focus on computation efficiency, executed program paths, and communication patterns, and they visualize I/O performance information only for single layers of the I/O stack. To fully understand the I/O behavior of an application, it is necessary to investigate the interaction between the layers.
This work introduces new visualizations of I/O performance events and metrics throughout the complete I/O stack of parallel applications. We implement our approach on top of the performance analysis tool Vampir and extend its timeline visualizations with performance details of I/O operations. Further, we introduce a new timeline view that depicts I/O activities on each layer of the I/O stack in use, as well as the interaction between layers. This view enables application developers to identify I/O bottlenecks across the layers of a complex I/O stack. We demonstrate our I/O performance visualization approach with a case study of a cloud model simulation code, analyzing its I/O behavior in detail, including information from all involved multi-layered I/O libraries.