Adrian Jackson (EPCC) gave this presentation at the 13th Parallel Tools Workshop, Dresden, Germany, 2-3 September 2019.
Abstract: The NEXTGenIO project, which started in 2015 and is co-funded under the European Horizon 2020 R&D funding scheme, was one of the very first projects to investigate the use of Optane DC PMM for the HPC segment in detail. Fujitsu have built a 34-node prototype cluster at EPCC using Intel Xeon Scalable CPUs (Cascade Lake generation), DC PMM (3 TBytes per dual-socket node), and Intel Omni-Path Architecture (a dual-rail fabric across the 34 nodes). A selection of eight pilot applications, ranging from an OpenFOAM use case to the Halvade genomic processing workflow, have been studied in detail, and suitable middleware components have been created so that these applications can use DC PMM effectively. Actual benchmarking with DC PMM is now possible, and this talk will discuss how DC PMM can be used for I/O, large memory capacities, workflow data sharing and storage, and long-term data retention.
Using DC PMM as local storage targets, OpenFOAM and Halvade workflows show a very significant reduction in the I/O time required to pass data between workflow steps and, consequently, significantly reduced runtimes and improved strong scaling. Taking this further, a prototype setup of ECMWF's IFS forecasting system, which combines the actual weather forecast with several dozen post-processing steps, shows the vast potential of DC PMM: forecast data is stored in DC PMM on the nodes running the forecast, while post-processing steps can quickly access this data via the OPA network fabric, and a meteorological archive pulls the data into long-term storage. Compared to traditional system configurations, this scheme brings significant savings in time to completion for the full workflow. Furthermore, we have undertaken work to develop and deploy distributed filesystems that use the DC PMM in compute nodes and enable applications and users to interact with data stored across a set of compute nodes from any one individual compute node. In this contribution we will discuss the functionality and performance that DC PMM can provide, and the integration with systemware and tools that we have undertaken to support this functionality.
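As a rough illustration of the "local storage target" pattern described above, the following minimal Python sketch has one workflow step write its output to a node-local directory and the next step read it back through a memory map. It is not the project's middleware: the `PMEM_DIR` environment variable, the function names, and the file name are all hypothetical, and the sketch assumes the DC PMM is exposed as a mounted filesystem. If `PMEM_DIR` is unset, it falls back to a temporary directory so that it still runs on an ordinary machine.

```python
import mmap
import os
import tempfile

# Assumption: PMEM_DIR points at a filesystem mounted on node-local DC PMM.
# The variable name is hypothetical; without it we fall back to a temp dir.
PMEM_DIR = os.environ.get("PMEM_DIR") or tempfile.mkdtemp()


def producer_step(path: str, payload: bytes) -> None:
    """First workflow step: write intermediate results to node-local storage."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to persist the data before returning


def consumer_step(path: str) -> bytes:
    """Next workflow step: map the file into memory and read the results."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return bytes(view)


if __name__ == "__main__":
    intermediate = os.path.join(PMEM_DIR, "step1.out")
    producer_step(intermediate, b"intermediate field data")
    print(consumer_step(intermediate))
```

In this arrangement the intermediate data never leaves the node between steps, which is the property the workflows above rely on to avoid a round trip through an external parallel filesystem.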