RENCI researchers recently received the 2021 Best Paper Award from the Elsevier Future Generation Computer Systems (FGCS) Journal. The paper, titled “End-to-end online performance data capture and analysis for scientific workflows,” was co-authored by Cong Wang, Anirban Mandal, and collaborators from the DOE Panorama and RAMSES projects.
The FGCS Journal aims to lead the way in advances in distributed systems, collaborative environments, high-performance computing (HPC), and big data on infrastructures such as grids, clouds, and the Internet of Things. Each year, the editorial board awards “Best Paper” to one submission featured in the journal.
The editorial board chose this paper for the award with the following justification: “The paper presents a rare, successful example of end-to-end workflows that builds on cutting-edge HPC and Cloud technologies. Its contributions are transferable across domains and applications. The open access data repository containing experimental traces can support transparency and reproducibility of the presented artifacts.”
With workflows increasingly used for scientific computing and a push towards exascale computing, it has become paramount that researchers can analyze the characteristics of scientific applications to better understand their impact on the underlying infrastructure and vice versa. Such analysis can help drive the design, development, and optimization of next-generation systems and solutions. In the paper, the authors present an architecture, integrated with both well-established and newly developed tools, that collects online performance statistics of workflow executions from various heterogeneous sources and publishes them in a distributed database.
RENCI team members contributed to the design of the overall end-to-end data collection architecture and provisioned and managed the computing and network infrastructure used to evaluate the scientific workflows presented in the paper. They also helped collect and analyze the network performance data for the workflows.
You can read the full paper here.