Abstract
This paper examines the future of performance monitoring on exascale HPC systems. In particular, we argue that such machines will be sufficiently complex that performance monitoring of individual applications, and of the workload as a whole, will change from a beneficial option to a necessity. This complexity arises from the number of components and the degree of concurrency expected for such systems. We see the need for a shift from performance monitoring as a useful add-on toward performance monitoring as a core requirement for basic operation, and we suggest some first steps toward meeting that need.
