In the near future, large-scale computing systems are likely to vary greatly in their execution behavior because of massive component counts, relatively high fault rates, and performance fluctuations caused by power and energy constraints. Electrical costs will be an important factor in the total cost of ownership, so applications and operating systems that minimize energy consumption wherever possible will be in great demand. Optimizing for power and energy will be critical in future systems for two reasons:
- The annual energy cost of large-scale systems will impose fundamental design constraints.
- The end of Dennard scaling means that power density in semiconductor chips is no longer constant with advancing Very Large Scale Integration (VLSI) technology. It will no longer be possible to scale up the number of cores and keep them operational at current clock frequencies. Selectively turning off cores or components on cores will be essential, and compiler analyses will be necessary to determine the best configuration for an application.
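The second point can be made concrete with the standard first-order model of dynamic power. This is a textbook sketch rather than an analysis from this project; the symbols ($\alpha$ for activity factor, $C$ for switched capacitance, $V$ for supply voltage, $f$ for clock frequency, $\kappa$ for the linear scaling factor per technology generation) are introduced here for illustration.

```latex
% Dynamic power of a switching circuit
P_{\text{dyn}} = \alpha \, C \, V^{2} f

% Dennard scaling: C \to C/\kappa,\; V \to V/\kappa,\; f \to \kappa f,\; \text{area } A \to A/\kappa^{2}
P' = \alpha \, \frac{C}{\kappa} \, \frac{V^{2}}{\kappa^{2}} \, \kappa f = \frac{P_{\text{dyn}}}{\kappa^{2}}
\quad\Longrightarrow\quad
\frac{P'}{A'} = \frac{P_{\text{dyn}}/\kappa^{2}}{A/\kappa^{2}} = \frac{P_{\text{dyn}}}{A}

% Without voltage scaling (V held fixed):
P' = \alpha \, \frac{C}{\kappa} \, V^{2} \, \kappa f = P_{\text{dyn}}
\quad\Longrightarrow\quad
\frac{P'}{A'} = \kappa^{2} \, \frac{P_{\text{dyn}}}{A}
```

Under this model, once supply voltage stops scaling, power density grows by roughly $\kappa^{2}$ per generation at constant frequency, which is why some cores or components must be switched off.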
While significant efforts over several decades have been directed towards compiler analyses and auto-tuning for performance optimization, much less work has addressed power issues. This project fills that gap. The auto-tuning framework will combine offline parameter space exploration, coupled with generation of different optimized code versions, with online runtime introspection and just-in-time recompilation. Just-in-time compilation will focus on localized transformations based on online power/energy information, and it must carefully weigh recompilation cost against anticipated benefits. Machine learning-based techniques will be applied to both offline and dynamic compilation to enable: (a) generation of multiple versions of the code that can be adapted to different execution scenarios without major code bloat; and (b) online adaptation once the machine environment and input data sets are fully known.
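The trade-off between recompilation cost and anticipated benefit can be illustrated with a minimal Python sketch. It is not the project's actual decision procedure: `RegionProfile`, `should_recompile`, and the `safety_margin` parameter are hypothetical names chosen for this example, and the predicted saving and remaining call count are assumed to come from some online model.

```python
from dataclasses import dataclass


@dataclass
class RegionProfile:
    """Hypothetical online measurements for one code region."""
    energy_per_call_j: float     # measured energy per invocation, in joules
    expected_saving_frac: float  # predicted fractional saving after recompiling (0..1)
    remaining_calls: int         # predicted number of invocations still to come


def should_recompile(profile: RegionProfile,
                     recompile_cost_j: float,
                     safety_margin: float = 1.5) -> bool:
    """Recompile only if the predicted saving clearly exceeds the cost.

    safety_margin (an assumption of this sketch) guards against
    over-optimistic predictions: the benefit must exceed the
    recompilation cost by this factor.
    """
    anticipated_benefit_j = (profile.energy_per_call_j
                             * profile.expected_saving_frac
                             * profile.remaining_calls)
    return anticipated_benefit_j > safety_margin * recompile_cost_j
```

With these assumptions, a region that will run many more times justifies recompilation, while a region near the end of its lifetime does not, even if the per-call saving is identical.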
RENCI focuses on acquiring accurate power and energy measurements that can be correlated with phases, loops, or lines of the original source code. This requires accurately gathering data from internal and external counters across a variety of system architectures and presenting it to the user in a uniform manner. Using the detailed energy information about different program sections, RENCI is exploring dynamic mechanisms for reducing energy consumption without affecting overall execution time.
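As a rough illustration of attributing energy to program phases, the Python sketch below reads a cumulative energy counter before and after each named phase. It is not RENCI's tooling: `EnergyMap` and `read_rapl_uj` are hypothetical names, and the default counter path assumes a Linux system that exposes Intel RAPL through the powercap sysfs interface (reading it may require elevated privileges). Any other counter source can be supplied as the `read_energy_uj` callable, which is how differing architectures could be hidden behind one uniform interface.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


def read_rapl_uj(path="/sys/class/powercap/intel-rapl:0/energy_uj"):
    """Read a cumulative energy counter in microjoules.

    Assumes Linux with the Intel RAPL powercap driver; other platforms
    need a different reader.
    """
    with open(path) as f:
        return int(f.read())


class EnergyMap:
    """Accumulates energy and wall-clock time per named program phase."""

    def __init__(self, read_energy_uj, max_energy_uj=None):
        self._read = read_energy_uj   # callable returning microjoules
        self._max = max_energy_uj     # counter range, used for wraparound
        self.joules = defaultdict(float)
        self.seconds = defaultdict(float)

    @contextmanager
    def phase(self, name):
        e0, t0 = self._read(), time.perf_counter()
        try:
            yield
        finally:
            e1, t1 = self._read(), time.perf_counter()
            delta_uj = e1 - e0
            if delta_uj < 0 and self._max is not None:
                delta_uj += self._max  # the counter wrapped around once
            self.joules[name] += delta_uj / 1e6
            self.seconds[name] += t1 - t0
```

A caller would wrap each region of interest, for example `with emap.phase("solve"): ...`, and inspect `emap.joules` afterwards. Finer granularity, such as individual loops or source lines, would need sampling or instrumentation support beyond this sketch.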
RENCI also collects information about energy variation between runs and between components during a single run. This information will be used in conjunction with the energy maps to design scheduling algorithms for both power and performance.
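One simple way to quantify such variation is the coefficient of variation (standard deviation divided by mean), sketched below in Python. The data layout (one dictionary per run, mapping a component name to its energy in joules) and the function names are assumptions made for this example, not a description of the project's data format.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean (0.0 if mean is 0)."""
    m = mean(samples)
    return pstdev(samples) / m if m else 0.0


def variation_report(runs):
    """Summarize energy variation.

    runs: list of dicts, one per run, mapping component name -> joules.
    Returns (variation of total energy between runs,
             list of between-component variation within each run).
    """
    per_run_totals = [sum(run.values()) for run in runs]
    between_runs = coefficient_of_variation(per_run_totals)
    between_components = [coefficient_of_variation(list(run.values()))
                          for run in runs]
    return between_runs, between_components
```

A scheduler could, for instance, use a high between-component value to steer work away from components that consistently draw more energy for the same task.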
- Allan Porterfield (Principal Investigator)
- Rob Fowler
- Rob Lewis
- Pacific Northwest National Laboratory
- Stanford University
- Ohio State University
- University of Delaware