TR-08-02 Stateful grid resource selection for related asynchronous tasks

Howard M. Lander, Robert J. Fowler, Lavanya Ramakrishnan, and Steven R. Thorpe. Stateful grid resource selection for related asynchronous tasks. Technical Report TR-08-02, RENCI, North Carolina, April 2008.

In today’s grid deployments, resource selection is based on the prior knowledge of the performance characteristics of the application on a particular resource and on real-time monitoring status of the resource such as load on the system, network bandwidth, etc. Any lag between a resource selection decision and the time the job appears in the system’s monitoring facility will cause subsequent decisions to be based on incorrect information. If two or more jobs arrive within this hysteresis window, the incorrect assessment of system state can have negative consequences on job response time and system throughput. In this paper we describe a stateful resource selection protocol we designed to mitigate this problem for a real time storm surge modeling project. We present results from real experiments on a regional grid. We use emulation to compare and study the effect of our protocol under varying load conditions. Based on our evaluation we argue that the enhanced protocol should be made available as a globally-aware grid resource selection service.

@TECHREPORT{LFRT2008:tr0802,
AUTHOR = {Howard M. Lander and Robert J. Fowler and Lavanya
Ramakrishnan and Steven R. Thorpe},
TITLE = {Stateful Grid Resource Selection for Related Asynchronous Tasks},
INSTITUTION = {RENCI},
YEAR = {2008},
NUMBER = {TR-08-02},
ADDRESS = {North Carolina},
MONTH = {April},
OPTNOTE = {also submitted for publication},
ABSTRACT = {In today’s grid deployments, resource selection is based on the prior knowledge of the performance characteristics of the application on a particular resource and on real-time monitoring status of the resource such as load on the system, network bandwidth, etc. Any lag between a resource selection decision and the time the job appears in the system’s monitoring facility will cause subsequent decisions to be based on incorrect information. If two or more jobs arrive within this hysteresis window, the incorrect assessment of system state can have negative consequences on job response time and system throughput. In this paper we describe a stateful resource selection protocol we designed to mitigate this problem for a real time storm surge modeling project. We present results from real experiments on a regional grid. We use emulation to compare and study the effect of our protocol under varying load conditions. Based on our evaluation we argue that the enhanced protocol should be made available as a globally-aware grid resource selection service.},
URL = {http://www.renci.org/publications/techreports/TR0802.pdf}
}