TR-15-01 Adapting Scientific Workflows on Networked Clouds Using Proactive Introspection

Recent advances in cloud technologies and on-demand network circuits have created an unprecedented opportunity to run complex data-intensive scientific applications on dynamic, networked cloud infrastructure. However, there is a lack of tools for supporting high-level applications like scientific workflows on dynamically provisioned, virtualized, networked IaaS (NIaaS) systems. In this paper, we propose an architectural framework consisting of application-aware and application-independent controllers that provision and adapt complex scientific workflows on NIaaS systems. The application-independent controller makes NIaaS systems easier for higher-level applications to use by closing the gap between application abstractions and resource provisioning constructs. We also present our approach to predicting dynamic resource requirements for workflows using an application-aware controller that proactively evaluates alternative candidate resource allotments using workflow introspection. We show how these high-level resource requirements can be automatically transformed into low-level NIaaS operations to actuate infrastructure adaptation. The results of our evaluations show that we can make fairly accurate predictions, and that the interplay of prediction and adaptation can balance performance and utilization for a representative data-intensive workflow.