Temporal QoS Management in Scientific Cloud Workflow Systems

Overview

Cloud computing can provide virtually unlimited, scalable, high-performance computing resources. Cloud workflows underlie many large-scale data- and computation-intensive e-science applications such as earthquake modelling, weather forecasting and astrophysics. During application modelling, these sophisticated processes are redesigned as cloud workflows, and at runtime, the models are executed by employing the supercomputing and data-sharing ability of the underlying cloud computing infrastructures.

Temporal QoS Management in Scientific Cloud Workflow Systems focuses on real-world scientific applications, which often must be completed while satisfying a set of temporal constraints such as milestones and deadlines. Meanwhile, activity duration, as a measurement of system performance, often needs to be monitored and controlled. This book demonstrates how to guarantee on-time completion of most, if not all, workflow applications. Offering a comprehensive framework to support the lifecycle of time-constrained workflow applications, this book will enhance the overall performance and usability of scientific cloud workflow systems.
  • Explains how to reduce the cost to detect and handle temporal violations while delivering high quality of service (QoS)
  • Offers new concepts, innovative strategies and algorithms to support large-scale sophisticated applications in the cloud
  • Improves the overall performance and usability of cloud workflow systems

Product Details

ISBN-13: 9780123972958
Publisher: Elsevier Science
Publication date: 02/20/2012
Sold by: Barnes & Noble
Format: eBook
Pages: 154
File size: 4 MB

About the Author

Xiao Liu received his PhD degree in Computer Science and Software Engineering from the Faculty of Information and Communication Technologies at Swinburne University of Technology, Melbourne, Australia in 2011. He received his Master's and Bachelor's degrees from the School of Management, Hefei University of Technology, Hefei, China, in 2007 and 2004 respectively, both in Information Management and Information Systems. He is currently a postdoctoral research fellow in the Centre of Computing and Engineering Software System at Swinburne University of Technology. His research interests include workflow management systems, scientific workflows, cloud computing, business process management and quality of service.
Jinjun Chen received his PhD degree in Computer Science and Software Engineering from Swinburne University of Technology, Melbourne, Australia in 2007. He is currently an Associate Professor in the Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia. His research interests include scientific workflow management and applications; workflow management and applications in Web service or SOC environments; workflow management and applications in grid (service)/cloud computing environments; software verification and validation in workflow systems; QoS and resource scheduling in distributed computing systems such as cloud computing; service-oriented computing; and semantics and knowledge management.
Yun Yang is currently a full professor in the School of Software and Electrical Engineering at Swinburne University of Technology, Melbourne, Australia. Prior to joining Swinburne in 1999 as an associate professor, he was a lecturer and senior lecturer at Deakin University, Australia, during 1996-1999. He has coauthored four books and published over 200 papers in journals and refereed conference proceedings. He is currently on the Editorial Board of IEEE Transactions on Cloud Computing. His current research interests include software technologies, cloud computing, p2p/grid/cloud workflow systems, and service-oriented computing.

Read an Excerpt

Temporal QoS Management in Scientific Cloud Workflow Systems


By Xiao Liu, Yun Yang, Jinjun Chen

ELSEVIER

Copyright © 2012 Elsevier Inc.
All rights reserved.

ISBN: 978-0-12-397295-8


Chapter One

Introduction

This book presents a novel probabilistic temporal framework to address the limitations of conventional temporal research and the new challenges for lifecycle support of time-constrained e-science applications in cloud workflow systems. The research reported in this book investigates how to deliver high temporal Quality of Service (QoS) from the perspective of cost-effectiveness, especially at workflow run-time. A set of new concepts, innovative strategies and algorithms is designed and developed to support temporal QoS over the whole lifecycle of scientific cloud workflow applications. Case studies, comparisons, quantitative evaluations and/or theoretical proofs are conducted for each component of the temporal framework. These demonstrate that, with our new concepts, innovative strategies and algorithms, we can significantly improve the overall temporal QoS in scientific cloud workflow systems.

This chapter introduces the background, motivations and key issues of this research. It is organised as follows. Section 1.1 gives a brief introduction to temporal QoS in cloud workflow systems. Section 1.2 presents a motivating example from a scientific application area. Section 1.3 outlines the key issues of this research. Finally, Section 1.4 presents an overview of the remainder of this book.

1.1 Temporal QoS in Scientific Cloud Workflow Systems

Cloud computing is the latest market-oriented computing paradigm. Gartner estimated worldwide cloud services revenue at $58.6 billion in 2009, forecast it to reach $68.3 billion in 2010 and projected it to reach $148.8 billion in 2014. Governments such as those of the United States, the United Kingdom, Canada, Australia and New Zealand see cloud services as an opportunity to improve business outcomes by eliminating redundancy, increasing agility and providing ICT services at a potentially lower cost. Cloud refers to a variety of services available on the Internet that deliver computing functionality on the service provider's infrastructure. A cloud is a pool of virtualised computer resources and may actually be hosted on, for example, grid or utility computing environments. Its potential advantages include the ability to scale quickly to meet changing user demands; separation of infrastructure maintenance duties from users; location of infrastructure in areas with lower costs for real estate and electricity; and sharing of peak-load capacity among a large pool of users.

Given the recent popularity of cloud computing and, more importantly, its appealing applicability to data- and computation-intensive scientific workflow applications, there is an increasing demand to investigate scientific cloud workflow systems. For example, scientific cloud workflow systems can support many complex e-science applications such as climate modelling, earthquake modelling, weather forecasting, disaster recovery simulation, astrophysics and high energy physics. At the build-time modelling stage, these scientific processes can be modelled or redesigned as scientific cloud workflow specifications (consisting of such things as workflow task definitions, process structures and QoS constraints). The specifications may contain a large number of computation- and data-intensive activities and their non-functional requirements such as QoS constraints on time and cost. Then, at the run-time execution stage, with the support of cloud workflow execution functionalities such as workflow scheduling, load balancing and temporal verification, cloud workflow instances are executed with satisfactory QoS by employing the supercomputing and data-sharing ability of the underlying cloud computing infrastructures.

One of the research issues for cloud workflow systems is how to deliver high QoS. QoS is of great significance to stakeholders, namely service users and providers. On the one hand, low QoS may result in dissatisfaction and even investment loss for service users; on the other hand, low QoS may put service providers at risk of going out of business, since it erodes the loyalty of service users. QoS requirements are usually specified as quantitative or qualitative QoS constraints in cloud workflow specifications. Generally speaking, the major workflow QoS constraints cover five dimensions, namely time, cost, fidelity, reliability and security. Among them, time, as one of the most general QoS constraints and a basic measurement of system performance, attracts many researchers and practitioners. For example, a daily weather forecast scientific cloud workflow, which deals with the collection and processing of large volumes of meteorological data, has to be finished before the broadcasting of a weather forecast programme every day at, for instance, 6:00 p.m. Clearly, if the execution time of workflow applications exceeds their temporal constraints, the consequence will usually be unacceptable to all stakeholders. To ensure on-time completion of these workflow applications, sophisticated strategies need to be designed to support high temporal QoS in scientific cloud workflow systems.

At present, the main tools for workflow temporal QoS support are temporal checkpoint selection and temporal verification, which deal with the monitoring of workflow execution against specific temporal constraints and the detection of temporal violations. However, to deliver high temporal QoS in scientific cloud workflow systems, a comprehensive temporal framework that supports the whole lifecycle of time-constrained scientific cloud workflow applications, viz. from the build-time modelling stage to the run-time execution stage, needs to be fully investigated.
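As a rough illustration of what verifying a temporal constraint at a checkpoint involves, the sketch below compares the elapsed time plus the estimated remaining time with an upper-bound constraint. The function name, its parameters, the simple additive estimate and all the numbers are assumptions made for illustration; they are not the checkpoint selection or verification strategy developed in this book.

```python
def violates_constraint(elapsed, remaining_estimates, upper_bound):
    """Return True if the projected completion time exceeds upper_bound.

    All arguments use the same time unit (hours in the calls below).
    The projection is a plain sum of estimates, which ignores variance.
    """
    projected = elapsed + sum(remaining_estimates)
    return projected > upper_bound


# Hypothetical checkpoint: two activities remain, estimated at 2 h and 1 h,
# and the workflow instance must finish within 10 h of its start.
on_track = violates_constraint(6.5, [2.0, 1.0], 10.0)   # projected 9.5 h
too_late = violates_constraint(7.5, [2.0, 1.0], 10.0)   # projected 10.5 h
print(on_track, too_late)
```

A check of this kind only detects a likely violation; deciding where to place checkpoints and how to handle a detected violation are separate problems.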

1.2 Motivating Example and Problem Analysis

In this section, we present an example from astrophysics to analyse the problems of temporal QoS support in scientific cloud workflow systems.

1.2.1 Motivating Example

The Parkes Radio Telescope (parkes.atnf.csiro.au/), one of the most famous radio telescopes in the world, serves institutions around the world. The Swinburne Astrophysics group (astronomy.swinburne.edu.au/) has been conducting pulsar searching surveys (astronomy.swin.edu.au/pulsar/) based on the observation data from the Parkes Radio Telescope. The Parkes Multibeam Pulsar Survey is one of the most successful pulsar surveys to date. The pulsar searching process is a typical scientific workflow which involves a large number of data- and computation-intensive activities. For a single searching process, the average data volume (not including the raw stream data from the telescope) is over 4 terabytes and the average execution time is about 23 hours on Swinburne's high-performance supercomputing facility (astronomy.swinburne.edu.au/supercomputing/).

For the convenience of discussion, as depicted in Figure 1.1, we illustrate only the high-level workflow structure and focus on one path out of the total of 13 parallel paths for different beams (the other parallel paths are of similar nature and denoted by cloud symbols). The average durations (normally with large variances) for high-level activities (those with sub-processes underneath) and three temporal constraints are also presented for illustration. Given the running schedules of Swinburne's supercomputers and the observing schedules for the Parkes Radio Telescope (parkes.atnf.csiro.au/observing/schedules/), an entire pulsar searching process, i.e. a workflow instance, should normally be completed in 1 day, i.e. an overall temporal constraint of 24 hours, denoted as U(SW) in Figure 1.1.

Generally speaking, there are three main steps in the pulsar searching workflow. The first step comprises data collection (about 1 hour) and data extraction and transfer (about 1.5 hours). Data from the Parkes Radio Telescope streams at a rate of 1 gigabit per second and different beam files are extracted and transferred via gigabit optical fibre. The second step is data pre-processing. The beam files contain the pulsar signals, which are dispersed by the interstellar medium. De-dispersion is used to counteract this effect. A large number of de-dispersion files are generated according to different choices of trial dispersions. In this scenario, the minimum number of dispersion trials is 1,200, which normally takes 13 hours to complete. For more dispersion trials, such as 2,400 and 3,600, either a longer execution time is required or more computing resources need to be allocated. Furthermore, for binary pulsar searching, every de-dispersion file needs to undergo an Accelerate process. Each de-dispersion file generates several accelerated de-dispersion files and the whole process takes around 1.5 hours. For instance, if we assume that the path with 1,200 de-dispersion files is chosen, then a temporal constraint of 15.25 hours, denoted as U(SW1), usually should be assigned for the data pre-processing step. The third step is pulsar seeking. Given the generated de-dispersion files, different seeking algorithms can be applied to search for pulsar candidates, such as FFT Seek, FFA Seek and Single Pulse Seek. For the instance of 1,200 de-dispersion files, it takes around 1 hour for FFT seeking to seek all the candidates. Furthermore, by comparing the candidates generated from different beam files in the same time session (around 20 minutes), some interference may be detected and some candidates may be eliminated (around 10 minutes). With the final pulsar candidates, the original beam files are retrieved to find their feature signals and fold them to XML files. The Fold to XML activity usually takes around 4 hours. Here, a temporal constraint of 5.75 hours, denoted as U(SW2), usually should be assigned. Finally, the XML files will be inspected manually (by human experts) or automatically (by software) to facilitate the decision making on whether a possible pulsar has been found or not (around 20 minutes).
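The figures quoted above can be checked with a few lines of arithmetic. The sketch below sums the mean durations of each step and compares them with U(SW1), U(SW2) and U(SW). The dictionary keys are illustrative labels rather than the authors' activity names, and it is an assumption of this sketch that U(SW1) covers De-dispersion plus Accelerate, that U(SW2) covers the seeking activities up to Fold to XML, and that the parenthesised times for comparing and eliminating candidates are activity durations.

```python
# Mean activity durations in minutes, as quoted in the example above.
# Keys are illustrative labels, not names used by the authors.
TRANSFER = {"data_collection": 60, "extraction_and_transfer": 90}
PRE_PROCESSING = {"de_dispersion_1200_trials": 780, "accelerate": 90}
SEEKING = {
    "fft_seek": 60,
    "compare_candidates": 20,
    "eliminate_candidates": 10,
    "fold_to_xml": 240,
}
DECISION = {"make_decision": 20}

# Upper-bound temporal constraints in minutes (15.25 h, 5.75 h, 24 h).
U_SW1, U_SW2, U_SW = 915, 345, 1440


def slack(constraint, *segments):
    """Return the constraint minus the summed mean durations of the segments."""
    return constraint - sum(sum(s.values()) for s in segments)


slack_sw1 = slack(U_SW1, PRE_PROCESSING)
slack_sw2 = slack(U_SW2, SEEKING)
slack_sw = slack(U_SW, TRANSFER, PRE_PROCESSING, SEEKING, DECISION)

for name, value in [("U(SW1)", slack_sw1), ("U(SW2)", slack_sw2), ("U(SW)", slack_sw)]:
    print(f"{name}: {value} minutes of slack on mean durations")
```

Under these assumptions the mean durations leave 45 minutes of slack under U(SW1), 15 minutes under U(SW2) and 70 minutes under U(SW). Since the activity durations normally have large variances, margins of this size are easily consumed at run-time.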

1.2.2 Problem Analysis

Based on the above example, we can see that to guarantee on-time completion of the entire workflow process, the following problems need to be addressed.

1. Setting temporal constraints. Given the running schedules, a global temporal constraint, i.e. the deadline U(SW) of 24 hours, needs to be assigned. Based on that, with the estimated durations of workflow activities, two other coarse-grained temporal constraints U(SW1) and U(SW2) are assigned as 15.25 and 5.75 hours, respectively. These coarse-grained temporal constraints for local workflow segments can be defined based on the experience of service providers (e.g. the estimated activity durations) or the requirements of service users (e.g. QoS requirements). However, fine-grained temporal constraints for individual workflow activities, especially those with long durations such as De-dispersion and the seeking algorithms, are also required in order to support fine-grained control of workflow executions. Meanwhile, the relationship between the high-level coarse-grained temporal constraints and their low-level fine-grained temporal constraints should be investigated so as to keep them consistent.

Therefore, a temporal constraint setting component should be able to facilitate the setting of both coarse-grained and fine-grained temporal constraints in scientific cloud workflow systems. Meanwhile, as the prerequisite, an effective forecasting strategy for scientific cloud workflow activity durations is also required to ensure the accuracy of the setting results.
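To make the consistency requirement concrete, the sketch below shows one naive way to derive fine-grained constraints: divide a coarse-grained constraint among its activities in proportion to their mean durations, so that the fine-grained constraints add up to the coarse-grained one. This proportional split and the function name `split_constraint` are assumptions made for illustration only; the sketch is a simple baseline, not the constraint setting strategy developed in this book.

```python
def split_constraint(coarse_constraint, mean_durations):
    """Split a coarse-grained constraint in proportion to mean durations.

    mean_durations maps an activity label to its mean duration; the result
    maps each label to its share of coarse_constraint in the same time unit.
    """
    total = sum(mean_durations.values())
    return {
        name: coarse_constraint * duration / total
        for name, duration in mean_durations.items()
    }


# U(SW1) = 15.25 h = 915 minutes, split over the two pre-processing activities
# (labels are illustrative; durations are the means quoted in the example).
fine = split_constraint(915, {"de_dispersion": 780, "accelerate": 90})

for name, minutes in fine.items():
    print(f"{name}: {minutes:.1f} minutes")
```

A split of this kind keeps the two levels consistent by construction, but it ignores the variance of each activity, which is one reason accurate forecasting of activity durations is a prerequisite.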

(Continues...)



Excerpted from Temporal QoS Management in Scientific Cloud Workflow Systems by Xiao Liu, Yun Yang, Jinjun Chen. Copyright © 2012 by Elsevier Inc. Excerpted by permission of ELSEVIER. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Table of Contents

1. Introduction
2. Literature Review and Problem Analysis
3. A Scientific Cloud Workflow System
4. Novel Probabilistic Temporal Framework
5. Forecasting Scientific Cloud Workflow Activity Duration Intervals
6. Temporal Constraint Setting
7. Temporal Checkpoint Selection and Temporal Verification
8. Temporal Violation Handling Point Selection
9. Temporal Violation Handling
10. Conclusions and Contribution
Bibliography
Appendix: Notation Index

What People are Saying About This

From the Publisher

Offers a comprehensive framework to support the lifecycle of time-constrained workflow applications
