Cost and fault-tolerant aware resource management for scientific workflows using hybrid instances on clouds

Authors:

Raghavendra S.,

Venugopal K. R.

Multimedia Tools and Applications, Volume 77, Issue 8

Pages 10171 - 10193

https://doi.org/10.1007/s11042-017-5304-7

Published: 01 April 2018

Abstract

Cloud service providers offer computing resources at a reasonable price under a pay-per-use model. They have also introduced pricing models such as spot, blockspot and spotfleet instances, which are cost-effective but require users to bid, balancing reliability against monetary cost. Consequently, Scientific Workflows (SWf), which are used to model applications involving high throughput, heavy computation and complex large-scale data analysis, are increasingly adopting these computing resources. However, spot instances are terminated when the market spot price exceeds the user's bid price. Moreover, failures are inevitable in such large distributed systems, which makes designing a fault-tolerant scheduling algorithm for SWf challenging. This paper presents an efficient, low-cost and fault-tolerant scheduling algorithm and a bidding strategy to minimize the volatility and cost of resource provisioning for SWf. The proposed algorithm uses spot and blockspot instances together as hybrid instances, compared against on-demand instances, to reduce execution cost and tolerate faults while meeting the SWf deadline. An empirical simulation study demonstrates the promising potential of the proposed scheduling algorithm, which remains robust under short deadlines with minimal makespan and cost.
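The termination rule mentioned in the abstract (a spot instance is lost once the market spot price exceeds the user's bid) can be illustrated with a small sketch. This is not the paper's scheduling algorithm or bidding strategy. The function names, the hourly price trace and the billing rule (pay the market price for each completed hour, nothing for the interrupted hour) are all hypothetical simplifications chosen for illustration.

```python
from typing import Optional, Sequence


def first_termination_hour(bid: float, spot_prices: Sequence[float]) -> Optional[int]:
    """Return the first hour whose market spot price exceeds the bid, or None."""
    for hour, price in enumerate(spot_prices):
        if price > bid:  # termination rule from the abstract
            return hour
    return None


def spot_cost_until_termination(bid: float, spot_prices: Sequence[float]) -> float:
    """Sum the hourly prices paid before termination.

    Assumption (not from the paper): the user pays the market price for
    every completed hour and is not charged for the interrupted hour.
    """
    end = first_termination_hour(bid, spot_prices)
    paid = spot_prices if end is None else spot_prices[:end]
    return sum(paid)


# Hypothetical hourly price trace; the numbers are made up.
prices = [0.03, 0.04, 0.06, 0.05]

low_bid_end = first_termination_hour(0.05, prices)   # price 0.06 exceeds 0.05 at hour 2
high_bid_end = first_termination_hour(0.10, prices)  # never exceeded, so None
low_bid_cost = spot_cost_until_termination(0.05, prices)
high_bid_cost = spot_cost_until_termination(0.10, prices)
```

With the lower bid the instance is lost after two hours, so any unfinished task would have to be rerun or restored elsewhere. The higher bid survives the whole trace but risks paying more per hour. This is the reliability-versus-cost trade-off the abstract describes.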

Cited By

Saxena D, Singh A (2022) OFP-TM: an online VM failure prediction and tolerance model towards high availability of cloud computing environments. The Journal of Supercomputing 78(6):8003-8024. doi:10.1007/s11227-021-04235-z. Online publication date: 1-Apr-2022
https://dl.acm.org/doi/10.1007/s11227-021-04235-z

Published In

Multimedia Tools and Applications, Volume 77, Issue 8

Apr 2018

1145 pages

ISSN:1380-7501

Copyright © 2018 Springer Science+Business Media, LLC, part of Springer Nature.

Publisher

Kluwer Academic Publishers

United States
