Evaluating cloud database migration options using workload models

Journal of Cloud Computing, Mar 2018

A key challenge in porting enterprise software systems to the cloud is the migration of their database. Choosing a cloud provider and service option (e.g., a database-as-a-service or a manually configured set of virtual machines) typically requires the estimation of the cost and migration duration for each considered option. Many organisations also require this information for budgeting and planning purposes. Existing cloud migration research focuses on the software components, and therefore does not address this need. We introduce a two-stage approach which accurately estimates the migration cost, migration duration and cloud running costs of relational databases. The first stage of our approach obtains workload and structure models of the database to be migrated from database logs and the database schema. The second stage performs a discrete-event simulation using these models to obtain the cost and duration estimates. We implemented software tools that automate both stages of our approach. An extensive evaluation compares the estimates from our approach against results from real-world cloud database migrations.

Martyn Ellison, Radu Calinescu, Richard F. Paige
Department of Computer Science, University of York, Deramore Lane, York, UK
Journal of Cloud Computing: Advances, Systems and Applications

Keywords: Database modelling; Cloud migration; Enterprise systems; Model-driven engineering

Introduction

The benefits of hosting an enterprise system on the cloud, rather than on physical on-premise servers, are well understood and documented [1]. Some organisations have been using clouds for over a decade and are considering switching provider [2], while others are planning an initial migration [3]. In either case, the most challenging component to migrate is often the database, due to the size and importance of the data it contains.
However, existing cloud migration work focuses on the software components and gives minimal consideration to data. For instance, the ARTIST [4] and REMICS [5] cloud migration methodologies refer to the database but do not address any database-specific challenges. Similarly, cloud deployment simulators like CDOSim [6] focus only on compute resources. The limitations of these existing cloud migration methodologies are described further in the “Related work” section.

Migrating large relational databases from physical infrastructure into the cloud presents many significant challenges, e.g., managing system downtime, choosing suitable cloud instances, and choosing a cloud provider. The database could be deployed on a database-as-a-service offered by one of several public cloud providers, or installed and configured on one or more virtual machines. With either option, selecting the appropriate cloud resources requires knowledge of the database workload and size. The infrastructure of the source database may impact the migration duration; if it has limited available capacity or bandwidth, then it will take longer to extract the data. An organisation may wish to upgrade the existing database hardware to speed up migration, or schedule downtime to migrate the database while it is idle.

In this work, we assist with this decision-making process via a tool-supported approach for evaluating cloud database migration options. Our approach has two stages (database workload and structure modelling, followed by database migration simulation) and estimates migration duration, migration costs, and future cloud running costs. We assume the source and target databases have an identical schema, type (e.g., relational or NoSQL), vendor (e.g., Oracle or MySQL), and software version. Changing any of these parameters is a complex activity, which organisations tend to perform separately (as discussed in the “Approach overview” section).
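To make the effect of source capacity and workload on migration duration concrete, the following is a minimal next-event simulation sketch. It is not the authors' tool: the function name, the hourly load profile standing in for a workload model, the list of table sizes standing in for a structure model, and the flat per-hour and per-GB prices standing in for a cloud cost model are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MigrationEstimate:
    duration_hours: float
    cost: float


def simulate_migration(table_sizes_gb, link_gb_per_hour, hourly_load,
                       instance_cost_per_hour, transfer_cost_per_gb,
                       start_hour=0):
    """Estimate duration and cost of copying tables one after another.

    hourly_load[h] is the assumed fraction of source capacity consumed by
    the live workload during hour-of-day h; the migration gets the rest.
    start_hour is the (integer) hour of day at which the migration begins.
    """
    if len(hourly_load) != 24:
        raise ValueError("hourly_load needs one entry per hour of the day")
    if any(not 0.0 <= load < 1.0 for load in hourly_load):
        raise ValueError("each load must be in [0, 1)")
    if link_gb_per_hour <= 0:
        raise ValueError("link_gb_per_hour must be positive")

    clock = 0.0  # simulated hours since the migration started
    for size_gb in table_sizes_gb:
        remaining = float(size_gb)
        while remaining > 0.0:
            now = start_hour + clock
            rate = link_gb_per_hour * (1.0 - hourly_load[int(now) % 24])
            # Two candidate next events: the workload level changes at the
            # next hour boundary, or the current table finishes copying.
            load_change_at = math.floor(now) + 1 - start_hour
            table_done_at = clock + remaining / rate
            if table_done_at <= load_change_at:
                clock, remaining = table_done_at, 0.0
            else:
                remaining -= rate * (load_change_at - clock)
                clock = load_change_at

    # Hypothetical cost model: target instance billed per hour of migration,
    # plus a flat price per GB transferred.
    cost = (clock * instance_cost_per_hour
            + sum(table_sizes_gb) * transfer_cost_per_gb)
    return MigrationEstimate(clock, cost)


if __name__ == "__main__":
    # Assumed profile: light load overnight, heavy load from 08:00 to 18:00.
    profile = [0.1] * 8 + [0.6] * 10 + [0.1] * 6
    for start in (9, 22):
        est = simulate_migration([120, 40, 15], 50, profile,
                                 instance_cost_per_hour=0.40,
                                 transfer_cost_per_gb=0.02,
                                 start_hour=start)
        print(f"start {start:02d}:00 -> {est.duration_hours:.2f} h, "
              f"cost {est.cost:.2f}")
```

Running the sketch with two different start hours shows how scheduling the migration during a quiet period shortens the estimated duration. A realistic simulator would also need to model concurrent transfers, import overhead on the target, and provider-specific pricing.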
Given logs and a schema of a candidate database, the database modelling stage generates: (i) a workload model conforming to the Structured Metrics Metamodel (SMM) [7], and (ii) a structure model conforming to the Knowledge Discovery Metamodel (KDM) [8]. The second stage of the approach uses these models, alongside a cost model of the target cloud platform, to perform a discrete-event simulation of the database migration and deployment. To ease the adoption of the new approach, we implemented two software tools that automate the main tasks.

We carried out an extensive evaluation of the approach using several open-source enterprise applications, and a closed-source system from our industrial project partner Science Warehouse [9]. In particular, our database modelling method and tool were applied to 15 systems (including Apache OFBiz and MediaWiki) to obtain workload and structure models. In each case, the system was installed on a server and configured with an Oracle or MySQL database. The experimental results (detailed later in the paper) show that our tool can extract models from a (...truncated)


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1186%2Fs13677-018-0108-5.pdf

Martyn Ellison, Radu Calinescu, Richard F. Paige. Evaluating cloud database migration options using workload models, Journal of Cloud Computing, 2018, pp. 6, Volume 7, Issue 1, DOI: 10.1186/s13677-018-0108-5