Choosing the right NoSQL database for the job: a quality attribute evaluation

Journal of Big Data, Aug 2015

For over forty years, relational databases have been the leading model for data storage, retrieval and management. However, due to increasing needs for scalability and performance, alternative systems have emerged, namely NoSQL technology. The rising interest in NoSQL technology, as well as the growth in the number of use case scenarios over the last few years, has resulted in an increasing number of evaluations and comparisons among competing NoSQL technologies. While most research work focuses on performance evaluation using standard benchmarks, it is important to note that the architecture of real-world systems is not driven by performance requirements alone, but must also address many other quality attribute requirements. Software quality attributes form the basis from which software engineers and architects develop software and make design decisions. Yet, there has been no quality-attribute-focused survey or classification of NoSQL databases in which databases are compared with regard to their suitability for the quality attributes common in the design of enterprise systems. To fill this gap, and to aid software engineers and architects, in this article we survey NoSQL engines and create a concise and up-to-date comparison of them, identifying their most beneficial use case scenarios from the software engineer's point of view and the quality attributes to which each of them is most suited.

https://link.springer.com/content/pdf/10.1186%2Fs40537-015-0025-0.pdf

João Ricardo Lourenço, Bruno Cabral, Paulo Carreiro, Marco Vieira, Jorge Bernardino

Keywords: NoSQL databases; Key-value; Document store; Columnar; Graph; Software engineering; Quality attributes; Software architecture

Introduction

Relational databases have been the stronghold of modern computing applications for decades. ACID properties (Atomicity, Consistency, Isolation, Durability) made relational databases the solution for almost all data management systems.
However, the need to handle data in web-scale systems [1–3], in particular Big Data systems [4], has led to the creation of numerous NoSQL databases. The term NoSQL was first coined in 1998 to name a relational database that did not have a SQL (Structured Query Language) interface [5]. It was then brought back in 2009 to name an event that highlighted new non-relational databases, such as BigTable [3] and Dynamo [6], and has since been used without an “official” definition. Generally speaking, a NoSQL database is one that uses a different approach to data storage and access when compared with relational database management systems [7, 8]. NoSQL databases give up support for ACID transactions as a trade-off for increased availability and scalability [1, 7]. Brewer created the term BASE for these systems: they are Basically Available, have a Soft state (during which they are not yet consistent), and are Eventually consistent, as opposed to ACID systems [9]. This BASE model forfeits the essential ACID properties of consistency and isolation in order to favor “availability, graceful degradation, and performance” [9]. While the term originally stood for “No SQL”, it has recently been restated as “Not Only SQL” [1, 7, 10] to highlight that these systems rarely fully drop the relational model.
Thus, in spite of being a recurrent theme in the literature, NoSQL is a very broad term, encompassing very distinct database systems. There are hundreds of readily available NoSQL databases, and each has different use case scenarios [11]. They are usually divided into four categories [2, 7, 12], according to their data model and storage: Key-Value Stores, Document Stores, Column Stores and Graph databases. This classification reflects the fact that each kind of database offers different solutions for specific contexts. The “one size fits all” approach of relational databases no longer applies.
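As a rough illustration of how the four data models differ, the sketch below represents the same two users in each of them using plain Python structures. It is not taken from the article and does not use any real database API; all names, keys and sample values are hypothetical and chosen only for illustration.

```python
# Illustrative only: plain Python structures standing in for the four
# NoSQL data models. All names and sample values are hypothetical.

# Key-value store: an opaque value looked up by a unique key.
key_value = {"user:42": b'{"name": "Ana"}'}

# Document store: self-describing, possibly nested documents, queryable by field.
documents = [
    {"_id": 42, "name": "Ana", "tags": ["admin"], "address": {"city": "Coimbra"}},
    {"_id": 43, "name": "Rui", "tags": []},
]

# Column store: rows keyed by id, columns grouped into families; rows may be sparse.
column_families = {
    "row42": {"profile": {"name": "Ana"}, "activity": {"last_login": "2015-08-01"}},
    "row43": {"profile": {"name": "Rui"}},
}

# Graph database: nodes plus labelled edges between them.
nodes = {42: {"name": "Ana"}, 43: {"name": "Rui"}}
edges = [(42, "FOLLOWS", 43)]


def find_documents(docs, field, value):
    """Return documents whose top-level field equals value."""
    return [d for d in docs if d.get(field) == value]


def neighbours(edge_list, source, label):
    """Return ids of nodes reachable from source over edges with the given label."""
    return [dst for src, lbl, dst in edge_list if src == source and lbl == label]
```

In this sketch the key-value entry can only be fetched by its key, the documents can be filtered by field with `find_documents`, the column-family rows may omit whole families, and relationships are traversed with `neighbours`. Real systems in each category add distribution, persistence and query features that this sketch leaves out.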
There has been extensive research comparing relational and non-relational databases in terms of their performance for different applications. However, when developing enterprise systems, performance is only one of many quality attributes to be considered. Unfortunately, there has not yet been a comprehensive assessment of NoSQL technology with respect to software quality attributes. The goal of this article is to fill this gap by clearly identifying which NoSQL databases best promote each quality attribute, thus becoming a reference for software engineers and architects. This article is a revised and extended version of our WorldCIST 2015 paper [13]. It improves and complements the former in the following aspects:
• Three more quality attributes (Consistency, Robustness and Maintainability) were evaluated.
• A new section describing the evaluated NoSQL databases was introduced.
• The state of the art was extended to provide more up-to-date and thorough information.
• All of the previously evaluated qual (...truncated)


João Ricardo Lourenço, Bruno Cabral, Paulo Carreiro, Marco Vieira, Jorge Bernardino. Choosing the right NoSQL database for the job: a quality attribute evaluation, Journal of Big Data, 2015, pp. 18, Volume 2, Issue 1, DOI: 10.1186/s40537-015-0025-0