Choosing the right NoSQL database for the job: a quality attribute evaluation
Lourenço et al. Journal of Big Data
João Ricardo Lourenço
Bruno Cabral
Paulo Carreiro
Marco Vieira
Jorge Bernardino
For over forty years, relational databases have been the leading model for data storage, retrieval and management. However, due to increasing needs for scalability and performance, alternative systems have emerged, namely NoSQL technology. The rising interest in NoSQL technology, as well as the growth in the number of use case scenarios, has over the last few years resulted in an increasing number of evaluations and comparisons among competing NoSQL technologies. While most research focuses on performance evaluation using standard benchmarks, the architecture of real-world systems is driven not only by performance requirements but must also account for many other quality attribute requirements. Software quality attributes form the basis from which software engineers and architects develop software and make design decisions. Yet there has been no quality-attribute-focused survey or classification of NoSQL databases in which databases are compared with regard to their suitability for quality attributes common in the design of enterprise systems. To fill this gap, and to aid software engineers and architects, in this article we survey NoSQL engines and create a concise and up-to-date comparison of them, identifying their most beneficial use case scenarios from the software engineer's point of view and the quality attributes that each of them is most suited to.
NoSQL databases; Key-value; Document store; Columnar; Graph; Software engineering; Quality attributes; Software architecture
Introduction
Relational databases have been the stronghold of modern computing applications for
decades. ACID properties (Atomicity, Consistency, Isolation, Durability) made relational
databases the solution for almost all data management systems. However, the need to
handle data in web-scale systems [1–3], in particular Big Data systems [4], has led to the
creation of numerous NoSQL databases.
The term NoSQL was first coined in 1998 to name a relational database that did not
have a SQL (Structured Query Language) interface [5]. It was then brought back in 2009
for naming an event which highlighted new non-relational databases, such as BigTable [3]
and Dynamo [6], and has since been used without an “official” definition. Generally
speaking, a NoSQL database is one that uses a different approach to data storage and access
when compared with relational database management systems [7, 8]. NoSQL databases
give up support for ACID transactions in exchange for increased availability and
scalability [1, 7]. Brewer coined the term BASE for these systems: they are Basically Available,
have a Soft state (during which they are not yet consistent), and are Eventually consistent,
as opposed to ACID systems [9]. This BASE model forfeits the essential ACID properties
of consistency and isolation in order to favor “availability, graceful degradation, and
performance” [9]. While the term NoSQL originally stood for “No SQL”, it has recently been
restated as “Not Only SQL” [1, 7, 10] to highlight that these systems rarely fully drop the
relational model. Thus, in spite of being a recurrent theme in the literature, NoSQL is a
very broad term, encompassing very distinct database systems.
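To make the BASE properties concrete, the following is a minimal sketch of two replicas that accept writes independently and converge later. It is a toy illustration only: the `Replica` class, the `anti_entropy` function, and the integer version numbers are assumptions made for this example and do not describe the mechanism of any particular NoSQL database.

```python
class Replica:
    """A toy replica holding key -> (version, value) pairs (hypothetical)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Keep the write only if it is newer than what this replica holds.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        # Always answers, even if the local copy is stale (basically available).
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(replicas):
    """Propagate the newest version of every key to all replicas."""
    for source in replicas:
        for key, (version, value) in list(source.data.items()):
            for target in replicas:
                target.write(key, value, version)


a, b = Replica(), Replica()
a.write("cart", ["book"], version=1)  # accepted by one replica only
stale = b.read("cart")                # soft state: b has not seen the write yet
anti_entropy([a, b])                  # background synchronization
converged = b.read("cart")            # eventually consistent: replicas agree
```

Between the write and the synchronization step, a reader of `b` observes the old state; an ACID system would not expose that window, which is the trade-off the BASE model accepts.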
There are hundreds of readily available NoSQL databases, and each has different use
case scenarios [11]. They are usually divided into four categories [2, 7, 12], according to their
data model and storage: Key-Value Stores, Document Stores, Column Stores and Graph
databases. This classification is due to the fact that each kind of database offers different
solutions for specific contexts. The “one size fits all” approach of relational databases no
longer applies.
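As a rough illustration of how the four data models differ, the sketch below represents the same made-up user record in each of them using plain Python structures. All names and sample values are hypothetical, the column layout assumes the column-family style popularized by BigTable-like systems, and real databases add persistence, indexing and distribution on top of these shapes.

```python
# Key-value store: an opaque value addressed by a single key.
key_value = {"user:42": b'{"name": "Ana", "city": "Lisbon"}'}

# Document store: structured documents whose fields can be queried.
documents = [{"_id": 42, "name": "Ana", "city": "Lisbon"}]

# Column store: a row key maps to column families holding named columns.
columns = {"42": {"profile": {"name": "Ana", "city": "Lisbon"}}}

# Graph database: nodes plus explicit, typed relationships between them.
nodes = {42: {"name": "Ana"}, 7: {"name": "Rui"}}
edges = [(42, "FOLLOWS", 7)]


def followers_of(node_id):
    """Return the ids of nodes that have a FOLLOWS edge to node_id."""
    return [src for src, rel, dst in edges if rel == "FOLLOWS" and dst == node_id]


# A field-level query is natural on documents but not on opaque key-value data.
in_lisbon = [d["name"] for d in documents if d["city"] == "Lisbon"]
```

The key-value form can only be fetched by its key, whereas the document and graph forms expose structure (fields, relationships) that the database itself can query.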
There has been extensive research in the comparison of relational and non-relational
databases in terms of their performance for different applications. However, when
developing enterprise systems, performance is only one of many quality attributes to be
considered. Unfortunately, there has not yet been a comprehensive assessment of NoSQL
technology with respect to software quality attributes. The goal of this article is to fill
this gap by clearly identifying which NoSQL databases better promote each of these quality
attributes, thus becoming a reference for software engineers and architects.
This article is a revised and extended version of our WorldCIST 2015 paper [13]. It
improves on and complements that paper in the following aspects:
• Three more quality attributes (Consistency, Robustness and Maintainability) were
evaluated.
• A new section describing the evaluated NoSQL databases was introduced.
• The state of the art was extended to provide more up-to-date and thorough
information.
• All of the previously evaluated qual (...truncated)