More data than ever before is being created, distributed and harnessed to make business decisions. To give you an idea of the scale, IBM said in 2013 that 90% of the world’s data had been created in the previous two years alone.
In this blog post, we will look at the evolution of databases and at some of the reasons why relational databases are becoming less common while NoSQL databases grow in popularity. We’ll cover the advantages and disadvantages of relational and NoSQL databases, the differences between the two, and the four types of NoSQL database in use.
The boom in unstructured data over the last few years is one of the main reasons relational databases are no longer sufficient for many companies’ needs. One cause of this boom is easy access to the Internet around the globe. Another is the ubiquity of social media, where people share what is happening to them as it happens. With more than a fifth of the population behaving this way, storing and fetching data quickly becomes hugely important, and so does having enough storage for many kinds of data: audio, video, images and text.
Traditionally, we have depended on relational database management systems (RDBMS) to handle storage requirements in the IT world. Enormous amounts of data are created every day by web and business applications, and a large share of it is handled by relational databases. Among its other benefits, the relational model is well suited to client-server programming, and today it is the predominant technology for storing structured data in web and business applications. Classical relational databases follow the ACID properties. That is, a database transaction must be Atomic (either all of its operations take effect or none do), Consistent (it takes the database from one valid state to another), Isolated (concurrent transactions do not see each other’s intermediate results) and Durable (once committed, its changes survive a crash or power loss).
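Atomicity is probably the easiest of the ACID properties to see in code. The following is a minimal sketch using Python’s built-in sqlite3 module. The accounts table, the names in it and the transfer helper are made up for illustration. The second transfer would overdraw an account, so the database rejects it and undoes the credit that had already been applied inside the same transaction.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing transaction."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # The CHECK constraint makes this statement fail on an overdraft.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

ok = transfer(conn, "alice", "bob", 30)       # succeeds and is committed
failed = transfer(conn, "alice", "bob", 500)  # overdraft, rolled back as a whole
print(ok, failed, balances(conn))
```

After the failed transfer, the balances are exactly what they were before it started: the credit to bob does not survive on its own.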
Apart from these ACID properties, a few basic characteristics made relational DBMS popular. Some of them are:
Shortcomings of RDBMS
An RDBMS can store and manipulate structured data efficiently, but the velocity of the data generated over the Internet is growing exponentially, and its nature is changing. As we often see in areas like social media, the data has no fixed structure. This makes it unavoidable to handle unstructured data, which is non-relational and schema-less in nature. For an RDBMS it becomes a real challenge to provide cost-effective and fast Create, Read, Update and Delete (CRUD) operations, because it has to deal with the overhead of joins and of maintaining relationships among the data.
Therefore, a new mechanism is required to deal with such data easily and efficiently. This is where NoSQL comes into the picture: it handles unstructured big data efficiently, with the aim of providing maximum business value and customer satisfaction.
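To make the schema problem concrete, here is a small sketch, again using Python’s sqlite3 module. The posts table and its fields are invented for this example. A relational table rejects a record carrying a field the schema never anticipated, while a plain list of dictionaries, standing in for a schema-less document store, accepts records of any shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
conn.execute("INSERT INTO posts (author, body) VALUES (?, ?)", ("ann", "hello"))

# A post with an unexpected field cannot be stored without altering the schema.
try:
    conn.execute(
        "INSERT INTO posts (author, body, video_url) VALUES (?, ?, ?)",
        ("bob", "my clip", "clip.mp4"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Schema-less alternative: every record is a self-describing document.
documents = [
    {"author": "ann", "body": "hello"},
    {"author": "bob", "body": "my clip", "video_url": "clip.mp4"},
    {"author": "cy", "image": "cat.png", "tags": ["pets", "cute"]},
]

# Queries have to cope with fields that may or may not be present.
with_media = [d["author"] for d in documents if "video_url" in d or "image" in d]
print(rejected, with_media)
```

The flexibility has a price: the application, not the database, now has to deal with missing or unexpected fields.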
NoSQL is not a campaign against the SQL language. NoSQL stands for “Not Only SQL.” It gives developers more options for data persistence beyond the classic relational approach.
NoSQL refers to a broad class of non-relational databases that differ from classical RDBMS in some significant respects, most notably in that they do not use SQL as their primary query language and instead provide access through application programming interfaces (APIs).
The reasons behind such a big switch, in other words the advantages of NoSQL, are the following:
While RDBMS follow the ACID properties, NoSQL databases are “BASE” systems. The BASE acronym was defined by Eric Brewer, who is also known for formulating the CAP theorem, on whose properties BASE systems build.
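The difference between querying with SQL and calling an API can be sketched in a few lines of Python. The KeyValueStore class below is a hypothetical in-memory stand-in written for this post, not the client of any real product, and the users table is likewise invented.

```python
import sqlite3


class KeyValueStore:
    """Hypothetical in-memory key-value store, for illustration only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


# Relational style: describe the wanted rows in SQL and let the engine find them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("u1", "Ann"))
sql_name = conn.execute(
    "SELECT name FROM users WHERE id = ?", ("u1",)
).fetchone()[0]

# API style: call methods on the store; no query language is involved.
store = KeyValueStore()
store.put("user:u1", {"name": "Ann"})
api_name = store.get("user:u1")["name"]

print(sql_name, api_name)
```

Both lookups return the same name. What changes is who does the work: the SQL engine plans the query, whereas with the API the application decides which key to ask for.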
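The CAP theorem states that a distributed computer system cannot guarantee all of the following three properties at the same time: Consistency (every read sees the most recent write), Availability (every request to a working node receives a response) and Partition tolerance (the system keeps operating when the network drops or delays messages between nodes).
Brewer originally described this impossibility result as forcing a choice of “two out of the three” CAP properties, leaving three viable design options: CP, AP and CA. A CP system stays consistent across a partition but may refuse some requests, an AP system keeps answering but may return stale data, and a CA system is consistent and available only as long as no partition occurs.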
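A BASE system gives up strong consistency in exchange for greater Availability and Partition tolerance. The acronym stands for Basically Available (the system answers requests even during partial failures), Soft state (stored state may change over time as replicas synchronize, even without new input) and Eventual consistency (if no new updates arrive, all replicas eventually converge to the same value).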
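Eventual consistency, the “E” in BASE, can be illustrated with a toy model. The Replica class and sync function below are assumptions made for this sketch: they use simple version numbers with a last-write-wins rule, which is only one of several ways real systems reconcile replicas.

```python
class Replica:
    """Toy replica mapping key -> (version, value). Not a real database."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        # Last-write-wins: keep whichever entry has the higher version number.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Exchange entries in both directions so the two replicas converge."""
    for src, dst in ((a, b), (b, a)):
        for key, (version, value) in list(src.data.items()):
            dst.write(key, value, version)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("status", "at the beach", version=1)
sync(r1, r2)

# During a partition only r1 receives the update; r2 still answers reads.
r1.write("status", "back home", version=2)
stale = r2.read("status")  # old value: available, but not yet consistent

sync(r1, r2)               # the partition heals and the replicas exchange state
fresh = r2.read("status")
print(stale, "->", fresh)
```

Between the write and the sync, the two replicas disagree, and that window is exactly what a BASE system tolerates in order to keep answering requests.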
Types of NoSQL
Four main types of NoSQL database are in common use today: key-value stores, document stores, column-family (wide-column) stores and graph databases.
Now we can easily differentiate between NoSQL and RDBMS:
A very nice video by Martin Fowler: