The success of high-volume websites such as Google, Amazon, Twitter and Facebook in using NoSQL to achieve massive parallelism, unlimited scalability and high availability has fueled interest in it. Consistency, the basic feature of relational databases, is no longer the key requirement. Trading off consistency enables higher levels of scalability and availability, and the new generation of websites is willing to make that trade. Amazon claims that just an extra one tenth of a second on its response times will cost it 1% in sales. Google said it noticed that just a half-second increase in latency caused traffic to drop by a fifth.
The trend can be understood from what Amazon's CTO says: “Each node in a system should be able to make decisions purely based on local state. If you need to do something under high load with failures occurring and you need to reach agreement, you’re lost. If you’re concerned about scalability, any algorithm that forces you to run agreement will eventually become your bottleneck. Take that as a given.”
To be fault tolerant and to serve millions of concurrent users, the data is replicated in multiple copies. This raises the issue of how to keep those copies consistent. As the CAP theorem states, a distributed system cannot guarantee consistency, availability and partition tolerance all at once, so the more you relax your consistency requirement, the more you can gain in availability and partition tolerance. At a broad level, consistency levels can be classified as:
Relational databases provide the ACID properties (Atomicity, Consistency, Isolation and Durability). ACID is pessimistic and forces consistency at the end of every operation. ACID, though seemingly indispensable, is incompatible with availability and performance in very large systems, so the distributed data stores of NoSQL do not attempt to provide ACID guarantees. Instead they adopt an alternative architectural approach known as BASE (Basically Available, Soft state, Eventually consistent), which is the logical opposite of ACID.
Most NoSQL databases resort to eventual consistency, and there are variations in how it is achieved:
An excellent presentation on consistency and replication (from the University of Pennsylvania), with examples, is available at http://www.cis.upenn.edu/~lee/07cis505/Lec/lec-ch7b-replication-v3.pdf.
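To make the idea of eventual consistency among replicated copies concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular NoSQL store works: the `Replica` class, the `synchronize` helper and the caller-supplied logical timestamps are all hypothetical, and conflicts are resolved with a simple "last write wins" rule, which is only one of several possible strategies. Each replica accepts writes using only its local state, so reads may be stale until the replicas exchange updates.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data; serves reads and writes from local state only."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Accept the write locally, without coordinating with other replicas.
        # Last write wins: keep the entry with the larger timestamp.
        # (Assumption: timestamps for different writes never tie.)
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        # May return a stale value if a newer write has not arrived yet.
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def merge_from(self, other):
        # Anti-entropy step: adopt any entry that is newer on the other replica.
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)


def synchronize(replicas):
    # Every replica pulls from every other one; afterwards all copies agree.
    for a in replicas:
        for b in replicas:
            if a is not b:
                a.merge_from(b)


replicas = [Replica("r1"), Replica("r2"), Replica("r3")]
replicas[0].write("cart", ["book"], timestamp=1)
replicas[1].write("cart", ["book", "pen"], timestamp=2)

# Before synchronization the three copies disagree.
stale_reads = [r.read("cart") for r in replicas]

synchronize(replicas)

# After synchronization every copy returns the newest value.
converged_reads = [r.read("cart") for r in replicas]
```

Before `synchronize` runs, the three replicas give three different answers for the same key; afterwards they all return the most recent write. That window of disagreement is the price paid for letting every node respond without waiting for agreement.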
The key to NoSQL adoption is a change of mindset: accepting a trade-off in consistency to achieve availability and scalability, accepting that approximate answers are “okay”, and users being willing to take control.