Eventual consistency gives scalability and availability
Success of high-volume web sites such as Google, Amazon, Twitter and Facebook in using NoSQL to achieve massive parallelism, unlimited scalability and high availability has fueled the interest. Consistency – the basic feature of relational database – is no longer the key. Trading off on consistency enables higher levels of scalability and availability and the new generation websites are willing to do so. Amazon claims that just an extra one tenth of a second on their response times will cost them 1% in sales. Google said they noticed that just a half a second increase in latency caused traffic to drop by a fifth.
The trend can be understood by looking at what Amazon CTO says “Each node in a system should be able to make decisions purely based on local state. If you need to do something under high load with failures occurring and you need to reach agreement, you’re lost. If you’re concerned about scalability, any algorithm that forces you to run agreement will eventually become your bottleneck. Take that as a given.”
To be fault tolerant and provide concurrency to millions of users, the data is duplicated in multiple copies. This brings an issue of how to make them consistent. As CAP theorem states, the more you relax your consistency requirement, the more you can gain on availability and partition tolerance. Consistency levels at a broad level can be classified as:
- Strong consistency: After the update completes any subsequent access will return the updated value (strict consistency typically using one copy serializability as if there is only one copy).
- Weak consistency: The system does not guarantee that subsequent accesses will return the updated value. A number of conditions need to be met before the value will be returned.
- Eventual consistency: The storage system guarantees that if no new updates are made to the object eventually (after the inconsistency window closes) all accesses will return the last updated value.
Relational databases meant ACID properties (Atomicity, Consistency, Isolation and Durability). ACID are pessimistic and forces consistency at the end of every operation. ACID, though seem indispensable, is incompatible with availability and performance in very large systems. And distributed data stores of NoSQL do not attempt to provide ACID guarantees. Instead they adopt an alternate architectural approach known as BASE – Basically Available, Soft-state, Eventually consistent – which is the logical opposite of ACID.
Most NoSQL databases resort to Eventual consistency and there are variations in achieving the same:
- Read-your-writes consistency: This allows the client to see his own update immediately (and the client can switch server between requests), but not the updates made by other clients. The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.
- Session consistency: Provide the read-your-write consistency only when the client is issuing the request under the same session scope (which is usually bind to the same server). This is a more practical version of the read-your-write consistency as it limits the guarantee in the context of a session.
- Monotonic Read Consistency: This provide the time monotonicity (a function which preserves the given order) guarantee that the client will only see more updated version of the data in subsequent accesses. If a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent one.
- Monotonic write consistency. In this case the system guarantees to serialize the writes by the same process. A write operation by a process on a data item x is completed before any successive write operation on x by the same process. The writes are propagated to all replicas in the correct order (similar to FIFO). If the system does not guarantee this level of consistency, then it becomes very hard for programmers. Most NoSQL databases do provide this level of consistency.
An excellent presentation on Consistency and Replication (from University of Pennsylvania) with examples is available at http://www.cis.upenn.edu/~lee/07cis505/Lec/lec-ch7b-replication-v3.pdf.
The key to NoSQL adoption is the mindset change in terms of accepting trade-off in consistency to achieve availability and scalability, approximate results are okay approximate answers are “okay” and users willing to take control.