Database Sharding – Horizontal Partitioning for massive scalability
Database Sharding (made popular by Google) is a method of horizontal partitioning in a database and the approach is highly scalable and provides improved throughput and overall performance. Database Sharding can be defined as a “shared-nothing” partitioning scheme for large databases across a number of servers each with their own CPU, memory and disk. Simply put, break the database into smaller chunks called “shards” (each becoming a smaller database by itself) and spread those across a number of distributed servers.
The main advantage of Database Sharding approach is improved scalability, growing in a near-linear fashion as more servers are added to the network. Other benefits of having smaller databases are:
- Easier to Manage – smaller databases means that the regular activities like backup, database optimization etc. can be handled independently and in parallel.
- Faster – Smaller databases mean that the ratio of memory to disk is improved resulting in reduced disk I/O, less contention for resources and in turn the performance as well as throughput improves.
- Less Cost – Lower cost open source databases which doesn’t require licensing per server results in direct cost savings. Also unlike traditional partitioning techniques which rely on shared facilities (like expensive SANs), sharding typically involves commodity hardware which again translates to significant cost savings.
Sharding typically uses a distributed hash table (DHT) that provides a lookup service similar to a hash table where any participating node an efficiently retrieve the value associated with a given key.
The characteristics emphasized by DHT are Decentralization (without any central co-ordination), Scalability (function efficiently with thousands of nodes), Fault tolerance (reliable even with nodes continuously joining, leaving and failing). These goals are achieved using a key technique so that any one node needs to coordinate with only a few other nodes in the system – so that only a limited amount of work needs to be done for each change in membership. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
Most DHTs use consistent hashing to map keys to nodes. This technique employs a function δ(k1,k2) which defines an abstract notion of the distance from key k1 to key k2. Each node is assigned a single key called its identifier (ID). A node with ID ix owns all the keys km for which ix is the closest ID, measured according to δ(km,in). Consistent hashing has the property that removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs, and leaves all other nodes unaffected.
Conceptually, Sharding broadly falls under three categories:
1. Vertical partitioning – All the data related to a specific feature of a product are stored on the same machines. Storing infrequently used or very wide columns on a physically different device is an example. It is also referred to as “row splitting” – as the row is split by its columns.
2. Key-based partitioning – In this, the part of the data itself is used to do the partitioning. Most common approach is to use a one-way hashing algorithm to map the data to be accessed to one of the shards that store it. Natural hashes can as well be used – in case of numbers as key, the key mod N (number of shards), in case of dates, it could be based on time interval, could be based on first letter of the name, and for amounts it could be based on the range of value. Similarly a list of values can be used to assign a partition – e.g., list of countries grouped into continents.
3. Directory-based partitioning – In this scheme, a lookup table that keeps track of which data is stored in which shard is maintained in the cluster. This approach has two drawbacks – the directory can become a single point of failure and there is a performance overhead as the directory has to be accessed every time to locate the shard.
Composite partitioning that allows a combination of the partitioning schemes are also used, say applying a range partitioning first and then a hash partitioning.
The distributed nature of multiple shard databases increases the criticality of a well-designed fault-tolerant and reliable approach which makes the following necessary:
- Database Shard redundancy – ensuring at least two “live” copies of each shard are available in the event of an outage or server failure. It going without saying that multiple copies in turn require a high-performance, efficient and reliable replication.
- Automated backups of individual database shards
- Redundancy at various levels, involving cost-effective hardware
- Automated failover when an outage or server failure occurs
Distributed queries, that can perform faster and use parallel processing of interim results on each shard server, need the ability of the system to handle them in a seamless manner for the application (MapReduce is one such example).
Various Sharding schemes exist and each has inherent characteristics and performance advantages when applied to a specific problem domain. Database Sharding to be effective needs to be application specific. A single application can use more than one shard scheme, each applied to a specific portion of the application to achieve optimum results.