[SysDes1][Chap. 5] Design Consistent Hashing

What is consistent hashing?

The rehashing problem
A usual way to partition hash values is to do module operation. For example, partition index = hash value % #partitions. But if we increase the partition number, e.g. increase servers to serve traffic or to store data, the same hash value will now be assigned to a different partition.

Consistent hashing is a technique when a hash table is resized, only a proportional set of keys need to be remapped instead of remapping all the keys.Consistent hashing is a commonly used technique to distribute requests/data efficiently and evenly in horizontal scaling.

Design

Each hash function corresponds to a hash space. List all hash values and connect them as a ring. Use the hash function to hash servers onto the ring. This partition the hash space into different parts. 

How to make data balanced?

Use virtual nodes to represent the real node. This help balance the amount of data in each partition.

How to add & remove server

To add a  server, you need to redistribute adjacent partition’s data to the new server.

To remove a server, you need to move the data from the old server to the adjacent partition.

Leave a Reply

Your email address will not be published. Required fields are marked *