
System Design

Consistent Hashing: Why Every Distributed System Uses It

From theory to production — how consistent hashing solves the rebalancing problem

2026-03-15
distributed-systems · hashing · load-balancing · architecture

The Problem with Naive Hashing

You have N servers and need to distribute requests. The obvious approach:

server = hash(key) % N

This works until you add or remove a server. When N changes, almost every key maps to a different server: growing from N to N+1 servers remaps roughly N/(N+1) of all keys. If you have a cache layer, this means a thundering herd: nearly every cached item becomes a cache miss at the same moment, and the backing store absorbs the full load.
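To see the scale of the damage, here is a small sketch (key names and counts are arbitrary) that counts how many of 10,000 keys change buckets when a cluster grows from 4 to 5 servers under modulo hashing:

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # Stable hash (Python's built-in hash() is salted per process)
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if bucket(k, 4) != bucket(k, 5))
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 80%
```

With random-looking hashes, a key keeps its bucket only when h % 4 == h % 5, which happens for about 1 key in 5, so roughly 80% of keys move.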

Consistent Hashing: The Key Insight

Instead of mapping keys to servers directly, map both keys and servers onto a ring (hash space from 0 to 2³²-1).

graph TB
    subgraph "Hash Ring"
        direction TB
        A["Server A (pos: 1000)"]
        B["Server B (pos: 5000)"]
        C["Server C (pos: 9000)"]
    end
    
    K1["Key 'user:42' → hash: 3200"] --> B
    K2["Key 'session:99' → hash: 7500"] --> C
    K3["Key 'cache:abc' → hash: 800"] --> A

Each key is handled by the first server encountered when walking clockwise from the key's position on the ring.

Why It's Better

When a server is added or removed, only keys between the affected server and its predecessor need to move. On average, only K/N keys are remapped (where K is total keys, N is total servers).

import hashlib
from bisect import bisect_left

class ConsistentHashRing:
    def __init__(self, nodes=None, virtual_nodes=150):
        self.virtual_nodes = virtual_nodes
        self.ring = {}          # ring position -> physical node
        self.sorted_keys = []   # ring positions, kept sorted for binary search
        
        if nodes:
            for node in nodes:
                self.add_node(node)
    
    def _hash(self, key):
        # Truncate MD5 to 32 bits so positions live on the 0..2^32-1 ring
        return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)
    
    def add_node(self, node):
        # Place virtual_nodes points on the ring for this physical node
        for i in range(self.virtual_nodes):
            vnode_key = f"{node}:vn{i}"
            hash_val = self._hash(vnode_key)
            self.ring[hash_val] = node
            self.sorted_keys.append(hash_val)
        self.sorted_keys.sort()
    
    def remove_node(self, node):
        for i in range(self.virtual_nodes):
            vnode_key = f"{node}:vn{i}"
            hash_val = self._hash(vnode_key)
            del self.ring[hash_val]
            self.sorted_keys.remove(hash_val)
    
    def get_node(self, key):
        if not self.ring:
            return None
        hash_val = self._hash(key)
        # First position at or clockwise of the key's hash; the modulo
        # wraps around past the highest position back to the lowest
        idx = bisect_left(self.sorted_keys, hash_val) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]
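The K/N claim is easy to verify empirically. The sketch below uses a condensed, function-style version of the ring logic (helper names h, build_ring, and lookup are illustrative, not part of the class above) to count how many of 10,000 keys move when a fourth node joins three:

```python
import hashlib
from bisect import bisect_left

def h(s: str) -> int:
    # 32-bit position on the ring, truncated from MD5
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

def build_ring(nodes, vnodes=150):
    # Sorted (position, node) pairs: each node contributes vnodes points
    return sorted((h(f"{n}:vn{i}"), n) for n in nodes for i in range(vnodes))

def lookup(ring, key):
    # Linear rebuild of the position list on each call; fine for a demo
    positions = [p for p, _ in ring]
    idx = bisect_left(positions, h(key)) % len(ring)
    return ring[idx][1]

keys = [f"user:{i}" for i in range(10_000)]
before = build_ring(["A", "B", "C"])
after = build_ring(["A", "B", "C", "D"])
moved = sum(1 for k in keys if lookup(before, k) != lookup(after, k))
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 25%, i.e. K/N for N=4
```

The existing nodes' virtual-node positions don't change when D joins, so the only keys that move are exactly the ones D now owns: about a quarter of the ring.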

Virtual Nodes: Solving Hotspots

With few physical servers, the distribution on the ring can be very uneven: one server may own a much larger arc than the others. Virtual nodes solve this. Each physical server is represented by 100-200 points on the ring, so its total share is the sum of many small arcs, which averages out to a much more uniform distribution.

Where You'll See This

  • DynamoDB: Partition key distribution
  • Cassandra: Token ring for data partitioning
  • Memcached: Client-side consistent hashing for cache distribution
  • CDNs: Request routing to edge servers

The next time you're designing a system that needs to scale horizontally, consistent hashing should be your default mental model for key distribution.


Ayoush Chourasia · 2026-03-15