Designing a Distributed Unique ID Generator — Twitter Snowflake (Java Implementation)

2024.06.12 ·#system-design #Snowflake #distributed-systems #unique-ID #Java #concurrency

📚 Book: System Design Interview Vol. 1 — by Alex Xu. This post is a study note on Chapter 7.

Intro — The Problem

A relatively large traffic load occurs and must be handled. (Assume we need to insert 10,000 IDs per second.)

Handling 10,000 RPS (TPS) on a single server's application can cause overload. So we cluster multiple servers with the same function to boost performance and distribute requests across several application servers.

If each application sends inserts to the DB and the ID climbs by Auto-Increment, a race condition can cause ID collisions (assume the transaction isolation level isn't strict, for performance). Chapter 7 covers a generator that produces a unique ID even when different servers generate IDs.

The ID generator's requirements:

Unique ID
Numeric type
The ID must be 64 bits (within 16 hex characters)
IDs must be sortable by date of issuance
It must handle 10,000 TPS

Approach — Snowflake, among four options

The book presents four options: UUID, multi-master replication, ticket server, and Twitter Snowflake. Each has plenty to discuss, but this post covers the headline Twitter Snowflake technique.

Snowflake structurally eliminates concurrency issues by embedding the elements that could cause them directly into the ID.

Twitter Snowflake's 64-bit ID layout — Sign·TimeStamp·DC ID·Server ID·Serial Num, 5 segments Figure 1. Snowflake ID layout

Snowflake is 64 bits, split into 5 segments. Let's go through each segment and follow how concurrency is solved through ID structure alone.

① Sign — 1 bit : unused for now, reserved for sign (negative/positive) later.
② TimeStamp — 41 bits : milliseconds elapsed since the epoch. Snowflake's epoch is Nov 04, 2010, 01:42:54 UTC = 1288834974657 ms.
③ DataCenter ID — 5 bits : 2⁵ = 32 data centers supported.
④ Server ID — 5 bits : 32 servers per data center. 32 × 32 = 1024 servers total.
⑤ Serial Num — 12 bits : within the same server, if multiple IDs are generated within 1 ms it increments by 1, and resets to 0 once 1 ms passes.

💡 An elegant idea — solve the problem by spelling out every potentially-concurrent element as part of the ID.

Put the time in the ID to distinguish them. But what if multiple data centers generate at the same time? → branch by DC ID.

What if several servers in one data center generate at the same time? → branch by Server ID.

What if the same server generates concurrently at the same time? → branch by Serial Num.

This way the ID generation itself is unique with no need to check DC/Server in application logic. (However, the case where concurrent requests land with the same Serial Num must be handled in the ID Generator.)

Implementation (Java)

A generator that bit-shifts each of the 5 segments into a single 64-bit ID and returns it as a decimal Long.

import java.time.LocalDateTime;
import java.time.ZoneId;

public class SnowFlakeIdGenerator {

    // Twitter's epoch expressed in ms
    private final long twEpoch = 1288834974657L;

    // Layout: TimeStamp 41bit, Server ID 5bit, DC ID 5bit, Serial 12bit
    private final long timestampBits = 41L;
    private final long serverIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long serialBits = 12L;

    // Bit shifts — precompute how far left to shift each segment
    private final long serialIdShiftFromRight = serialBits;
    private final long datacenterIdShiftFromRight = serialBits + serverIdBits;
    private final long timestampLeftShiftFromRight = serialBits + serverIdBits + datacenterIdBits;
    private final long signLeftShiftFromRight = serialBits + serverIdBits + datacenterIdBits + timestampBits;

    // Mask to reset serialNum to 0 when it wraps around
    private final long serialMask = -1L ^ (-1L << serialBits);

    private long oldTimestamp = -1L;
    private long serialNum = 1L;

    // synchronized since many requests can hit within the same ms
    public synchronized long generateNewId(long serverId, long dataCenterId) throws Exception {

        long timestamp = getCurrentTimeStamp();   // current time (ms)

        long signedId = 1L;
        long timestampId = timestamp - twEpoch;

        // Increment the serial number
        if (timestamp == oldTimestamp) {
            // reset serialNum to 0 when it wraps
            serialNum = (serialNum + 1) & serialMask;

            // if all 4096 (2^12) are used within 1 ms, wait for the next ms
            if (serialNum == 0) {
                long newTime = getCurrentTimeStamp();
                while (timestamp != newTime) {
                    newTime = getCurrentTimeStamp();
                }
            }
        } else {
            serialNum = 1L;   // reset when the time changes
        }

        oldTimestamp = timestamp;

        // shift the 5 segments and combine into one 64-bit ID
        long snowFlakeId =
                signedId << signLeftShiftFromRight |
                timestampId << timestampLeftShiftFromRight |
                dataCenterId << datacenterIdShiftFromRight |
                serverId << serialIdShiftFromRight |
                serialNum;

        return snowFlakeId;
    }

    private long getCurrentTimeStamp() {
        // unify with ZoneId since it can differ per data center
        ZoneId zoneId = ZoneId.of("Asia/Seoul");
        return LocalDateTime.now().atZone(zoneId).toInstant().toEpochMilli();
    }
}

I verified the result in both bits and decimal. (Setting the leading Sign to 1 makes the decimal representation negative.)

binary  : 1001100011111100110110001011011111001111000000100001000000000001
decimal : -7422819801849786367

I used it by converting the decimal value to a String and putting it into the ID on insert.

Test Environment

Now it's time to expose this ID Generator to a 10,000 TPS environment and verify it inserts without duplicates. I set up the test environment as follows.

Figure 2. ID Generator test architecture

I built a home cloud server and made it reachable via SSH from outside. With 6 PCs at home, the above setup is possible over a private network.

Apache Bench — a performance tester that can send n requests with concurrency.
Nginx — a classic web server, used here as a Reverse Proxy (load balancer).
Nexus Repository — delivers the Id-Generator jar built above to the application.
Netty — instead of Spring Boot's default embedded Tomcat, I used the event-loop Netty server for performance.
MongoDB — the DB. I used the non-blocking mongo-reactive-db ORM.

There's a lot to say about each component, but this post only introduces the test setup. I'll share the performance test results in a follow-up post.

Of the above, installing and version-pinning Nginx is covered in detail in the post below.

Nginx Specific-Version & Latest Install Guide: Ubuntu / CentOS, Fully Organized

Install Nginx on a cloud server like AWS EC2 — version-specific installs on Ubuntu (apt) and CentOS (yum), the /etc/nginx config structure (sites-available/enabled), and firewalls (UFW, firewalld, AWS security groups) all in one.

taystudios.com/blog

📦 Migrated from my own Korean blog (my own writing). Original: taehyuklee.tistory.com/10

Intro — The Problem

Approach — Snowflake, among four options

Implementation (Java)

Test Environment

Related posts

Comments