Redis Distributed Systems Primer

ZSET patterns for timers, queues, rate limiters, and more. One data structure — a sorted set — solves most distributed coordination problems once you know what the score means.
1 · The big idea 2 · ZSET internals 3 · Core commands 4 · Pattern: timer / scheduler 5 · Pattern: priority queue 6 · Pattern: rate limiter 7 · Atomicity = your lock 8 · Durability 9 · Scaling 10 · Interview cheat sheet

1 · The big idea

Redis ZSET (sorted set) solves 90% of distributed coordination problems with one data structure. The whole trick is understanding what the score represents — change the meaning of the score and the same primitive becomes a timer, a queue, a rate limiter, or a leaderboard.

Timer / Scheduler

score = fire_timestamp
ZRANGEBYSCORE 0 NOW

Priority Queue

score = priority (inverted)
ZPOPMIN

Rate Limiter

score = request_timestamp
ZCOUNT key (now-window) now

Leaderboard

score = points
ZREVRANGE 0 9

Delayed Queue

score = visible_at
ZRANGEBYSCORE 0 NOW

Sliding Window

score = event_timestamp
ZREMRANGEBYSCORE 0 (now-window)
Interview insight
These "different" problems are the same data structure with different semantics. Recognizing this pattern shows you see underlying structure, not just memorized solutions.

2 · ZSET internals — why a sorted set, not a heap

A ZSET is a skip list + hash table. The dual structure gives you the best of both worlds: the skip list keeps members ordered by score for ranges and ranks, the hash table gives O(1) score lookup by member.

OperationHeapZSET
Get min/maxO(1)O(1)
InsertO(log n)O(log n)
Extract minO(log n)O(log n)
Range query (score 100–200)O(n)O(log n + m)
Get rank of XO(n)O(log n)
Get score of XO(n)O(1)
Update scoreO(n) find + O(log n)O(log n)
Level 3 Level 2 Level 1 1 2 4 6 7 9 1 4 9 1 9 express lanes skip ahead — search drops down a level when the next node overshoots
O(log n) search like a balanced tree · range scans just walk the base list forward (cache friendly) · simpler than a red-black tree and concurrency-friendly

3 · Core commands

ZADD — add member with score

ZADD timers 1708700000 "job:abc123" key score member (fire_at) (timer_id)

Stores the member with its score. If the member already exists, the score is updated in place.

ZRANGEBYSCORE — get members in a score range

ZRANGEBYSCORE timers 0 1708700000 LIMIT 0 100 key min max limit # Returns all members with score between 0 and now # Perfect for "give me all due timers"

ZREM — remove member (atomic)

ZREM timers "job:abc123" key member (NOT the score!) # Returns: 1 if removed, 0 if it didn't exist # This return value is your distributed lock!
Key insight
ZREM is atomic because Redis is single-threaded. Two workers calling ZREM on the same member — only one gets 1. No distributed lock needed.

ZPOPMIN — atomic pop (get + remove)

ZPOPMIN priority_queue 1 key count # Returns AND removes the lowest-scored member # Atomic — perfect for priority queue consumers

Other useful commands

CommandPurpose
ZSCORE key memberGet score of member — O(1)
ZRANK key memberGet rank (position) of member
ZCOUNT key min maxCount members in score range
ZREVRANGE key 0 9Top 10 (highest scores first)
ZREMRANGEBYSCORE key min maxRemove all in score range
ZINCRBY key increment memberAtomically increment score

4 · Pattern: timer / scheduler — score = fire_at

The problem
Schedule callbacks to fire at specific times. Millions of timers, distributed workers, at-least-once delivery.
# Producer: schedule a timer timer_id = f"retention:{evidence_id}" fire_at = time.time() + (7 * 365 * 24 * 3600) # 7 years redis.ZADD("timers", {timer_id: fire_at}) # Store payload separately redis.HSET("timer_payloads", timer_id, json.dumps({ "action": "delete", "evidence_id": evidence_id }))
# Consumer: poll and process while True: now = time.time() # Get due timers (peek, don't remove yet) due = redis.ZRANGEBYSCORE("timers", 0, now, limit=100) for timer_id in due: # Atomic claim — only one worker wins if redis.ZREM("timers", timer_id) == 1: payload = redis.HGET("timer_payloads", timer_id) execute(payload) redis.HDEL("timer_payloads", timer_id) # else: another worker claimed it sleep(0.1)

At-least-once delivery

What if a worker dies after ZREM but before execute? Move the claim to a processing set with a visibility timeout instead of deleting it outright.

# Move to "processing" set instead of removing redis.ZREM("timers", timer_id) redis.ZADD("processing", {timer_id: now + 30}) # 30s timeout execute(payload) redis.ZREM("processing", timer_id) # Only after success # Separate reaper process: # scans "processing", moves expired items back to "timers"

5 · Pattern: priority queue — score = priority (inverted)

The problem
Multiple producers, multiple consumers. Higher-priority items processed first.
# Producer: enqueue with priority # Lower score = higher priority (ZPOPMIN gets lowest) priorities = {"CRITICAL": 0, "HIGH": 100, "MEDIUM": 200, "LOW": 300} job_id = f"job:{uuid.uuid4()}" redis.ZADD("job_queue", {job_id: priorities["HIGH"]})
# Consumer: atomic pop while True: result = redis.ZPOPMIN("job_queue", 1) if result: job_id, priority = result[0] process(job_id) else: sleep(0.1) # Queue empty

Preventing starvation

Use score = priority * 1_000_000 + timestamp. Within the same priority band, items stay FIFO by arrival time.

score = priorities["LOW"] * 1_000_000 + time.time() # LOW jobs still ordered by arrival time

6 · Pattern: rate limiter — sliding window

def is_allowed(user_id, limit=100, window=60): key = f"ratelimit:{user_id}" now = time.time() window_start = now - window # Remove old entries redis.ZREMRANGEBYSCORE(key, 0, window_start) # Count requests in window count = redis.ZCOUNT(key, window_start, now) if count < limit: # Add this request redis.ZADD(key, {f"{now}:{uuid.uuid4()}": now}) redis.EXPIRE(key, window) # Cleanup return True return False
Race condition
ZCOUNT + ZADD is not atomic. For strict enforcement, wrap both in a Lua script — or accept a slight over-limit under contention.

7 · Atomicity: your distributed lock

Redis is single-threaded for command execution. Each command completes fully before the next starts. That's the whole basis of "no distributed lock needed" — the return value of an atomic mutating command is the lock.

Two workers, same timer

Worker A Worker B ZRANGEBYSCORE sees ["job:abc"] ZRANGEBYSCORE sees ["job:abc"] ZREM "job:abc" returns 1 wins the claim ZREM "job:abc" returns 0 already claimed execute(job) skip (already claimed)
Both workers see the same due timer, but Redis serializes their ZREMs — only one gets 1. The return value is the lock.

Atomic commands for coordination

CommandAtomic guarantee
ZREMOnly one caller gets 1
ZPOPMINReturns AND removes atomically
SETNXSet only if not exists
INCRIncrement and return
GETDELGet and delete in one op
Not atomic
if ZSCORE(x): ZREM(x) — two commands means a race condition. Use a single command or a Lua script.

8 · Durability

Redis persistence options

ModeHowData lossPerformance
NoneRAM onlyEverythingFastest
RDBSnapshot every N secLast N secondsFast
AOF (everysec)Append log, fsync/sec~1 secondGood
AOF (always)fsync every writeNoneSlow

Tiered storage strategy

For critical long-term timers, keep cold timers in Postgres and promote them into Redis only as they approach their fire time.

Postgres timers > 1 day out Promotion job runs hourly Redis timers < 1 day Workers poll & execute
Cold timers live cheaply in Postgres; an hourly job promotes imminent ones into Redis where workers poll them.
Interview sound bite
"Redis is durable with AOF, but I'd tier it — Postgres for timers more than a day out, Redis for imminent. Worst case Redis dies, I lose a second of timers, and idempotent handlers mean a double-fire is safe."

9 · Scaling — when single Redis isn't enough

ApproachHowTradeoff
Shard by time buckettimers:2026-02-23-14Natural partitioning, but must scan multiple keys
Shard by hashtimers:{hash(id) % N}Even distribution, but lose global ordering
Redis ClusterAutomatic shardingA single ZSET can't span slots
Tiered storageDB for cold, Redis for hotComplexity, but handles billions

Hot key problem

90% of traffic is HIGH priority? That single key becomes a hot shard. Split it across several keys and have consumers fan out over all of them.

# Split hot priority across multiple keys shard = hash(job_id) % 4 key = f"queue:HIGH:{shard}" # Consumer reads from all shards for shard in range(4): result = redis.ZPOPMIN(f"queue:HIGH:{shard}", 1)

10 · Interview cheat sheet

ProblemSay
Timer / Scheduler"ZSET, score = fire_time, poll with ZRANGEBYSCORE, claim with ZREM"
Priority Queue"ZSET, score = priority (lower = higher), consume with ZPOPMIN"
Rate Limiter"ZSET, score = timestamp, ZCOUNT for window, ZREMRANGEBYSCORE to clean"
Distributed Lock"ZREM return value — Redis is single-threaded, only one caller gets 1"
At-least-once"Move to processing set with visibility timeout, reaper moves back if expired"
Durability"AOF everysec for ~1s loss, or tier to Postgres for critical long-term"
Scaling"Shard by time bucket or hash, tier cold data to DB"
The meta-point
The best distributed systems are simple primitives composed carefully: ZSET = skip list + hash. Atomicity = single thread. Lock = check the return value. No magic, just layers.