Caching is the single most effective technique for improving system performance. A well-designed distributed cache can reduce database load by 90%+ and cut response times from hundreds of milliseconds to sub-millisecond.
Cache-aside is the most common pattern. The application checks the cache first; on a miss, it reads from the database, then populates the cache:
```
GET user:123 → Cache HIT  → return cached data
GET user:456 → Cache MISS → query DB → SET cache → return data
```
Pros: only caches what's actually requested. Cons: cache misses incur latency; data can become stale.
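The read path above can be sketched in a few lines. This is a minimal illustration using plain dicts as stand-ins for the cache and the database (the names `cache`, `db`, and `get_user` are hypothetical, not from any specific library):

```python
import time

# In-memory stand-ins for a cache and a database (for illustration only;
# a real deployment would use e.g. Redis and a SQL database).
cache = {}  # key -> (value, expires_at)
db = {"user:123": {"name": "Ada"}, "user:456": {"name": "Grace"}}

def get_user(key, ttl=300):
    """Cache-aside read: check cache, fall back to the DB, populate cache."""
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                          # cache hit
    value = db.get(key)                          # cache miss: read from DB
    if value is not None:
        cache[key] = (value, time.time() + ttl)  # populate cache with a TTL
    return value
```

The TTL bounds staleness: once an entry expires, the next read takes the miss path and repopulates it.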
Read-through: the cache itself handles database reads on a miss. The application talks only to the cache layer, which abstracts away the data source.
Write-through: every write goes to both the cache and the database synchronously. This keeps the cache consistent with the database but adds write latency.
Write-behind (write-back): writes go to the cache immediately, and the cache asynchronously flushes them to the database in batches. This lowers write latency but risks data loss if the cache node crashes before flushing.
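The contrast between the two write strategies can be sketched as follows (a simplified illustration with dicts for the cache and DB and an in-process queue as the write buffer; a real write-behind cache would flush from a background thread):

```python
import queue

cache, db = {}, {}
write_buffer = queue.Queue()  # pending writes awaiting flush (write-behind)

def write_through(key, value):
    """Synchronous write to both stores: consistent, but pays DB latency."""
    db[key] = value      # durable write happens on the request path
    cache[key] = value

def write_behind(key, value):
    """Write to the cache immediately; the DB write is deferred."""
    cache[key] = value
    write_buffer.put((key, value))

def flush():
    """Flusher (normally a background task): drain buffered writes to the DB."""
    while not write_buffer.empty():
        key, value = write_buffer.get()
        db[key] = value
```

Any writes still sitting in `write_buffer` when the cache node dies are lost, which is exactly the durability trade-off described above.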
| Strategy | Consistency | Read Latency | Write Latency | Use Case |
|----------|-------------|--------------|---------------|----------|
| Cache-Aside | Eventual | Miss penalty | N/A | General purpose |
| Read-Through | Eventual | Miss penalty | N/A | Simplified reads |
| Write-Through | Strong | Fast | Slower | Consistency-critical |
| Write-Behind | Eventual | Fast | Fast | Write-heavy workloads |
An e-commerce site needs to cache product prices that change infrequently but must be accurate within 5 minutes. Which strategy is most appropriate?
When the cache is full, which entries do we remove?
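Eviction policies answer this question. The source doesn't single out a policy, but least-recently-used (LRU) is the most common choice in practice; a minimal sketch with `collections.OrderedDict`:

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: when full, evict the entry accessed longest ago."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
```

Other common answers are LFU (evict by access frequency) and plain TTL expiry.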
With hash(key) % N nodes, adding or removing a node remaps nearly all keys, causing a massive cache miss storm.
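A quick simulation makes the scale of the problem concrete (illustrative script; `node_for` is a hypothetical helper, and Python's built-in `hash` stands in for a real hash function):

```python
def node_for(key, n):
    """Naive placement: node index = hash(key) mod node count."""
    return hash(key) % n

# Simulate shrinking a 4-node cluster to 3 nodes.
keys = [f"user:{i}" for i in range(10_000)]
moved = sum(node_for(k, 4) != node_for(k, 3) for k in keys)
fraction_moved = moved / len(keys)  # roughly 0.75: most keys change owner
```

Around three quarters of all keys land on a different node after the change, and every one of those keys is now a guaranteed cache miss.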
Consistent hashing maps both keys and nodes onto a ring. Each key is assigned to the next node clockwise on the ring:
```
Hash Ring (0 to 2^32):

  Node A (3 virtual nodes: A1, A2, A3)
  Node B (3 virtual nodes: B1, B2, B3)
  Node C (3 virtual nodes: C1, C2, C3)

  Key "user:123" → hash → lands between B2 and A1 → assigned to Node A
```
Virtual nodes solve the uneven distribution problem. Each physical node gets multiple positions on the ring, ensuring balanced load even with few physical nodes.
When a node is added or removed, only the keys adjacent to it on the ring are remapped: approximately K/N keys, where K is the total number of keys and N is the node count, rather than nearly all of them.
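The ring with virtual nodes fits in a short class. A minimal sketch (the class and method names are illustrative; MD5 is used only as a stable, well-distributed hash):

```python
import bisect
import hashlib

def _ring_hash(s: str) -> int:
    """Map a string onto the ring using MD5 (any stable hash works)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=3):
        # Each physical node gets `vnodes` positions on the ring.
        self._ring = sorted(
            (_ring_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )

    def node_for(self, key: str) -> str:
        """Walk clockwise: first virtual node at or after the key's hash."""
        pos = _ring_hash(key)
        idx = bisect.bisect(self._ring, (pos,)) % len(self._ring)
        return self._ring[idx][1]
```

Adding a node only captures keys from the arcs just before its virtual-node positions; every other key keeps its old owner, which is the K/N property described above.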
A cache cluster has 4 nodes. Node 3 fails. With simple modulo hashing (key % 4 → key % 3), what percentage of keys get remapped?
A cache stampede occurs when a popular cached entry expires and hundreds of concurrent requests simultaneously miss the cache and hit the database.
Mutex/Locking: When a cache miss occurs, acquire a distributed lock. Only one request fetches from the database; others wait for the cache to be repopulated.
Probabilistic Early Expiry: Each request has a small random chance of refreshing the cache before the TTL actually expires. This spreads refresh load over time rather than creating a thundering herd at expiry.
Background Refresh: A background process refreshes popular cache entries before they expire. The cache is never empty for hot keys.
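The probabilistic early-expiry idea can be sketched compactly. This follows the "XFetch" formulation from Vattani, Chierichetti, and Lowenstein's optimal probabilistic cache stampede prevention result (function name and parameters here are illustrative):

```python
import math
import random
import time

def should_refresh_early(expires_at, compute_cost, beta=1.0, now=None):
    """XFetch-style check: occasionally recompute before the TTL lapses,
    so concurrent clients don't all miss at the same instant.

    compute_cost: how long (seconds) recomputing the value takes.
    beta: tuning knob; >1 refreshes more eagerly, <1 less.
    """
    now = time.time() if now is None else now
    # -log(rand) is a positive random amount; the term grows with
    # compute_cost, so expensive values start refreshing earlier.
    return now - compute_cost * beta * math.log(random.random()) >= expires_at
```

Each client rolls independently, so the probability of a refresh rises smoothly as expiry approaches instead of every client missing at once.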
Redis Cluster, the sharded deployment mode of the most widely used distributed cache, partitions the keyspace into 16,384 hash slots:
```
Slot assignment: CRC16(key) % 16384

Slots 0-5460:      Master A ←→ Replica A'
Slots 5461-10922:  Master B ←→ Replica B'
Slots 10923-16383: Master C ←→ Replica C'
```
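The slot computation itself is small. Redis Cluster uses the CRC16-CCITT (XMODEM) checksum; this sketch omits Redis's `{hash-tag}` rule, under which only the substring inside braces is hashed:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT, XMODEM variant: poly 0x1021, init 0x0000."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Slot for a key (ignoring hash tags for brevity)."""
    return crc16(key.encode()) % 16384
```

Every node knows the full slot-to-node map, so a client can be redirected to the right master in one hop.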
| Pattern | Implementation | TTL |
|---------|---------------|-----|
| Session store | SET session:{id} {data} | 30 minutes |
| API response cache | SET api:/users/123 {json} | 5 minutes |
| Database query cache | SET query:{hash} {result} | 1 minute |
| Rate limiter counter | INCR ratelimit:{user}:{window} | Window duration |
| Feature flags | HSET features {flag} {value} | 10 seconds |
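The rate-limiter row deserves a closer look, since it relies on `INCR` being atomic. A fixed-window sketch using a plain dict in place of Redis (class and method names are hypothetical; in Redis the `INCR` plus an `EXPIRE` of the window duration replaces the dict):

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter mirroring the INCR pattern above."""
    def __init__(self, limit, window_secs):
        self.limit = limit
        self.window = window_secs
        self.counters = {}  # "ratelimit:{user}:{window}" -> count

    def allow(self, user, now=None):
        now = time.time() if now is None else now
        window = int(now // self.window)
        key = f"ratelimit:{user}:{window}"
        self.counters[key] = self.counters.get(key, 0) + 1  # INCR
        return self.counters[key] <= self.limit
```

The counter key embeds the window number, so each window starts fresh; setting the TTL to the window duration (as in the table) lets old counters expire on their own.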
Cache invalidation is famously the hardest problem in computer science (alongside naming things): deciding when cached data must be discarded because the underlying source of truth has changed.
Which cache invalidation approach provides the strongest consistency guarantee?