System Design Interview Guide 2026: Frameworks, Patterns and Case Studies


System design interviews are the gatekeepers for senior engineering roles at top tech companies. They test your ability to design scalable, reliable systems under ambiguous requirements. This guide covers the frameworks, patterns, and real examples you need to ace them in 2026.

📋 Table of Contents

The Interview Framework
Step 1: Clarify Requirements
Step 2: Capacity Estimation
Core Design Patterns
Case Study: Design URL Shortener (bit.ly)
Case Study: Design a Chat System (WhatsApp)
Key Trade-offs to Know
Common System Design Topics (2026)

The Interview Framework

Every system design interview follows a similar structure. Use this 45-minute framework:

1-5 min — Clarify requirements, define scope
5-10 min — Estimate scale (users, QPS, storage)
10-20 min — High-level design, core components
20-35 min — Deep dive on critical components
35-45 min — Scale, bottlenecks, trade-offs

Step 1: Clarify Requirements

Never start designing before asking these questions:

Who are the users? What do they need to do?
What are the core features vs nice-to-have?
How many users? Read-heavy or write-heavy?
What’s the required latency? What’s the consistency requirement?
What’s the expected data volume? How long is data retained?

Step 2: Capacity Estimation

Rough math that signals engineering maturity:

Example: Design Twitter-like feed

Users: 500M total, 100M daily active (DAU)
Tweets: 100M DAU × 5 tweets/day = 500M tweets/day
Reads: 100M DAU × 100 timeline views/day = 10B reads/day

QPS (writes): 500M / 86400 ≈ 5,800 tweets/sec (peak 3x ≈ 17,400)
QPS (reads): 10B / 86400 ≈ 115,000 reads/sec (peak 3x = 345,000)

Storage (tweets):
- 500M tweets/day × 280 bytes = 140 GB/day
- 5 years = 140 × 365 × 5 ≈ 255 TB of tweet data

Media storage: 10% of tweets have images (100 KB avg thumbnail)
50M tweets/day × 100 KB thumbnail = 5 TB/day thumbnail storage
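
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the function and parameter names (`estimate`, `peak_factor`, and so on) are illustrative assumptions, and the inputs are the figures from the example.

```python
SECONDS_PER_DAY = 86_400


def per_second(events_per_day: float) -> float:
    """Average events per second for a given daily total."""
    return events_per_day / SECONDS_PER_DAY


def estimate(dau, writes_per_user, reads_per_user, bytes_per_write,
             peak_factor=3, years=5):
    """Back-of-envelope QPS and storage numbers (decimal GB/TB)."""
    writes_per_day = dau * writes_per_user
    reads_per_day = dau * reads_per_user
    bytes_per_day = writes_per_day * bytes_per_write
    return {
        "write_qps": per_second(writes_per_day),
        "read_qps": per_second(reads_per_day),
        "peak_write_qps": per_second(writes_per_day) * peak_factor,
        "peak_read_qps": per_second(reads_per_day) * peak_factor,
        "storage_gb_per_day": bytes_per_day / 1e9,
        "storage_tb_total": bytes_per_day * 365 * years / 1e12,
    }


if __name__ == "__main__":
    # Inputs taken from the Twitter-like feed example.
    result = estimate(dau=100_000_000, writes_per_user=5,
                      reads_per_user=100, bytes_per_write=280)
    for name, value in result.items():
        print(f"{name}: {value:,.1f}")
```

In an interview, round aggressively: the point is the order of magnitude, not the last digit.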

Core Design Patterns

1. Database Selection

Choose the right database for the job:

Relational (PostgreSQL, MySQL) — ACID transactions, complex queries, financial data
Document (MongoDB) — flexible schema, nested data, content management
Key-Value (Redis, DynamoDB) — caching, session storage, O(1) lookups
Wide-column (Cassandra, HBase) — time-series, write-heavy at massive scale
Graph (Neo4j) — social graphs, recommendation engines
Search (Elasticsearch) — full-text search, log analytics

2. Caching Strategy

Cache-Aside (Lazy Loading):
  App → cache miss → DB → write to cache → return

Write-Through:
  App → write to cache AND DB simultaneously

Write-Back (Write-Behind):
  App → write to cache → async write to DB (faster, risk data loss)

Read-Through:
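  Cache layer fetches from the DB on a miss (usually provided by a caching library or service in front of the DB; plain Redis and Memcached are typically used cache-aside)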

Cache Eviction Policies:
  LRU  — Least Recently Used (most common)
  LFU  — Least Frequently Used
  TTL  — Time-To-Live expiry
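
A cache-aside read with LRU eviction and TTL expiry can be sketched in memory as follows. This is an illustration under stated assumptions, not how Redis or Memcached work internally: `LRUCacheWithTTL` and `read_user` are hypothetical names, and a plain dict stands in for the database.

```python
import time
from collections import OrderedDict


class LRUCacheWithTTL:
    """In-memory cache: evicts the least recently used key, expires by TTL."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._items = OrderedDict()      # key -> (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # TTL expiry
            del self._items[key]
            return None
        self._items.move_to_end(key)     # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop least recently used


def read_user(user_id, cache, db):
    """Cache-aside: try the cache, fall back to the DB, then fill the cache."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    value = db[user_id]                  # stand-in for a real DB query
    cache.put(user_id, value)
    return value
```

Note that the application owns the miss path here; in read-through, the same logic would live inside the cache layer instead.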

3. Load Balancing

Distribute traffic across servers:

Round Robin — equal distribution, simple
Least Connections — route to server with fewest active connections
IP Hash — sticky sessions based on client IP
Weighted — distribute based on server capacity

Tools: AWS ALB, Nginx, HAProxy, Envoy, Cloudflare
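
Three of these strategies fit in a short sketch. The class and function names below are assumptions made for illustration; production balancers such as Nginx or HAProxy add health checks, weights, and connection draining on top of the core selection logic.

```python
import hashlib
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Routes each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        self._active[server] -= 1


def ip_hash_pick(client_ip, servers):
    """Maps a client IP to the same server on every call (sticky sessions)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

IP hash keeps a client on one server only while the server list is unchanged; adding or removing a server remaps most clients.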

4. Database Scaling

Vertical Scaling: Bigger machine (CPU, RAM, SSD)
  Pros: Simple, no code changes
  Cons: Hardware ceiling, single point of failure, expensive

Read Replicas: Primary handles writes, replicas handle reads
  Pros: Read scalability, disaster recovery
  Cons: Replication lag (eventual consistency)

Sharding (Horizontal Partitioning):
  Split data across multiple DBs by key

  Hash-based: shard = hash(user_id) % num_shards
    Pros: Even distribution
    Cons: Hard to add shards (resharding)

  Range-based: shard by date range or ID range
    Pros: Range queries efficient, easy to add new shards
    Cons: Hot spots (most recent shard gets all writes)

  Directory-based: lookup table maps keys to shards
    Pros: Flexible
    Cons: Lookup overhead, single point of failure
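
To see why resharding is painful with hash-based sharding, the sketch below counts how many keys change shard when going from 4 to 5 shards. The helper names are hypothetical, and SHA-256 is used only as a stable hash (Python's built-in `hash()` is randomized per process for strings).

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash that is the same across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(key: str, num_shards: int) -> int:
    """Hash-based sharding: shard = hash(key) % num_shards."""
    return stable_hash(key) % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_shards) != shard_for(key, new_shards)
    )
    return moved / len(keys)


if __name__ == "__main__":
    user_keys = [f"user:{i}" for i in range(10_000)]
    print(f"moved: {fraction_moved(user_keys, 4, 5):.0%}")
```

A key stays put only when `hash % 4 == hash % 5`, which holds for 4 out of every 20 hash residues, so roughly 80% of keys move. Consistent hashing is the usual answer when an interviewer asks how to reduce that.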

Case Study: Design URL Shortener (bit.ly)

Requirements

Functional: Shorten URL, redirect on visit, custom aliases, expiry
Non-functional: read latency under 100 ms, 100M URLs/day writes, 10B reads/day

Scale Estimation

Writes: 100M/day = 1,200 QPS (peak 3,600)
Reads: 10B/day = 115,000 QPS (peak 345,000) — read-heavy 100:1
Storage: 100M × 500 bytes = 50 GB/day × 5 years = ~90 TB

High-Level Design

Client → Load Balancer → Web Servers → Cache (Redis)
                                          ↓ cache miss
                                       Database (PostgreSQL + replicas)

URL Shortening API:
  POST /api/shorten
  { "url": "https://very-long-url.com/path", "alias": "mylink", "expires_at": "2027-01-01" }
  → { "short_url": "https://bit.ly/abc123" }

Redirect API:
  GET /{short_code}
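  → 301 (permanent, cached by browsers) or 302 (temporary, every click hits the server, so click counts stay accurate) redirect to original URL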

Key Generation:
  Option 1: Base62-encode MD5(url), take first 7 chars (collisions possible, so check uniqueness and retry with a salt)
  Option 2: Counter + Base62 encode (guaranteed unique)
  Option 3: Pre-generate keys in KGS (Key Generation Service)
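
Option 2 can be sketched as a counter value pushed through a Base62 encoder. The alphabet order and function names here are assumptions; any fixed 62-character alphabet works as long as encoding and decoding agree.

```python
import string

# Hypothetical alphabet order: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def base62_encode(number: int) -> str:
    """Encodes a non-negative integer (e.g. a DB sequence id) as Base62."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number
```

Seven characters cover 62^7 ≈ 3.5 trillion codes, which at 100M new URLs per day lasts roughly 96 years. The trade-off is that sequential counters produce guessable codes, which is one reason to consider a Key Generation Service instead.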

Database Schema

CREATE TABLE urls (
  id           BIGSERIAL PRIMARY KEY,
  short_code   VARCHAR(10) UNIQUE NOT NULL,
  original_url TEXT NOT NULL,
  user_id      BIGINT,
  created_at   TIMESTAMP DEFAULT NOW(),
  expires_at   TIMESTAMP,
  click_count  BIGINT DEFAULT 0
);

-- No separate index on short_code: in PostgreSQL the UNIQUE constraint already creates one.
CREATE INDEX idx_user_id ON urls(user_id);

Case Study: Design a Chat System (WhatsApp)

Key Challenges

Real-time delivery — WebSockets for persistent connections
Message ordering — sequence numbers per conversation
Offline delivery — store messages until user reconnects
Read receipts — delivered/read status
Group chat — fan-out to multiple users

Architecture:

Client ←→ WebSocket Gateway (stateful) ←→ Message Service
                                              ↓
                                    Message Queue (Kafka)
                                              ↓
                                    Message Store (Cassandra)
                                    - Partitioned by conversation_id
                                    - Ordered by timestamp within partition

Message Flow:
1. User A sends message → WebSocket Gateway (Server A)
2. Gateway publishes to Kafka topic
3. Message Service writes to Cassandra
4. The gateway holding User B's WebSocket connection consumes the message from Kafka and delivers it
5. If User B offline → push notification via FCM/APNs

Schema (Cassandra - wide column):
  messages_by_conversation
    conversation_id (partition key)
    message_id (clustering key, time-ordered)
    sender_id, content, message_type, created_at
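
Two of the challenges above, per-conversation ordering and offline delivery, can be illustrated with an in-memory sketch. `ChatRouter` and its methods are hypothetical names; in the architecture described here, Kafka and Cassandra would replace the dictionaries, and a push notification would accompany the pending write.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    conversation_id: str
    seq: int            # per-conversation sequence number
    sender_id: str
    content: str


class ChatRouter:
    """Assigns sequence numbers and holds messages for offline users."""

    def __init__(self):
        self._next_seq = defaultdict(int)   # conversation_id -> last seq used
        self._online = set()
        self.delivered = defaultdict(list)  # user_id -> messages received
        self._pending = defaultdict(list)   # user_id -> messages awaiting reconnect

    def connect(self, user_id):
        """Marks the user online and flushes stored messages in order."""
        self._online.add(user_id)
        backlog = self._pending.pop(user_id, [])
        self.delivered[user_id].extend(backlog)
        return backlog

    def disconnect(self, user_id):
        self._online.discard(user_id)

    def send(self, conversation_id, sender_id, recipients, content):
        """Fans a message out to recipients (group chat = several recipients)."""
        self._next_seq[conversation_id] += 1
        message = Message(conversation_id, self._next_seq[conversation_id],
                          sender_id, content)
        for user_id in recipients:
            if user_id in self._online:
                self.delivered[user_id].append(message)
            else:
                self._pending[user_id].append(message)  # store until reconnect
        return message
```

Sequence numbers scoped to a conversation let clients detect gaps and reorder messages without relying on synchronized clocks across servers.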

Key Trade-offs to Know

SQL vs NoSQL — ACID vs scale, schema vs flexibility
Consistency vs Availability — CAP theorem (prefer AP for social, CP for banking)
Pull vs Push — fan-out on read vs fan-out on write for feeds
Monolith vs Microservices — simplicity vs scalability/team autonomy
Synchronous vs Async — latency vs throughput/resilience
Strong vs Eventual Consistency — correctness vs availability
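
The pull vs push trade-off is easier to discuss with both variants side by side. The sketch below is a toy model with hypothetical class names and a counter standing in for timestamps; it shows where each approach pays its cost.

```python
import heapq
from collections import defaultdict
from itertools import count


class PushFeed:
    """Fan-out on write: each post is copied into every follower's timeline."""

    def __init__(self, followers):
        self._followers = followers            # author -> set of follower ids
        self._timelines = defaultdict(list)    # user -> [(ts, author, text)]
        self._clock = count(1)                 # stand-in for timestamps

    def post(self, author, text):
        entry = (next(self._clock), author, text)
        for follower in self._followers.get(author, ()):
            self._timelines[follower].append(entry)  # cost grows with followers

    def timeline(self, user):
        return self._timelines[user][::-1]     # cheap read, newest first


class PullFeed:
    """Fan-out on read: posts are stored once and merged at read time."""

    def __init__(self, followers):
        self._following = defaultdict(set)     # user -> authors they follow
        for author, fans in followers.items():
            for fan in fans:
                self._following[fan].add(author)
        self._posts = defaultdict(list)        # author -> [(ts, author, text)]
        self._clock = count(1)

    def post(self, author, text):
        self._posts[author].append((next(self._clock), author, text))  # cheap write

    def timeline(self, user):
        # Each author's list is already sorted by ts, so a k-way merge suffices.
        merged = heapq.merge(*(self._posts[a] for a in self._following[user]))
        return list(merged)[::-1]              # cost grows with accounts followed
```

Push suits users with modest follower counts; pull avoids writing millions of timeline entries for a single celebrity post. Hybrid designs that mix the two per account are a common interview answer.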

Common System Design Topics (2026)

URL Shortener
Twitter/Social Feed
WhatsApp/Messenger
Instagram/Photo Sharing
YouTube/Video Streaming
Uber/Ride Sharing
Airbnb/Booking System
Google Search
Rate Limiter
Distributed Cache (Redis)
Notification System
Web Crawler
Metrics/Monitoring (Prometheus)
Distributed Message Queue (Kafka)

System design interviews reward clear communication, structured thinking, and awareness of trade-offs over perfect answers. Practice designing 2-3 systems from scratch each week, and always ask clarifying questions before drawing a single box.
