System design interviews are the gatekeepers for senior engineering roles at top tech companies. They test your ability to design scalable, reliable systems under ambiguous requirements. This guide covers the frameworks, patterns, and real examples you need to ace them in 2026.
The Interview Framework
Most system design interviews follow a similar structure. Use this 45-minute framework as a default and adjust it to the interviewer's pace:
- 1-5 min — Clarify requirements, define scope
- 5-10 min — Estimate scale (users, QPS, storage)
- 10-20 min — High-level design, core components
- 20-35 min — Deep dive on critical components
- 35-45 min — Scale, bottlenecks, trade-offs
Step 1: Clarify Requirements
Never start designing before asking these questions:
- Who are the users? What do they need to do?
- What are the core features vs nice-to-have?
- How many users? Read-heavy or write-heavy?
- What’s the required latency? What’s the consistency requirement?
- What’s the expected data volume? How long is data retained?
Step 2: Capacity Estimation
Rough math that signals engineering maturity:
Example: Design Twitter-like feed
Users: 500M total, 100M daily active (DAU)
Tweets: 100M DAU × 5 tweets/day = 500M tweets/day
Reads: 100M DAU × 100 timeline views/day = 10B reads/day
QPS (writes): 500M / 86400 ≈ 5,800 tweets/sec (peak 3x = 17,000)
QPS (reads): 10B / 86400 ≈ 115,000 reads/sec (peak 3x = 345,000)
Storage (tweets):
- 500M tweets/day × 280 bytes = 140 GB/day
- 5 years = 140 × 365 × 5 ≈ 255 TB of tweet data
Media storage: 10% of tweets have images, each stored as a ~100 KB thumbnail
- 50M tweets/day × 100 KB = 5 TB/day of thumbnail storage
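The arithmetic above can be reproduced in a few lines, which is a handy way to sanity-check numbers while practicing. This is a minimal sketch; the variable names and the 3x peak factor are taken from the example, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
SECONDS_PER_DAY = 86_400
PEAK_FACTOR = 3  # assumed peak-to-average ratio, as in the example above


def per_second(events_per_day: float) -> float:
    """Average events per second for a given daily total."""
    return events_per_day / SECONDS_PER_DAY


dau = 100_000_000                # daily active users
tweets_per_day = dau * 5         # 5 tweets per user per day
reads_per_day = dau * 100        # 100 timeline views per user per day

write_qps = per_second(tweets_per_day)
read_qps = per_second(reads_per_day)

tweet_bytes_per_day = tweets_per_day * 280           # 280 bytes per tweet
five_year_bytes = tweet_bytes_per_day * 365 * 5

# 10% of tweets carry an image stored as a ~100 KB thumbnail
thumbnail_bytes_per_day = (tweets_per_day // 10) * 100_000

print(f"writes: ~{write_qps:,.0f}/s average, ~{write_qps * PEAK_FACTOR:,.0f}/s peak")
print(f"reads:  ~{read_qps:,.0f}/s average, ~{read_qps * PEAK_FACTOR:,.0f}/s peak")
print(f"tweets: {tweet_bytes_per_day / 1e9:,.0f} GB/day, {five_year_bytes / 1e12:,.1f} TB in 5 years")
print(f"thumbnails: {thumbnail_bytes_per_day / 1e12:,.0f} TB/day")
```

In an interview you would round these aggressively; the point is the order of magnitude, not the third significant digit.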
Core Design Patterns
1. Database Selection
Choose the right database for the job:
- Relational (PostgreSQL, MySQL) — ACID transactions, complex queries, financial data
- Document (MongoDB) — flexible schema, nested data, content management
- Key-Value (Redis, DynamoDB) — caching, session storage, O(1) lookups
- Wide-column (Cassandra, HBase) — time-series, write-heavy at massive scale
- Graph (Neo4j) — social graphs, recommendation engines
- Search (Elasticsearch) — full-text search, log analytics
2. Caching Strategy
Cache-Aside (Lazy Loading):
App → cache miss → DB → write to cache → return
Write-Through:
App → write to cache AND DB simultaneously
Write-Back (Write-Behind):
App → write to cache → async write to DB (faster writes, but data is lost if the cache fails before the flush)
Read-Through:
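The cache layer itself loads from the DB on a miss (a feature of caching libraries or managed caches such as DynamoDB Accelerator; plain Redis and Memcached do not fetch from a database on their own)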
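To make the first two strategies concrete, here is a minimal in-memory sketch of cache-aside reads and write-through writes. The class name `CacheAsideStore` and the plain dictionaries standing in for the cache and the database are assumptions for illustration; a real system would use a cache server and a database client instead.

```python
from typing import Optional


class CacheAsideStore:
    """Toy store: two dicts stand in for the cache and the database."""

    def __init__(self, db: dict[str, str]) -> None:
        self.db = db
        self.cache: dict[str, str] = {}
        self.db_reads = 0  # counts how often a read falls through to the DB

    def get(self, key: str) -> Optional[str]:
        # Cache-aside: the application checks the cache first...
        if key in self.cache:
            return self.cache[key]
        # ...and on a miss reads the DB, then populates the cache itself.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put_write_through(self, key: str, value: str) -> None:
        # Write-through: update the DB and the cache in the same operation,
        # so subsequent reads never see a stale cached value.
        self.db[key] = value
        self.cache[key] = value


store = CacheAsideStore({"user:1": "Ada"})
store.get("user:1")   # miss, goes to the DB
store.get("user:1")   # hit, served from the cache
print(store.db_reads)
```

Only the first read reaches the database here, which is the behavior that makes cache-aside attractive for read-heavy workloads.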
Cache Eviction Policies:
LRU — Least Recently Used (most common)
LFU — Least Frequently Used
TTL — Time-To-Live expiry
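LRU is worth being able to write on a whiteboard. The following is one possible sketch built on Python's `collections.OrderedDict`; the class name `LRUCache` and its `get`/`put` methods are illustrative choices, not an API from any particular cache product.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: OrderedDict[Hashable, Any] = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        # Reading a key makes it the most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            # The first item in the ordering is the least recently used.
            self._items.popitem(last=False)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now more recent than "b"
cache.put("c", 3)    # evicts "b"
```

Both operations are O(1), which is usually the follow-up question an interviewer asks.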
3. Load Balancing
Distribute traffic across servers:
- Round Robin — equal distribution, simple
- Least Connections — route to server with fewest active connections
- IP Hash — sticky sessions based on client IP
- Weighted — distribute based on server capacity
Tools: AWS ALB, Nginx, HAProxy, Envoy, Cloudflare
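The first three algorithms are simple enough to sketch directly. The code below is an illustration under assumptions: the class and function names are hypothetical, servers are plain strings, and SHA-256 is just one reasonable choice of hash for the IP-based variant.

```python
import hashlib
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation."""

    def __init__(self, servers: list[str]) -> None:
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Routes each request to the server with the fewest open connections."""

    def __init__(self, servers: list[str]) -> None:
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


def pick_by_ip_hash(client_ip: str, servers: list[str]) -> str:
    """Maps the same client IP to the same server (sticky sessions)."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rr = RoundRobinBalancer(["s1", "s2", "s3"])
print([rr.pick() for _ in range(4)])
```

Note that the IP-hash mapping changes for many clients whenever the server list changes, which is the same resharding problem that appears in database sharding.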
4. Database Scaling
Vertical Scaling: Bigger machine (CPU, RAM, SSD)
Pros: Simple, no code changes
Cons: Limits, single point of failure, expensive
Read Replicas: Primary handles writes, replicas handle reads
Pros: Read scalability, disaster recovery
Cons: Replication lag (eventual consistency)
Sharding (Horizontal Partitioning):
Split data across multiple DBs by key
Hash-based: shard = hash(user_id) % num_shards
Pros: Even distribution
Cons: Hard to add shards (resharding)
Range-based: shard by date range or ID range
Pros: Range queries efficient, easy to add new shards
Cons: Hot spots (most recent shard gets all writes)
Directory-based: lookup table maps keys to shards
Pros: Flexible
Cons: Lookup overhead, single point of failure
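The resharding cost of hash-based sharding is easier to appreciate with numbers. The sketch below, with hypothetical helper names, routes keys by `hash(user_id) % num_shards` and measures how many keys change shards when a fifth shard is added; it also shows a range-based lookup using assumed ID boundaries.

```python
import bisect
import hashlib


def shard_for(user_id: int, num_shards: int) -> int:
    """Hash-based routing: a stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(user_ids: range, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for uid in user_ids
        if shard_for(uid, old_shards) != shard_for(uid, new_shards)
    )
    return moved / len(user_ids)


def range_shard_for(user_id: int, upper_bounds: list[int]) -> int:
    """Range-based routing: upper_bounds[i] is the first ID of shard i + 1."""
    return bisect.bisect_right(upper_bounds, user_id)


fraction = moved_fraction(range(10_000), old_shards=4, new_shards=5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 shards")
```

With plain modulo hashing, roughly four out of five keys move in this scenario, which is why adding a shard is listed as a drawback. Range-based routing avoids that, since adding a shard only appends a new boundary.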
Case Study: Design URL Shortener (bit.ly)
Requirements
Functional: Shorten URL, redirect on visit, custom aliases, expiry
Non-functional: under 100 ms redirect (read) latency, 100M new URLs/day, 10B redirects/day
Scale Estimation
Writes: 100M/day ÷ 86,400 s ≈ 1,200 QPS (peak 3x ≈ 3,600)
Reads: 10B/day ÷ 86,400 s ≈ 115,000 QPS (peak 3x ≈ 345,000), so read-heavy at 100:1
Storage: 100M × 500 bytes = 50 GB/day; over 5 years, 50 GB × 365 × 5 ≈ 90 TB
High-Level Design
Client → Load Balancer → Web Servers → Cache (Redis)
↓ cache miss
Database (PostgreSQL + replicas)
URL Shortening API:
POST /api/shorten
{ "url": "https://very-long-url.com/path", "alias": "mylink", "expires_at": "2027-01-01" }
→ { "short_url": "https://bit.ly/abc123" }
Redirect API:
GET /{short_code}
→ 301/302 redirect to original URL
Key Generation:
Option 1: Base62-encode MD5(url) and take the first 7 chars (collisions are possible, so check the DB and retry on conflict)
Option 2: Counter + Base62 encode (guaranteed unique)
Option 3: Pre-generate keys in KGS (Key Generation Service)
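Option 2 is the easiest to demonstrate. Here is a minimal sketch of Base62 encoding for a numeric counter; the alphabet ordering (digits, then lowercase, then uppercase) is an assumption, since any fixed ordering of the 62 characters works as long as encode and decode agree.

```python
import string

# 62 characters: 0-9, a-z, A-Z (the ordering is an arbitrary but fixed choice)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(number: int) -> str:
    """Encodes a non-negative integer (e.g. a DB counter) as a short code."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode."""
    number = 0
    for char in code:
        number = number * 62 + ALPHABET.index(char)
    return number


print(base62_encode(123_456_789))
print(62 ** 7)  # number of distinct codes available with 7 characters
```

Seven characters give 62^7, about 3.5 trillion codes, which at 100M new URLs per day lasts for decades. The trade-off is that sequential counters produce guessable codes, which is one reason to consider a key generation service instead.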
Database Schema
CREATE TABLE urls (
    id           BIGSERIAL PRIMARY KEY,
    short_code   VARCHAR(10) UNIQUE NOT NULL,  -- UNIQUE already creates an index
    original_url TEXT NOT NULL,
    user_id      BIGINT,
    created_at   TIMESTAMP DEFAULT NOW(),
    expires_at   TIMESTAMP,                    -- NULL means the link never expires
    click_count  BIGINT DEFAULT 0
);
-- No separate index on short_code: in PostgreSQL the UNIQUE constraint provides one.
CREATE INDEX idx_user_id ON urls(user_id);
Case Study: Design a Chat System (WhatsApp)
Key Challenges
- Real-time delivery — WebSockets for persistent connections
- Message ordering — sequence numbers per conversation
- Offline delivery — store messages until user reconnects
- Read receipts — delivered/read status
- Group chat — fan-out to multiple users
Architecture:
Client ←→ WebSocket Gateway (stateful) ←→ Message Service
↓
Message Queue (Kafka)
↓
Message Store (Cassandra)
- Partitioned by conversation_id
- Ordered by a time-ordered message_id within each partition
Message Flow:
1. User A sends message → WebSocket Gateway (Server A)
2. Gateway publishes to Kafka topic
3. Message Service writes to Cassandra
4. The gateway server holding User B's WebSocket connection consumes the message and delivers it
5. If User B offline → push notification via FCM/APNs
Schema (Cassandra - wide column):
messages_by_conversation
conversation_id (partition key)
message_id (clustering key, time-ordered)
sender_id, content, message_type, created_at
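The partition and clustering keys above determine how messages are read back, for example when an offline user reconnects. The following in-memory sketch imitates that layout; `MessageStore`, `append`, and `since` are hypothetical names, and a per-conversation counter stands in for whatever time-ordered ID scheme the real system uses.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Message:
    conversation_id: str  # partition key
    message_id: int       # clustering key, increasing within a conversation
    sender_id: str
    content: str


class MessageStore:
    """Toy stand-in for the messages_by_conversation table."""

    def __init__(self) -> None:
        self._partitions: dict[str, list[Message]] = defaultdict(list)
        self._next_id = defaultdict(lambda: count(1))

    def append(self, conversation_id: str, sender_id: str, content: str) -> Message:
        # Sequence numbers are assigned per conversation, so ordering is
        # only guaranteed within a single conversation.
        message_id = next(self._next_id[conversation_id])
        message = Message(conversation_id, message_id, sender_id, content)
        self._partitions[conversation_id].append(message)
        return message

    def since(self, conversation_id: str, last_seen_id: int) -> list[Message]:
        """Messages a reconnecting client has not seen yet, in order."""
        partition = self._partitions.get(conversation_id, [])
        return [m for m in partition if m.message_id > last_seen_id]


store = MessageStore()
store.append("conv-1", "alice", "hi")
store.append("conv-1", "bob", "hello")
print([m.content for m in store.since("conv-1", last_seen_id=1)])
```

Because every read targets one conversation and a range of message IDs, the query stays inside a single partition, which is the access pattern this kind of schema is designed around.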
Key Trade-offs to Know
- SQL vs NoSQL — ACID vs scale, schema vs flexibility
- Consistency vs Availability — CAP theorem (prefer AP for social, CP for banking)
- Pull vs Push — fan-out on read vs fan-out on write for feeds
- Monolith vs Microservices — simplicity vs scalability/team autonomy
- Synchronous vs Async — latency vs throughput/resilience
- Strong vs Eventual Consistency — correctness vs availability
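The pull vs push trade-off is the one most often probed in feed design questions, so it helps to see both side by side. This is a toy sketch with hypothetical names (`FeedService`, `timeline_push`, `timeline_pull`), where an integer counter stands in for timestamps.

```python
from collections import defaultdict


class FeedService:
    """Toy feed supporting both fan-out on write and fan-out on read."""

    def __init__(self, followers: dict[str, list[str]]) -> None:
        self.followers = followers               # author -> their followers
        self.posts_by_author = defaultdict(list)
        self.inbox = defaultdict(list)           # precomputed timelines
        self._clock = 0                          # stand-in for timestamps

    def post(self, author: str, text: str) -> None:
        self._clock += 1
        entry = (self._clock, author, text)
        self.posts_by_author[author].append(entry)
        # Push (fan-out on write): copy the post into every follower's inbox.
        # Cost grows with follower count, which hurts for celebrity accounts.
        for follower in self.followers.get(author, []):
            self.inbox[follower].append(entry)

    def timeline_push(self, user: str) -> list[str]:
        # Reads are cheap: the timeline was assembled at write time.
        return [text for _, _, text in sorted(self.inbox[user], reverse=True)]

    def timeline_pull(self, user: str) -> list[str]:
        # Pull (fan-out on read): gather posts from followed authors on demand.
        followed = [a for a, fans in self.followers.items() if user in fans]
        entries = [e for a in followed for e in self.posts_by_author[a]]
        return [text for _, _, text in sorted(entries, reverse=True)]


feed = FeedService({"alice": ["carol"], "bob": ["carol"]})
feed.post("alice", "a1")
feed.post("bob", "b1")
print(feed.timeline_push("carol"), feed.timeline_pull("carol"))
```

Both methods return the same timeline; what differs is where the work happens. Push makes reads cheap and writes expensive, while pull does the reverse.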
Common System Design Topics (2026)
- URL Shortener
- Twitter/Social Feed
- WhatsApp/Messenger
- Instagram/Photo Sharing
- YouTube/Video Streaming
- Uber/Ride Sharing
- Airbnb/Booking System
- Google Search
- Rate Limiter
- Distributed Cache (Redis)
- Notification System
- Web Crawler
- Metrics/Monitoring (Prometheus)
- Distributed Message Queue (Kafka)
System design interviews reward clear communication, structured thinking, and awareness of trade-offs over perfect answers. Practice designing 2-3 systems from scratch each week, and always ask clarifying questions before drawing a single box.