System Design Mock Interview - A 45-Minute Senior Engineer Simulation
System design interviews separate senior engineers from everyone else. There's no single correct answer - interviewers evaluate your structured thinking, trade-off analysis, and ability to navigate ambiguity. This lesson is a complete 45-minute simulation.
🎯 The Problem: Design a Food Delivery System
"Design a food delivery platform like Uber Eats or Deliveroo that connects customers, restaurants, and delivery drivers."
The four core domains of a food delivery platform - each one is a bounded context with its own service.
Phase 1: Requirements Gathering (5 minutes)
Never start drawing boxes immediately. Ask clarifying questions first:
Functional requirements:
Customers browse restaurants, view menus, place orders, track delivery in real time
Restaurants manage menus, accept/reject orders, update preparation status
Drivers receive delivery requests, navigate to pickup/dropoff, confirm delivery
Real-time estimated delivery time (EDT) shown throughout the order lifecycle
Non-functional requirements:
Availability: 99.99% uptime (≈52 minutes downtime per year)
Scale: 50M monthly active users, 5M orders per day, 500K concurrent users at peak
Consistency: eventual consistency acceptable for search; strong consistency for payments
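These numbers translate directly into throughput targets worth stating out loud in the interview. A quick back-of-the-envelope check (the 10× peak factor and 100K concurrent drivers are assumptions for illustration, not from the requirements):

```python
# Back-of-the-envelope capacity estimation from the stated requirements.
ORDERS_PER_DAY = 5_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

avg_order_qps = ORDERS_PER_DAY / SECONDS_PER_DAY   # ~58 order writes/sec on average
peak_order_qps = avg_order_qps * 10                # assume a 10x evening peak

# Driver location updates dominate write volume: if ~100K drivers are
# online at peak, each reporting every 4 seconds (an assumption):
ACTIVE_DRIVERS = 100_000
location_updates_per_sec = ACTIVE_DRIVERS / 4      # 25,000 writes/sec

print(round(avg_order_qps), round(peak_order_qps), round(location_updates_per_sec))
```

Note that location telemetry outweighs order writes by two orders of magnitude, which is why it gets its own write path (Redis) later in the design.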
🧠 Quick Check
Why is eventual consistency acceptable for restaurant search but NOT for payment processing?
Phase 2: API Design (5 minutes)
Define the core contracts before architecture:
# Customer APIs
GET /api/v1/restaurants?lat={}&lng={}&cuisine={} → Restaurant[]
GET /api/v1/restaurants/{id}/menu → Menu
POST /api/v1/orders → Order
GET /api/v1/orders/{id}/track → OrderStatus + DriverLocation
# Restaurant APIs
PATCH /api/v1/restaurants/{id}/menu/items/{itemId} → MenuItem
POST /api/v1/orders/{id}/accept → Order
POST /api/v1/orders/{id}/ready → Order
# Driver APIs
POST /api/v1/drivers/{id}/location → void (fire-and-forget)
POST /api/v1/deliveries/{id}/accept → Delivery
POST /api/v1/deliveries/{id}/complete → Delivery
Key decision: use WebSockets for real-time driver tracking, REST for everything else.
Phase 3: High-Level Architecture (10 minutes)
Core Services
| Service | Responsibility | Database |
|---------|---------------|----------|
| User Service | Authentication, profiles, addresses | PostgreSQL |
| Restaurant Service | Menus, availability, hours | PostgreSQL + Elasticsearch |
| Order Service | Order lifecycle, state machine | PostgreSQL (strong consistency) |
| Driver Service | Driver location, availability, matching | Redis (location) + PostgreSQL |
| Payment Service | Charges, refunds, restaurant payouts | PostgreSQL (ACID required) |
| Notification Service | Push, SMS, email notifications | Message queue consumer |
Communication Patterns
Synchronous: API Gateway → individual services (REST/gRPC)
Asynchronous: Order events published to Kafka - consumed by Payment, Notification, and Analytics services
Real-time: Driver location updates via WebSocket → stored in Redis with geospatial indexing
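The asynchronous fan-out can be sketched with a minimal in-memory event bus standing in for Kafka: the Order Service publishes an event once, and each downstream service consumes it independently. Topic and event names here are illustrative.

```python
from collections import defaultdict

# Minimal in-memory stand-in for the Kafka topology. In production each
# subscriber would be a separate consumer group reading the same topic.
class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # Every subscriber receives its own copy of the event.
        for handler in self._subscribers[topic]:
            handler(event)

bus = EventBus()
handled = []
bus.subscribe("order.placed", lambda e: handled.append(("payment", e["order_id"])))
bus.subscribe("order.placed", lambda e: handled.append(("notification", e["order_id"])))
bus.subscribe("order.placed", lambda e: handled.append(("analytics", e["order_id"])))

bus.publish("order.placed", {"order_id": "o-123", "total": 24.50})
# One publish, three independent consumers - the Order Service never
# needs to know who is listening.
```

The decoupling is the point: adding a fourth consumer (say, fraud detection) requires no change to the Order Service.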
🤯
Uber processes over 1 million location updates per second globally. They built their own geospatial indexing system (H3) using hexagonal hierarchical grids because traditional latitude/longitude queries couldn't keep up at scale.
Phase 4: Deep Dive - Critical Components (15 minutes)
Driver Matching Algorithm
When an order is ready for pickup, the matching service must assign the optimal driver:
Implementation: query Redis GEOSEARCH (GEORADIUS on pre-6.2 Redis) for drivers within 5 km of the restaurant, score each candidate, and assign the highest-scoring available driver. If that driver declines, offer the delivery to the next candidate, with a 30-second timeout per offer.
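A minimal sketch of the radius filter and scoring step. The weights (w1–w4) and the feature set are assumptions for illustration; a production matcher would tune these continuously against real outcomes.

```python
import math

# Illustrative scoring weights (w1-w4): closer is better, higher rating and
# acceptance rate are better, an already-loaded driver is worse.
W_DISTANCE, W_RATING, W_ACCEPT, W_LOAD = -1.0, 0.5, 0.3, -0.8

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def score(driver, restaurant_lat, restaurant_lng):
    dist = haversine_km(driver["lat"], driver["lng"], restaurant_lat, restaurant_lng)
    return (W_DISTANCE * dist
            + W_RATING * driver["rating"]
            + W_ACCEPT * driver["acceptance_rate"]
            + W_LOAD * driver["active_deliveries"])

def match(drivers, restaurant_lat, restaurant_lng, radius_km=5.0):
    """Mimic the GEOSEARCH radius filter, then pick the best-scoring driver."""
    nearby = [d for d in drivers
              if haversine_km(d["lat"], d["lng"],
                              restaurant_lat, restaurant_lng) <= radius_km]
    return max(nearby,
               key=lambda d: score(d, restaurant_lat, restaurant_lng),
               default=None)
```

In production the radius filter runs inside Redis and only the scoring happens in the matching service; the split keeps the candidate set small before any application logic runs.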
Real-Time Order Tracking
Client ←→ WebSocket Gateway ←→ Driver Service
                                     ↑
                              Redis GEO Store
                              (GEOADD, GEOPOS)
Drivers publish GPS coordinates every 4 seconds. The WebSocket gateway subscribes to location updates for active orders and pushes them to the customer's device. Redis Pub/Sub handles fan-out efficiently.
Estimated Delivery Time
EDT is a multi-stage prediction:
EDT = restaurant_prep_time + driver_pickup_travel + dropoff_travel + buffer
Where:
- restaurant_prep_time → ML model trained on historical order data per restaurant
- driver_pickup_travel → distance/traffic API (Google Maps, Mapbox)
- dropoff_travel → same routing API
- buffer → dynamic, increases during rain/peak hours
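A minimal sketch of this composition, with a fixed per-restaurant prep average standing in for the ML model and an invented buffer multiplier standing in for the dynamic adjustment:

```python
# Sketch of the EDT formula above. All numbers and the 1.5x multipliers
# are illustrative assumptions, not production values.
def estimate_delivery_minutes(prep_min, pickup_travel_min, dropoff_travel_min,
                              is_raining=False, is_peak=False):
    buffer_min = 3.0                  # base buffer
    if is_raining:
        buffer_min *= 1.5             # weather slows drivers down
    if is_peak:
        buffer_min *= 1.5             # kitchens and roads are congested
    return prep_min + pickup_travel_min + dropoff_travel_min + buffer_min

# Same order on a quiet Tuesday lunch vs. a rainy Friday evening:
quiet = estimate_delivery_minutes(12, 6, 14)                   # 35.0 minutes
rainy_peak = estimate_delivery_minutes(12, 6, 14, True, True)  # 38.75 minutes
```

The structure matters more than the numbers: each term comes from a different subsystem, so EDT accuracy degrades gracefully when any one of them is stale.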
🤔
Think about it: The matching algorithm above uses fixed weights (w1–w4). How would you evolve this into a learning system that optimises for delivery speed AND driver satisfaction simultaneously?
Payment Processing
Payments follow a two-phase pattern:
Authorise - reserve funds when order is placed (hold, don't charge)
Capture - charge the actual amount after delivery confirmation
This handles order modifications, cancellations, and partial refunds gracefully. Use an idempotency key on every payment request to prevent double-charging during retries.
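The idempotency-key pattern can be sketched as follows; a real service would persist keys in the payments database with the charge record, and the key format here is illustrative.

```python
import uuid

# Sketch of idempotency-key handling on the capture step. The client
# generates one key per logical operation and reuses it on every retry.
class PaymentService:
    def __init__(self):
        self._processed = {}  # idempotency_key -> charge result

    def capture(self, idempotency_key, order_id, amount):
        if idempotency_key in self._processed:
            # Retry of an operation we already performed: replay the
            # stored result instead of charging the card again.
            return self._processed[idempotency_key]
        result = {"charge_id": str(uuid.uuid4()),
                  "order_id": order_id, "amount": amount}
        self._processed[idempotency_key] = result
        return result

svc = PaymentService()
key = "order-o123-capture"               # one key per logical operation
first = svc.capture(key, "o123", 24.50)
retry = svc.capture(key, "o123", 24.50)  # network retry after a timeout
assert first["charge_id"] == retry["charge_id"]  # charged exactly once
```

The key insight: retries are indistinguishable from duplicates at the network layer, so the server, not the client, must deduplicate.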
🧠 Quick Check
Why use a two-phase payment pattern (authorise then capture) instead of charging immediately?
Phase 5: Scaling and Bottlenecks (10 minutes)
Database Scaling
Order Service: shard by order_id - orders are independent, horizontal scaling is straightforward
Restaurant Service: read replicas for search queries, Elasticsearch for full-text and geospatial search
Driver Location: Redis Cluster with geospatial indexing - pure in-memory, sub-millisecond reads
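Sharding by order_id can be sketched with a stable hash; the shard count is illustrative, and a production system would layer consistent hashing or a directory service on top to allow resharding.

```python
import hashlib

# Hash-based shard routing for the Order Service.
NUM_SHARDS = 16

def shard_for(order_id: str) -> int:
    # Use a stable hash: Python's builtin hash() is salted per process,
    # so two services would disagree on routing.
    digest = hashlib.sha256(order_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Because order IDs hash uniformly, writes spread evenly across shards:
counts = [0] * NUM_SHARDS
for i in range(10_000):
    counts[shard_for(f"order-{i}")] += 1
```

This is exactly why order_id beats, say, restaurant_id as a shard key: a single popular restaurant would concentrate writes on one shard.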
Handling Peak Load (Surge)
At peak times (Friday 7pm, major sporting events), order volume can spike 10×:
Auto-scaling application pods based on queue depth, not CPU
Surge pricing to balance supply and demand - increase delivery fees when driver supply is low
Rate limiting per user to prevent abuse during peak
Circuit breakers on downstream services - graceful degradation over cascading failure
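A minimal circuit-breaker sketch, assuming a consecutive-failure threshold and a fixed reset window (both parameters are illustrative; libraries like resilience4j add half-open probing and metrics):

```python
import time

# After `threshold` consecutive failures the circuit opens and calls fail
# fast until `reset_after` seconds pass, protecting the struggling service.
class CircuitBreaker:
    def __init__(self, threshold=3, reset_after=30.0):
        self.threshold = threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # window elapsed: allow a trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success closes the circuit
        return result
```

Failing fast is the graceful-degradation half of the story: callers get an immediate error they can handle (queue the order, show a retry message) instead of piling up blocked threads behind a dying dependency.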
Single Points of Failure
| Component | Mitigation |
|-----------|-----------|
| API Gateway | Multiple instances behind a load balancer, health checks |
| Kafka | Multi-broker cluster with replication factor 3 |
| Redis | Redis Sentinel for automatic failover |
| Payment provider | Secondary provider with automatic fallback |
🧠 Quick Check
You're scaling the Order Service. Which sharding key provides the BEST write distribution?
🤔
Think about it: Your interviewer asks: "What happens when the payment service is down for 30 seconds during a peak period?" Walk through the exact failure mode, how orders are affected, and what recovery looks like.
💡 Interview Tips
Structure your time - don't spend 30 minutes on requirements
State trade-offs explicitly - "I chose X over Y because..."
Start simple, then scale - design for 1× load first, then address 100×
Draw as you talk - visual communication is half the evaluation
Acknowledge unknowns - "I'd need to benchmark this, but my hypothesis is..."
🎯 Key Takeaways
Always start with requirements - functional and non-functional
Design API contracts before architecture diagrams
Deep dive into 2–3 components, not all of them superficially
Every architectural choice involves trade-offs - name them