AI Masterpiece • Advanced • ⏱️ 25 min read


System Design Mock Interview - A 45-Minute Senior Engineer Simulation

System design interviews separate senior engineers from everyone else. There's no single correct answer - interviewers evaluate your structured thinking, trade-off analysis, and ability to navigate ambiguity. This lesson is a complete 45-minute simulation.

🎯 The Problem: Design a Food Delivery System

"Design a food delivery platform like Uber Eats or Deliveroo that connects customers, restaurants, and delivery drivers."

[Figure: High-level architecture of a food delivery system with user, restaurant, order, and driver services]
The four core domains of a food delivery platform - each one is a bounded context with its own service.

Phase 1: Requirements Gathering (5 minutes)

Never start drawing boxes immediately. Ask clarifying questions first:

Functional requirements:

  • Customers browse restaurants, view menus, place orders, track delivery in real time
  • Restaurants manage menus, accept/reject orders, update preparation status
  • Drivers receive delivery requests, navigate to pickup/dropoff, confirm delivery
  • Real-time estimated delivery time (EDT) shown throughout the order lifecycle

Non-functional requirements:

  • Availability: 99.99% uptime (≈52 minutes downtime per year)
  • Latency: search < 200ms, order placement < 500ms, location updates < 1s
  • Scale: 50M monthly active users, 5M orders per day, 500K concurrent users at peak
  • Consistency: eventual consistency acceptable for search; strong consistency for payments
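A quick back-of-envelope check helps turn these targets into numbers you can design against. The sketch below is only an illustration: the helper names are made up for this example, and the 10× surge multiplier is borrowed from the peak-load discussion in Phase 5.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60


def average_rate_per_second(events_per_day: int) -> float:
    """Average events per second, assuming traffic is spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


def downtime_budget_minutes(availability: float) -> float:
    """Allowed downtime per year, in minutes, for an availability target such as 0.9999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# 5M orders per day, taken from the scale requirement above.
orders_per_second = average_rate_per_second(5_000_000)

# Assumption: peak traffic is 10x the daily average (see Phase 5).
peak_orders_per_second = orders_per_second * 10

# 99.99% availability, taken from the availability requirement above.
yearly_downtime_minutes = downtime_budget_minutes(0.9999)

if __name__ == "__main__":
    print(f"average orders/s: {orders_per_second:.1f}")
    print(f"peak orders/s:    {peak_orders_per_second:.1f}")
    print(f"downtime budget:  {yearly_downtime_minutes:.1f} min/year")
```

The average write rate is modest (tens of orders per second), which suggests that order writes are unlikely to be the bottleneck; search reads and driver location updates deserve more of the scaling discussion.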
🧠 Quiz

Why is eventual consistency acceptable for restaurant search but NOT for payment processing?

Phase 2: API Design (5 minutes)

Define the core contracts before architecture:

# Customer APIs
GET    /api/v1/restaurants?lat={}&lng={}&cuisine={}  → Restaurant[]
GET    /api/v1/restaurants/{id}/menu                  → Menu
POST   /api/v1/orders                                 → Order
GET    /api/v1/orders/{id}/track                      → OrderStatus + DriverLocation

# Restaurant APIs
PATCH  /api/v1/restaurants/{id}/menu/items/{itemId}   → MenuItem
POST   /api/v1/orders/{id}/accept                     → Order
POST   /api/v1/orders/{id}/ready                      → Order

# Driver APIs
POST   /api/v1/drivers/{id}/location                  → void (fire-and-forget)
POST   /api/v1/deliveries/{id}/accept                 → Delivery
POST   /api/v1/deliveries/{id}/complete               → Delivery

Key decision: use WebSockets for real-time driver tracking, REST for everything else.

Phase 3: High-Level Architecture (10 minutes)

Core Services

| Service | Responsibility | Database |
|---------|---------------|----------|
| User Service | Authentication, profiles, addresses | PostgreSQL |
| Restaurant Service | Menus, availability, hours | PostgreSQL + Elasticsearch |
| Order Service | Order lifecycle, state machine | PostgreSQL (strong consistency) |
| Driver Service | Driver location, availability, matching | Redis (location) + PostgreSQL |
| Payment Service | Charges, refunds, restaurant payouts | PostgreSQL (ACID required) |
| Notification Service | Push, SMS, email notifications | Message queue consumer |
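The Order Service row mentions a state machine without spelling it out. One possible shape is sketched below; the state names and allowed transitions are assumptions inferred from the accept, ready, and complete endpoints in Phase 2, not a specification from this lesson.

```python
from enum import Enum


class OrderState(Enum):
    # Hypothetical states, inferred from the API endpoints in Phase 2.
    PLACED = "placed"
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    READY = "ready"
    PICKED_UP = "picked_up"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


# Each state maps to the set of states it may move to next.
ALLOWED_TRANSITIONS = {
    OrderState.PLACED: {OrderState.ACCEPTED, OrderState.REJECTED, OrderState.CANCELLED},
    OrderState.ACCEPTED: {OrderState.READY, OrderState.CANCELLED},
    OrderState.READY: {OrderState.PICKED_UP},
    OrderState.PICKED_UP: {OrderState.DELIVERED},
    # Terminal states have no outgoing transitions.
    OrderState.REJECTED: set(),
    OrderState.DELIVERED: set(),
    OrderState.CANCELLED: set(),
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not allowed."""


def transition(current: OrderState, target: OrderState) -> OrderState:
    """Return the new state, or raise if the move is not permitted."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target
```

Keeping the transition table as data makes it easy to discuss edge cases with the interviewer, for example whether a customer may still cancel once the restaurant has accepted.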

Communication Patterns

  • Synchronous: API Gateway → individual services (REST/gRPC)
  • Asynchronous: Order events published to Kafka - consumed by Payment, Notification, and Analytics services
  • Real-time: Driver location updates via WebSocket → stored in Redis with geospatial indexing
🤯
Uber processes over 1 million location updates per second globally. It built and open-sourced H3, a hexagonal hierarchical geospatial index that buckets coordinates into grid cells, so that supply and demand can be analysed per cell rather than per raw latitude/longitude point.

Phase 4: Deep Dive - Critical Components (15 minutes)

Driver Matching Algorithm

When an order is ready for pickup, the matching service must assign the optimal driver:

Score(driver) = w1 × (1 / distance_to_restaurant)
              + w2 × driver_rating
              + w3 × (1 / current_active_orders)
              + w4 × acceptance_rate

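Implementation: query Redis for drivers within 5 km of the restaurant (GEOSEARCH, which supersedes the deprecated GEORADIUS from Redis 6.2 onwards), score each candidate, and offer the delivery to the highest-scoring available driver. If that driver declines or does not respond within 30 seconds, offer it to the next candidate.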
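The scoring step can be sketched as follows. This is a minimal illustration, not a production matcher: the class names, the default weights, and the minimum-distance floor are all assumptions. It also deviates from the formula above in one respect, using 1 / (1 + active_orders) so that an idle driver does not cause a division by zero.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DriverCandidate:
    driver_id: str
    distance_km: float      # distance to the restaurant
    rating: float           # e.g. 0.0 to 5.0
    active_orders: int      # deliveries currently in progress
    acceptance_rate: float  # 0.0 to 1.0


@dataclass(frozen=True)
class Weights:
    # Hypothetical values; real weights would be tuned or learned.
    distance: float = 1.0
    rating: float = 0.2
    load: float = 0.5
    acceptance: float = 0.5


def score(driver: DriverCandidate, w: Weights, min_distance_km: float = 0.1) -> float:
    """Weighted score following the formula above, with guards against division by zero."""
    proximity = 1.0 / max(driver.distance_km, min_distance_km)
    load = 1.0 / (1 + driver.active_orders)
    return (
        w.distance * proximity
        + w.rating * driver.rating
        + w.load * load
        + w.acceptance * driver.acceptance_rate
    )


def rank_candidates(candidates: List[DriverCandidate], w: Weights = Weights()) -> List[DriverCandidate]:
    """Best candidate first; offers would be made in this order."""
    return sorted(candidates, key=lambda d: score(d, w), reverse=True)
```

Because the terms have different units (inverse kilometres, rating points, ratios), the weights also act as normalisers. That is worth saying out loud in the interview, since it is a natural lead-in to replacing fixed weights with a learned model.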

Real-Time Order Tracking

Client ←→ WebSocket Gateway ←→ Driver Service
                                    ↑
                              Redis GEO Store
                            (GEOADD, GEOPOS)

Drivers publish GPS coordinates every 4 seconds. The WebSocket gateway subscribes to location updates for active orders and pushes them to the customer's device. Redis Pub/Sub handles fan-out efficiently.

Estimated Delivery Time

EDT is a multi-stage prediction:

EDT = restaurant_prep_time + driver_pickup_travel + dropoff_travel + buffer

Where:
- restaurant_prep_time  → ML model trained on historical order data per restaurant
- driver_pickup_travel  → distance/traffic API (Google Maps, Mapbox)
- dropoff_travel        → same routing API
- buffer                → dynamic, increases during rain/peak hours
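As a rough sketch, the EDT sum could be composed like this. The input fields stand in for the outputs of the ML model and routing API described above, and the buffer amounts are made-up illustrative values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdtInputs:
    restaurant_prep_min: float        # would come from the per-restaurant ML model
    driver_pickup_travel_min: float   # would come from a routing/traffic API
    dropoff_travel_min: float         # would come from the same routing API
    is_raining: bool = False
    is_peak_hour: bool = False


def dynamic_buffer_min(
    inputs: EdtInputs,
    base_min: float = 3.0,        # assumed baseline buffer
    rain_extra_min: float = 4.0,  # assumed extra buffer in rain
    peak_extra_min: float = 3.0,  # assumed extra buffer at peak hours
) -> float:
    """Buffer that grows under conditions known to slow deliveries."""
    buffer = base_min
    if inputs.is_raining:
        buffer += rain_extra_min
    if inputs.is_peak_hour:
        buffer += peak_extra_min
    return buffer


def estimate_delivery_minutes(inputs: EdtInputs) -> float:
    """EDT = prep + pickup travel + dropoff travel + buffer, as in the formula above."""
    return (
        inputs.restaurant_prep_min
        + inputs.driver_pickup_travel_min
        + inputs.dropoff_travel_min
        + dynamic_buffer_min(inputs)
    )
```

The sum treats preparation and pickup travel as sequential, matching the formula. If a driver were dispatched while the food is still being prepared, the first two terms would overlap and taking their maximum would be a closer estimate.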
🤔
Think about it: The matching algorithm above uses fixed weights (w1–w4). How would you evolve this into a learning system that optimises for delivery speed AND driver satisfaction simultaneously?

Payment Processing

Payments follow a two-phase pattern:

  1. Authorise - reserve funds when order is placed (hold, don't charge)
  2. Capture - charge the actual amount after delivery confirmation

This handles order modifications, cancellations, and partial refunds gracefully. Use an idempotency key on every payment request to prevent double-charging during retries.
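To make the idempotency idea concrete, here is a small in-memory sketch of authorise-then-capture. `InMemoryPaymentGateway` is a hypothetical stand-in for a real payment provider, and a real system would persist the keys in a durable store rather than in a dictionary.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Authorisation:
    auth_id: str
    amount_cents: int                     # amount held on the card
    captured_cents: Optional[int] = None  # set once the charge is captured


class InMemoryPaymentGateway:
    """Hypothetical gateway that deduplicates requests by idempotency key."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Authorisation] = {}
        self._by_id: Dict[str, Authorisation] = {}

    def authorise(self, idempotency_key: str, amount_cents: int) -> Authorisation:
        # A retry with the same key returns the original result instead of
        # placing a second hold on the customer's funds.
        if idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        auth = Authorisation(auth_id=str(uuid.uuid4()), amount_cents=amount_cents)
        self._by_key[idempotency_key] = auth
        self._by_id[auth.auth_id] = auth
        return auth

    def capture(self, auth_id: str, amount_cents: int) -> Authorisation:
        auth = self._by_id[auth_id]
        if auth.captured_cents is not None:
            return auth  # already captured; a retry must not charge again
        if amount_cents > auth.amount_cents:
            raise ValueError("cannot capture more than the authorised amount")
        auth.captured_cents = amount_cents
        return auth
```

A natural choice of key is something derived from the order, so that every retry of the same logical request carries the same key. Capturing less than the authorised amount is how an item removed after ordering can be handled without a separate refund.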

🧠 Quiz

Why use a two-phase payment pattern (authorise then capture) instead of charging immediately?

Phase 5: Scaling and Bottlenecks (10 minutes)

Database Scaling

  • Order Service: shard by order_id - orders are independent, horizontal scaling is straightforward
  • Restaurant Service: read replicas for search queries, Elasticsearch for full-text and geospatial search
  • Driver Location: Redis Cluster with geospatial indexing - pure in-memory, sub-millisecond reads
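Sharding by order_id usually means hashing the ID and mapping the hash to a shard. The sketch below shows one simple way to do that; the function name, the `order-N` ID format, and the shard count are assumptions for illustration.

```python
import hashlib
from collections import Counter


def shard_for(order_id: str, num_shards: int) -> int:
    """Map an order ID to a shard index using a stable hash.

    A cryptographic hash is used here only because it is stable across
    processes and machines, unlike Python's built-in hash() for strings.
    """
    digest = hashlib.sha256(order_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_distribution(num_orders: int, num_shards: int) -> Counter:
    """Count how many synthetic order IDs land on each shard."""
    return Counter(shard_for(f"order-{i}", num_shards) for i in range(num_orders))


if __name__ == "__main__":
    for shard, count in sorted(shard_distribution(80_000, 8).items()):
        print(f"shard {shard}: {count} orders")
```

Plain modulo hashing spreads writes evenly but remaps most keys when the shard count changes; consistent hashing or a lookup directory are common ways to soften that, and naming this trade-off is a good interview move.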

Handling Peak Load (Surge)

At peak times (Friday 7pm, major sporting events), order volume can spike 10×:

  • Auto-scaling application pods based on queue depth, not CPU
  • Surge pricing to balance supply and demand - increase delivery fees when driver supply is low
  • Rate limiting per user to prevent abuse during peak
  • Circuit breakers on downstream services - graceful degradation over cascading failure
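The circuit breaker mentioned in the last bullet can be sketched in a few lines. This is a simplified, single-threaded illustration with assumed thresholds; production systems would normally use an established resilience library and add locking and metrics.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected without being tried."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,       # assumed: consecutive failures before opening
        reset_timeout_s: float = 30.0,    # assumed: wait before allowing a trial call
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout_s:
                # Open: fail fast instead of piling load onto a struggling service.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                # Trip the breaker, or re-open it after a failed trial call.
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping calls to a downstream dependency this way lets the caller catch `CircuitOpenError` and degrade gracefully, for example by queueing the request for later, rather than letting timeouts cascade upstream.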

Single Points of Failure

| Component | Mitigation |
|-----------|-----------|
| API Gateway | Multiple instances behind a load balancer, health checks |
| Kafka | Multi-broker cluster with replication factor 3 |
| Redis | Replicas with automatic failover (built into Redis Cluster; Redis Sentinel for non-clustered deployments) |
| Payment provider | Secondary provider with automatic fallback |

🧠 Quiz

You're scaling the Order Service. Which sharding key provides the BEST write distribution?

🤔
Think about it: Your interviewer asks: "What happens when the payment service is down for 30 seconds during a peak period?" Walk through the exact failure mode, how orders are affected, and what recovery looks like.

💡 Interview Tips

  1. Structure your time - don't spend 30 minutes on requirements
  2. State trade-offs explicitly - "I chose X over Y because..."
  3. Start simple, then scale - design for 1× load first, then address 100×
  4. Draw as you talk - visual communication is half the evaluation
  5. Acknowledge unknowns - "I'd need to benchmark this, but my hypothesis is..."

🎯 Key Takeaways

  • Always start with requirements - functional and non-functional
  • Design API contracts before architecture diagrams
  • Deep dive into 2–3 components, not all of them superficially
  • Every architectural choice involves trade-offs - name them
  • Practise with a timer: 45 minutes, no extensions

📚 Further Reading

  • System Design Interview by Alex Xu - The most practical system design interview preparation book
  • Uber Engineering Blog - Deep dives into real-world systems at massive scale
  • Designing Data-Intensive Applications by Martin Kleppmann - The definitive reference on distributed systems fundamentals