System Design: Order Management System
System design of an order management system handling the full order lifecycle from placement through fulfillment, including distributed saga orchestration and multi-warehouse routing.
Requirements
Functional Requirements:
- Accept orders from multiple channels: web, mobile, POS, marketplace integrations
- Orchestrate the order lifecycle: placed → confirmed → processing → shipped → delivered
- Split orders across multiple warehouses based on inventory location
- Handle payment capture, partial fulfillment, and partial refunds
- Support order modifications and cancellations before shipment
- Generate invoices and integrate with shipping carriers for tracking
Non-Functional Requirements:
- Process 10,000 orders/sec at peak (e.g., flash sales, holiday events)
- Order placement to confirmation latency under 2 seconds
- 99.99% availability for order placement; 99.9% for order status queries
- Exactly-once order processing — no duplicate charges or fulfillments
- Audit trail for every state transition with immutable event log
Scale Estimation
5M orders/day average, peaking at 10K/sec during events. Each order has ~3 line items averaging 500 bytes metadata = 1.5KB per order. Daily storage: 5M × 1.5KB = 7.5GB. Annual: ~2.7TB. The event log (every state transition) generates 10 events per order lifecycle × 5M orders = 50M events/day at 200 bytes each = 10GB/day. Order status queries: 20M queries/day = 231 QPS (users checking 'where is my order').
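The back-of-envelope arithmetic above can be reproduced with a few lines of Python (all inputs are the figures stated in this section):

```python
SECONDS_PER_DAY = 24 * 60 * 60

orders_per_day = 5_000_000
order_size_bytes = 3 * 500            # ~3 line items x 500 bytes of metadata
daily_order_storage_gb = orders_per_day * order_size_bytes / 1e9   # 7.5 GB/day
annual_storage_tb = daily_order_storage_gb * 365 / 1e3             # ~2.7 TB/year

events_per_day = 10 * orders_per_day  # 10 lifecycle events per order
daily_event_gb = events_per_day * 200 / 1e9                        # 10 GB/day

status_qps = 20_000_000 / SECONDS_PER_DAY                          # ~231 QPS
```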
High-Level Architecture
The OMS follows an event-driven architecture with a saga orchestrator at its core. The order placement flow: Cart Service → Order API → Order Service (validates cart, calculates totals) → Saga Orchestrator. The Saga Orchestrator drives a multi-step distributed transaction: (1) Inventory Reservation — reserve stock via the Inventory Service; (2) Payment Authorization — authorize payment via the Payment Service; (3) Fraud Check — score the order via the Fraud Service; (4) Order Confirmation — write the confirmed order to the Order Store. Each step is idempotent with compensating actions: if payment fails, inventory is released; if fraud check fails, payment authorization is voided.
The fulfillment path is triggered by an order.confirmed event on Kafka: the Fulfillment Router determines the optimal warehouse(s) based on item availability and proximity to the shipping address. For multi-warehouse orders, the router creates sub-orders (shipments) each assigned to a warehouse. Warehouse management systems (WMS) consume shipment events, pick/pack items, and emit shipment.shipped events with carrier tracking numbers. A Tracking Aggregator polls carrier APIs and updates the order status.
All state transitions are recorded as immutable events in an event store (Kafka + PostgreSQL), enabling full audit trails and event sourcing for order reconstruction.
Core Components
Saga Orchestrator
The Saga Orchestrator is implemented as a state machine persisted in PostgreSQL. Each order creates a saga record with fields: saga_id, order_id, current_step, status, step_results JSONB, created_at, updated_at. The orchestrator advances through steps sequentially, recording the result of each step. On failure, it executes compensating transactions in reverse order. Idempotency is enforced via idempotency keys on each step — retrying a step with the same key returns the cached result. The orchestrator uses a polling pattern with a 'stuck saga detector' that alerts on sagas stalled for >5 minutes.
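A minimal sketch of the orchestration loop described above, assuming an in-memory stand-in for the PostgreSQL saga record and illustrative step names; in a real deployment each action would call the Inventory, Payment, or Fraud service:

```python
from dataclasses import dataclass, field

@dataclass
class Saga:
    saga_id: str
    order_id: str
    completed: list = field(default_factory=list)  # (step name, compensator) pairs
    results: dict = field(default_factory=dict)    # idempotency cache: key -> result
    status: str = "running"

def run_saga(saga, steps):
    """steps: ordered list of (name, action, compensate) tuples.

    Advances sequentially, caching each step's result under an idempotency
    key; on failure, runs compensating actions in reverse order.
    """
    for name, action, compensate in steps:
        key = f"{saga.saga_id}:{name}"       # per-step idempotency key
        if key in saga.results:              # a retry returns the cached result
            continue
        try:
            saga.results[key] = action(saga.order_id)
            saga.completed.append((name, compensate))
        except Exception:
            # Compensating transactions in reverse order of completion
            for _done_name, comp in reversed(saga.completed):
                comp(saga.order_id)
            saga.status = "compensated"
            return saga
    saga.status = "completed"
    return saga
```

In this sketch a failed payment authorization triggers the inventory-release compensator, matching the failure path described above.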
Fulfillment Router
The Fulfillment Router determines optimal warehouse assignment for each order. The routing algorithm: (1) Find all warehouses with sufficient stock for each line item; (2) Score each warehouse by: distance to shipping address (40% weight), current utilization/backlog (30%), shipping cost (20%), and carrier availability (10%); (3) If no single warehouse has all items, split into sub-orders minimizing the number of shipments. The router uses a cached inventory snapshot (refreshed every 30 seconds from the Inventory Service) for fast decisions, with a validation call before committing the assignment.
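The scoring step can be sketched as follows. The weights are the ones above; the field names and the assumption that each factor is pre-normalized to [0, 1] (lower is better) are illustrative:

```python
def score_warehouse(wh):
    """Weighted blend of normalized factors; lower scores are better."""
    return (0.40 * wh["distance"]
            + 0.30 * wh["utilization"]
            + 0.20 * wh["shipping_cost"]
            + 0.10 * wh["carrier_unavailability"])

def route(order_items, warehouses):
    """Pick the best single warehouse that stocks every line item in full.

    order_items: {product_id: quantity}. Returns a warehouse id, or None to
    signal that the order must be split into sub-orders.
    """
    eligible = [w for w in warehouses
                if all(w["stock"].get(item, 0) >= qty
                       for item, qty in order_items.items())]
    if eligible:
        return min(eligible, key=score_warehouse)["id"]
    return None  # no single warehouse has everything: split, minimizing shipments
```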
Order Event Store
Every order state transition emits an event to Kafka topic order-events and is persisted to a PostgreSQL events table: event_id, order_id, event_type (order.placed, order.confirmed, shipment.shipped, etc.), payload JSONB, timestamp. This append-only event log enables: (1) Full audit trail for compliance; (2) Event sourcing — reconstructing current order state by replaying events; (3) Analytics — downstream consumers (ClickHouse) build order funnel metrics. Events are retained for 7 years for regulatory compliance.
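State reconstruction by replay can be sketched as a fold over the append-only log. The event-type-to-status mapping below is an assumption based on the lifecycle described earlier:

```python
# Hypothetical mapping from event types to order status values
TRANSITIONS = {
    "order.placed": "placed",
    "order.confirmed": "confirmed",
    "order.processing": "processing",
    "shipment.shipped": "shipped",
    "shipment.delivered": "delivered",
    "order.cancelled": "cancelled",
}

def replay(events):
    """Rebuild the current order state by replaying events in timestamp order."""
    state = {"status": None, "history": []}
    for e in sorted(events, key=lambda e: e["timestamp"]):
        status = TRANSITIONS.get(e["event_type"])
        if status:
            state["status"] = status
            state["history"].append((e["timestamp"], status))
    return state
```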
Database Design
PostgreSQL schema: orders table (order_id UUID PK, customer_id, channel, status ENUM, total_amount, currency, shipping_address JSONB, billing_address JSONB, created_at, updated_at). order_items table (item_id, order_id FK, product_id, variant_id, quantity, unit_price, status). shipments table (shipment_id, order_id FK, warehouse_id, carrier, tracking_number, status, shipped_at, delivered_at). payments table (payment_id, order_id FK, amount, method, authorization_code, status, captured_at).
The orders table is partitioned by created_at (monthly partitions) to manage the growing dataset. Older partitions are moved to cold storage (S3 via pg_dump) after 2 years. Indexes: composite index on (customer_id, created_at DESC) for order history, index on status for operational queries (e.g., find all 'processing' orders).
API Design
- POST /api/v1/orders — Place an order; body contains cart_id, shipping_address, payment_method; returns order_id and saga status
- GET /api/v1/orders/{order_id} — Fetch order details with line items, shipments, and current status
- POST /api/v1/orders/{order_id}/cancel — Request cancellation; succeeds only if the order is in a cancellable state (before shipment)
- GET /api/v1/orders/{order_id}/events — Fetch the full event history for an order (audit trail)
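The cancellation rule ("before shipment") amounts to a state check. A minimal sketch, assuming the set of cancellable states inferred from the lifecycle above:

```python
# Assumed cancellable states: everything before a shipment exists
CANCELLABLE = {"placed", "confirmed", "processing"}

def cancel_order(order):
    """Handle POST /api/v1/orders/{order_id}/cancel; returns (http_status, body)."""
    if order["status"] in CANCELLABLE:
        order["status"] = "cancelled"
        return 200, {"order_id": order["order_id"], "status": "cancelled"}
    # 409 Conflict: the order has progressed past the cancellable window
    return 409, {"error": f"order not cancellable in state '{order['status']}'"}
```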
Scaling & Bottlenecks
The saga orchestrator is the throughput bottleneck since each order requires sequential step execution. Scaling strategies: (1) Horizontal scaling — saga workers are stateless and pull work from a Kafka partition; order_id-based partitioning ensures a single order's saga runs on one worker; (2) Step parallelization where possible — fraud check and payment authorization can run concurrently after inventory reservation; (3) Fast-path optimization — repeat customers with verified payment methods skip fraud check, reducing saga from 4 steps to 3.
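Step parallelization (strategy 2) can be sketched with asyncio; the coroutines below are stand-ins for the real Payment and Fraud service calls:

```python
import asyncio

async def authorize_payment(order_id):
    await asyncio.sleep(0.01)   # stand-in for the Payment Service call
    return "auth_ok"

async def fraud_check(order_id):
    await asyncio.sleep(0.01)   # stand-in for the Fraud Service call
    return "low_risk"

async def confirm(order_id):
    # After inventory reservation, payment authorization and the fraud check
    # are independent, so they run concurrently rather than sequentially.
    payment, fraud = await asyncio.gather(
        authorize_payment(order_id), fraud_check(order_id)
    )
    return payment == "auth_ok" and fraud == "low_risk"
```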
Database write throughput during peak order placement (10K/sec) requires PostgreSQL optimizations: batch inserts (group 100 orders into a single transaction), connection pooling via PgBouncer (200 connections), and synchronous replication to a single standby (async to others) for durability without excessive write latency. The event store Kafka topic uses 64 partitions to parallelize downstream consumption.
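The batching idea can be sketched with a small accumulator; `flush_fn` is a stand-in for the real multi-row INSERT committed as one transaction:

```python
class OrderBatcher:
    """Groups incoming orders and flushes them in batches of up to `batch_size`
    (100 in the text), so 10K orders/sec becomes ~100 transactions/sec."""

    def __init__(self, flush_fn, batch_size=100):
        self.flush_fn = flush_fn
        self.batch_size = batch_size
        self.pending = []

    def add(self, order):
        self.pending.append(order)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(self.pending)  # e.g. one multi-row INSERT + COMMIT
            self.pending = []
```

A real implementation would also flush on a short timer so a trickle of orders is never held back waiting for a full batch.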
Key Trade-offs
- Saga over 2PC: Saga allows independent service scaling and tolerates long-running transactions (fraud checks can take seconds), but requires careful design of compensating transactions and idempotency
- Event sourcing for orders: Full audit trail and state reconstruction capability, but increases storage and adds complexity in querying current state (mitigated by materialized views)
- 30-second inventory snapshot in fulfillment router: Fast routing decisions at the cost of occasional stale inventory — validation before commitment catches most errors
- Monthly table partitioning: Manageable partition sizes for archival, but cross-partition queries (e.g., customer order history spanning years) require partition pruning or a separate read model