
System Design: Restaurant Management System

Design a scalable restaurant management system handling menu management, order orchestration, kitchen display systems, inventory tracking, and POS integration for multi-location chains.

16 min read · Updated Jan 15, 2025
Tags: system-design, restaurant-management, food-delivery, pos-integration

Requirements

Functional Requirements:

  • Manage restaurant menus with categories, items, modifiers, and pricing across multiple locations
  • Receive and process orders from multiple channels (in-store POS, website, mobile app, third-party delivery platforms)
  • Kitchen Display System (KDS) showing order queue with priority and timing
  • Real-time inventory tracking with automatic 86ing (marking items unavailable) when stock runs out
  • Table management and reservation system for dine-in
  • Reporting dashboard with sales analytics, item popularity, and peak hours analysis

Non-Functional Requirements:

  • Support 500K restaurants with up to 500 locations each
  • Order processing latency under 500ms from placement to KDS display
  • Offline-capable POS that syncs when connectivity resumes
  • 99.99% availability for order processing; 99.9% for analytics
  • Multi-tenant architecture with data isolation between restaurant chains

Scale Estimation

With 500K restaurants averaging 200 orders/day each, that is 100M orders/day ≈ 1,160 orders/sec. During lunch and dinner rushes (4 hours combined), traffic concentrates to 60% of daily volume = 60M orders in 14,400 seconds ≈ 4,200 orders/sec peak. Each order contains 3.5 items on average with 2 modifiers each = 350M line items/day (roughly 700M modifier selections). Menu management: 500K restaurants × 80 items × 10 modifiers = 400M modifier configurations. Inventory updates: each order triggers 3-4 inventory decrements = up to 400M inventory writes/day.
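These estimates are easy to sanity-check with a few lines of arithmetic; the inputs are exactly the assumptions stated above:

```python
# Back-of-envelope scale estimation using the assumptions above.
RESTAURANTS = 500_000
ORDERS_PER_DAY_EACH = 200
RUSH_SHARE = 0.60            # share of daily volume during lunch + dinner rushes
RUSH_SECONDS = 4 * 3600      # 4 hours of combined rush
ITEMS_PER_ORDER = 3.5
DECREMENTS_PER_ORDER = 4     # upper end of 3-4 inventory decrements per order

orders_per_day = RESTAURANTS * ORDERS_PER_DAY_EACH            # 100M
avg_ops = orders_per_day / 86_400                             # ~1,160 orders/sec
peak_ops = orders_per_day * RUSH_SHARE / RUSH_SECONDS         # ~4,200 orders/sec
line_items = orders_per_day * ITEMS_PER_ORDER                 # 350M/day
inventory_writes = orders_per_day * DECREMENTS_PER_ORDER      # ~400M/day

print(f"{avg_ops:,.0f} avg orders/sec, {peak_ops:,.0f} peak orders/sec")
print(f"{line_items:,.0f} line items/day, {inventory_writes:,} inventory writes/day")
```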

High-Level Architecture

The restaurant management system follows a multi-tenant SaaS architecture with tenant isolation at the database level (schema-per-tenant on shared PostgreSQL clusters for small restaurants, dedicated instances for enterprise chains). The architecture has three primary planes: the Control Plane handles tenant provisioning, configuration, and menu management; the Data Plane handles real-time order processing and kitchen operations; the Analytics Plane handles reporting and business intelligence.

The Control Plane exposes a REST API consumed by the restaurant admin dashboard (React web app). Menu changes are versioned using event sourcing — every menu modification creates an immutable event, allowing rollback and audit trails. Menu data is published to a read-optimized store (Redis + CDN) consumed by customer-facing ordering channels. The Data Plane uses an event-driven architecture: orders arrive from multiple channels through a Channel Adapter layer (which normalizes the different formats from UberEats, DoorDash, and in-store POS into a canonical order schema) and flow into a central Order Processor that validates, prices, and routes each order to the appropriate kitchen station.
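A minimal sketch of what the canonical order schema and one channel adapter could look like. The field names and the shape of the incoming webhook payload are illustrative assumptions, not the actual third-party formats:

```python
from dataclasses import dataclass, field

# Canonical order schema that every channel adapter normalizes into (illustrative fields).
@dataclass
class OrderLine:
    item_id: str
    quantity: int
    modifier_ids: list[str] = field(default_factory=list)
    special_instructions: str = ""

@dataclass
class CanonicalOrder:
    tenant_id: str
    location_id: str
    channel: str                        # "pos" | "web" | "mobile" | "ubereats" | "doordash"
    order_type: str                     # "dine_in" | "takeout" | "delivery"
    lines: list[OrderLine]
    pickup_deadline: str | None = None  # ISO timestamp the driver needs the order by

# Hypothetical adapter: maps a delivery platform's webhook payload into the canonical schema.
def adapt_delivery_webhook(payload: dict, tenant_id: str, location_id: str) -> CanonicalOrder:
    lines = [
        OrderLine(
            item_id=i["external_item_id"],
            quantity=i.get("qty", 1),
            modifier_ids=[m["id"] for m in i.get("modifiers", [])],
        )
        for i in payload["items"]
    ]
    return CanonicalOrder(
        tenant_id=tenant_id,
        location_id=location_id,
        channel=payload["source"],       # e.g. "doordash"
        order_type="delivery",
        lines=lines,
        pickup_deadline=payload.get("pickup_by"),
    )
```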

The offline POS capability is critical. In-store POS terminals run a local SQLite database that can accept orders, process payments (via stored payment terminal connection), and print receipts without internet connectivity. When connectivity resumes, a sync protocol reconciles local orders with the cloud backend using conflict-free replicated data types (CRDTs) for inventory counters and last-write-wins for order status.
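For the inventory counters, one common CRDT choice is a PN-counter: each replica (POS terminal or cloud) keeps its own increment and decrement tallies, and merging takes the per-replica maximum so a replayed sync never double-counts. A minimal sketch, assuming this counter design rather than the system's actual sync protocol:

```python
from dataclasses import dataclass, field

# PN-counter CRDT sketch: each replica tracks its own increments and decrements;
# merge takes the per-replica maximum, so repeating a sync never double-counts.
@dataclass
class PNCounter:
    incs: dict[str, int] = field(default_factory=dict)   # replica_id -> total increments
    decs: dict[str, int] = field(default_factory=dict)   # replica_id -> total decrements

    def increment(self, replica_id: str, n: int = 1) -> None:
        self.incs[replica_id] = self.incs.get(replica_id, 0) + n

    def decrement(self, replica_id: str, n: int = 1) -> None:
        self.decs[replica_id] = self.decs.get(replica_id, 0) + n

    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other: "PNCounter") -> None:
        for r, v in other.incs.items():
            self.incs[r] = max(self.incs.get(r, 0), v)
        for r, v in other.decs.items():
            self.decs[r] = max(self.decs.get(r, 0), v)

# Usage: an offline terminal decrements locally, then merges with the cloud copy on reconnect.
cloud, pos = PNCounter(), PNCounter()
cloud.increment("stock-delivery", 100)      # stock received while the terminal was offline
pos.merge(cloud)
pos.decrement("pos-terminal-1", 3)          # sold while offline
cloud.merge(pos)
assert cloud.value() == 97
```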

Core Components

Menu Management Engine

Menus are modeled as a hierarchical structure: Restaurant → Location → Menu (Breakfast/Lunch/Dinner) → Category → Item → Modifier Group → Modifier. Each level supports inheritance and overrides — a chain can define a base menu at the restaurant level that individual locations override (e.g., different pricing by region, location-specific seasonal items). Menu changes are versioned and published through a pipeline: admin edits in the dashboard → validation service checks for consistency (e.g., no orphaned modifier references) → approved changes are atomically applied to the canonical menu store (PostgreSQL) and propagated to all consuming channels via a Kafka event.
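One way the base-menu inheritance and location overrides could be resolved at publish time; the flat item dictionaries below are a simplification of the real menu tree:

```python
# Simplified sketch of base-menu plus per-location override resolution.
# A real menu tree also carries categories and modifier groups; here each item is a flat dict.
def resolve_menu(base_items: dict, location_overrides: dict) -> dict:
    resolved = {}
    for item_id, item in base_items.items():
        override = location_overrides.get(item_id, {})
        if override.get("removed"):
            continue                                  # this location opted out of the item
        resolved[item_id] = {**item, **override}      # override wins field by field
    # Location-only items (e.g. seasonal specials) that are not in the base menu.
    for item_id, item in location_overrides.items():
        if item_id not in base_items and not item.get("removed"):
            resolved[item_id] = item
    return resolved

base = {"burger": {"name": "Cheeseburger", "price": 8.50}}
nyc = {"burger": {"price": 10.00},
       "lobster_roll": {"name": "Lobster Roll", "price": 24.00}}
print(resolve_menu(base, nyc))
# {'burger': {'name': 'Cheeseburger', 'price': 10.0},
#  'lobster_roll': {'name': 'Lobster Roll', 'price': 24.0}}
```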

Kitchen Display System (KDS)

The KDS is a real-time order queue displayed on screens in the kitchen. Orders are broken into preparation tickets routed to the appropriate kitchen station (grill, fry, salad, drinks) based on item-to-station mapping configured by the restaurant. Each ticket shows items, modifiers, special instructions, order type (dine-in, takeout, delivery), and a countdown timer based on target prep time. The KDS connects to the backend via WebSocket and receives order events from a KDS Router Service. Kitchen staff bump (mark complete) items on the touchscreen, which triggers status updates back through the system. A priority algorithm surfaces delivery orders that need to be ready by a specific pickup time, ensuring drivers are not kept waiting.
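A rough sketch of the priority ordering described above. The five-minute pickup urgency window and the two-bucket weighting are assumptions for illustration, not the production algorithm:

```python
import time

# Priority sketch: delivery tickets whose pickup time is close jump the queue;
# everything else is ordered by how far past its target prep time it is.
def ticket_sort_key(ticket: dict, now: float) -> tuple:
    elapsed = now - ticket["placed_at"]
    behind_by = elapsed - ticket["target_prep_seconds"]       # > 0 means already late
    if ticket["order_type"] == "delivery" and ticket.get("pickup_at"):
        slack = ticket["pickup_at"] - now                     # seconds until the driver arrives
        if slack < 300:                                       # assumed 5-minute urgency window
            return (0, slack)                                 # bucket 0: least slack first
    return (1, -behind_by)                                    # bucket 1: most overdue first

def prioritized_queue(tickets: list[dict]) -> list[dict]:
    now = time.time()
    return sorted(tickets, key=lambda t: ticket_sort_key(t, now))
```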

Inventory Tracking Service

Inventory is tracked at the ingredient level rather than the menu item level. Each menu item has a Bill of Materials (BOM) mapping — a cheeseburger consumes 1 patty, 1 bun, 1 slice of cheese, 2 pickle slices, etc. When an order is placed, the Inventory Service atomically decrements all constituent ingredients. When any ingredient falls below its configured threshold, the system automatically 86s all menu items that depend on it across all ordering channels (in-store, delivery platforms). Inventory counts use Redis atomic counters (DECRBY) for real-time performance, with periodic persistence to PostgreSQL. End-of-day reconciliation compares actual physical counts entered by staff against the system's calculated counts to detect waste and theft.
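A simplified sketch of the BOM decrement and automatic 86ing flow using redis-py. The BOM, DEPENDENTS, and THRESHOLD mappings and the availability channel name are illustrative assumptions; the atomic Lua variant is covered under Scaling & Bottlenecks:

```python
import redis  # assumes redis-py is available

r = redis.Redis()

# Illustrative mappings (the real ones live in the menu store / PostgreSQL):
BOM = {"cheeseburger": {"patty": 1, "bun": 1, "cheese_slice": 1, "pickle_slice": 2}}
DEPENDENTS = {"patty": ["cheeseburger"], "bun": ["cheeseburger"],
              "cheese_slice": ["cheeseburger"], "pickle_slice": ["cheeseburger"]}
THRESHOLD = {"patty": 10, "bun": 10, "cheese_slice": 10, "pickle_slice": 20}

def record_sale(location_id: str, item_id: str, qty: int = 1) -> None:
    """Decrement every ingredient in the item's BOM; 86 dependent items when an
    ingredient falls below its threshold. Non-atomic sketch for clarity."""
    for ingredient, per_item in BOM[item_id].items():
        key = f"inv:{location_id}:{ingredient}"
        remaining = r.decrby(key, per_item * qty)
        if remaining < THRESHOLD[ingredient]:
            for dependent in DEPENDENTS[ingredient]:
                # Publish an availability event so every ordering channel 86s the item.
                r.publish("availability", f"{location_id}:{dependent}:86")
```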

Database Design

The primary data store is PostgreSQL with a schema-per-tenant model for data isolation. Each tenant schema contains tables for: menus (menu_id, location_id, name, active_hours), categories (category_id, menu_id, name, sort_order), items (item_id, category_id, name, description, base_price, prep_time_minutes, station), modifier_groups (group_id, item_id, name, min_selections, max_selections), and modifiers (modifier_id, group_id, name, price_delta). Orders use a separate orders table with order_id, location_id, channel (enum), status, items_json, subtotal, tax, total, timestamps.
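A condensed sketch of two of the per-tenant tables, provisioned inside a tenant's schema via psycopg. Column names follow the prose above; the exact types, defaults, and constraints are assumptions:

```python
import psycopg  # assumes psycopg 3

# Foreign keys between tenant tables are omitted to keep the sketch short.
TENANT_DDL = [
    """
    CREATE TABLE IF NOT EXISTS items (
        item_id            BIGSERIAL PRIMARY KEY,
        category_id        BIGINT NOT NULL,
        name               TEXT NOT NULL,
        description        TEXT,
        base_price         NUMERIC(10,2) NOT NULL,
        prep_time_minutes  INT NOT NULL DEFAULT 10,
        station            TEXT NOT NULL
    )
    """,
    """
    CREATE TABLE IF NOT EXISTS orders (
        order_id     BIGSERIAL PRIMARY KEY,
        location_id  BIGINT NOT NULL,
        channel      TEXT NOT NULL,
        status       TEXT NOT NULL DEFAULT 'received',
        items_json   JSONB NOT NULL,
        subtotal     NUMERIC(10,2) NOT NULL,
        tax          NUMERIC(10,2) NOT NULL,
        total        NUMERIC(10,2) NOT NULL,
        created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
        updated_at   TIMESTAMPTZ NOT NULL DEFAULT now()
    )
    """,
]

def provision_tenant_tables(conn: psycopg.Connection, tenant_schema: str) -> None:
    # tenant_schema comes from the provisioning service, not from user input.
    with conn.cursor() as cur:
        cur.execute(f'SET search_path TO "{tenant_schema}"')
        for stmt in TENANT_DDL:
            cur.execute(stmt)
    conn.commit()
```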

For multi-location chains requiring cross-location analytics, a denormalized read replica in ClickHouse aggregates data across all locations. The ClickHouse schema uses a single flat table with tenant_id, location_id, order_id, item_id, quantity, revenue, timestamp — optimized for analytical queries like "top selling items across all locations this week." CDC from PostgreSQL to ClickHouse runs via Debezium with sub-minute latency.
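An example of the kind of analytical query the flat table is optimized for, issued through the clickhouse-connect client. The table name order_lines, the host, and the client setup are assumptions:

```python
import clickhouse_connect  # assumes the clickhouse-connect client

client = clickhouse_connect.get_client(host="analytics.internal")

# "Top selling items across all locations this week" over the flat order_lines table.
TOP_ITEMS_SQL = """
SELECT item_id,
       sum(quantity) AS units_sold,
       sum(revenue)  AS total_revenue
FROM order_lines
WHERE tenant_id = {tenant_id:String}
  AND timestamp >= toStartOfWeek(now())
GROUP BY item_id
ORDER BY units_sold DESC
LIMIT 20
"""

rows = client.query(TOP_ITEMS_SQL, parameters={"tenant_id": "burger-chain-42"}).result_rows
```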

API Design

  • PUT /api/v1/restaurants/{restaurant_id}/locations/{location_id}/menus/{menu_id} — Update menu structure; body contains the full menu tree (categories, items, modifiers); versioned with If-Match ETag header
  • POST /api/v1/orders/ingest — Channel-agnostic order ingestion endpoint; body contains channel_id, canonical order payload; returns order_id and estimated prep time (see the example request after this list)
  • GET /api/v1/locations/{location_id}/kds/stream — WebSocket endpoint for KDS; streams order tickets and receives bump events
  • GET /api/v1/locations/{location_id}/inventory?items={item_ids} — Check real-time availability for specific menu items
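A sample call to the ingestion endpoint; the host, auth header, and payload fields are illustrative assumptions that mirror the canonical schema sketched earlier:

```python
import requests

# Example request to the channel-agnostic order ingestion endpoint.
resp = requests.post(
    "https://api.example.com/api/v1/orders/ingest",
    json={
        "channel_id": "doordash",
        "order": {
            "location_id": "loc_8821",
            "order_type": "delivery",
            "lines": [
                {"item_id": "cheeseburger", "quantity": 2, "modifier_ids": ["no_pickles"]},
            ],
            "pickup_deadline": "2025-01-15T12:45:00Z",
        },
    },
    headers={"Authorization": "Bearer <token>"},
    timeout=5,
)
resp.raise_for_status()
print(resp.json())   # e.g. {"order_id": "...", "estimated_prep_minutes": 12}
```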

Scaling & Bottlenecks

The multi-tenant database is the primary scaling concern. With 500K tenants on schema-per-tenant PostgreSQL, connection pooling is critical — PgBouncer in transaction mode with a shared connection pool prevents the server from being overwhelmed by idle tenant connections. For large enterprise chains (1,000+ locations), a dedicated PostgreSQL instance is provisioned to avoid noisy neighbor effects. The menu publishing pipeline can bottleneck during mass menu updates (e.g., a chain updating prices across 500 locations simultaneously) — this is handled by a queue-based publish system that processes location updates in parallel batches of 50.
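A sketch of the batched publish loop, assuming a hypothetical publish_menu_to_location step that writes one location's resolved menu to the read store and emits the Kafka event:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 50  # locations published in parallel per batch, per the pipeline above

def publish_menu_to_location(location_id: str, menu_version: int) -> None:
    """Hypothetical step: write one location's resolved menu to the read-optimized
    store (Redis + CDN) and emit the Kafka event that downstream channels consume."""
    ...

def publish_chain_update(location_ids: list[str], menu_version: int) -> None:
    # Process locations in batches of 50 so a chain-wide price change does not
    # monopolize the publish pipeline or the read store.
    with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
        for start in range(0, len(location_ids), BATCH_SIZE):
            batch = location_ids[start:start + BATCH_SIZE]
            futures = [pool.submit(publish_menu_to_location, loc, menu_version) for loc in batch]
            for f in futures:
                f.result()  # surface failures; a real system would retry per location
```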

The inventory counter system (Redis DECRBY) can experience race conditions during extreme concurrent ordering at a single location. This is addressed using Redis Lua scripts that atomically check-and-decrement all ingredients for an order, returning failure if any ingredient would go negative. The KDS WebSocket connections are stateful and require sticky routing — a restaurant losing its KDS connection during peak service is operationally critical, so automatic reconnection with state replay (resending all active tickets) is implemented.
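A sketch of the atomic check-and-decrement using a Lua script registered through redis-py; the key naming scheme is an assumption:

```python
import redis  # assumes redis-py

r = redis.Redis()

# Lua script: check that every ingredient has enough stock, then decrement them all
# atomically; if any would go negative, nothing is decremented.
# KEYS = ingredient counter keys, ARGV = matching quantities to consume.
CHECK_AND_DECREMENT = r.register_script("""
for i = 1, #KEYS do
    local current = tonumber(redis.call('GET', KEYS[i]) or '0')
    if current < tonumber(ARGV[i]) then
        return 0   -- insufficient stock for at least one ingredient; abort
    end
end
for i = 1, #KEYS do
    redis.call('DECRBY', KEYS[i], ARGV[i])
end
return 1
""")

def try_consume(location_id: str, ingredient_qty: dict[str, int]) -> bool:
    """Returns True if all ingredients for the order were reserved atomically."""
    keys = [f"inv:{location_id}:{ing}" for ing in ingredient_qty]
    args = list(ingredient_qty.values())
    return bool(CHECK_AND_DECREMENT(keys=keys, args=args))
```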

Key Trade-offs

  • Schema-per-tenant over row-level multi-tenancy: Schema isolation provides stronger data boundaries and simpler per-tenant backup/restore, but increases operational complexity with 500K schemas — mitigated by automation and connection pooling
  • Redis counters for inventory over database writes: Atomic in-memory decrements handle the burst concurrency of peak ordering, but Redis is volatile — periodic PostgreSQL snapshots and end-of-day reconciliation provide durability
  • Ingredient-level inventory over item-level: Tracking at the ingredient level enables automatic 86ing across all dependent items when one ingredient runs out, but requires maintaining accurate BOMs for every menu item — a significant data entry burden for restaurants
  • Offline-first POS with CRDT sync over cloud-only: Ensuring the POS works without internet is non-negotiable for restaurants, but CRDT-based sync adds complexity and can produce surprising merge results for inventory counts — manual reconciliation is the safety net
