
System Design: Product Catalog Service

System design of a scalable product catalog service handling millions of SKUs with heterogeneous attributes, full-text search, and real-time inventory integration.


Requirements

Functional Requirements:

  • Store and retrieve product metadata: title, description, images, pricing, attributes
  • Support heterogeneous attributes per category (electronics have screen_size, clothing has fabric_type)
  • Full-text search with faceted filtering (brand, price range, color, size)
  • Category taxonomy with parent-child hierarchies
  • Bulk import/export for merchant onboarding
  • Version history for product changes (audit trail)

Non-Functional Requirements:

  • Support 100M+ products with 500+ distinct attribute schemas
  • Search latency under 100ms (p99) for keyword + filter queries
  • Read-heavy workload: 50:1 read-to-write ratio
  • 99.99% availability for reads; 99.9% for writes
  • Eventual consistency acceptable between write and search index (target <5 seconds)

Scale Estimation

100M products × 3KB average metadata = 300GB primary data. Images: 8 per product × 200KB ≈ 1.6MB per product, or 160TB in object storage across 100M products. Read QPS: 20,000 product detail reads/sec + 5,000 search queries/sec. Write QPS: 500 product updates/sec (bulk imports can spike to 5,000/sec). The Elasticsearch search index, at 100M documents × 5KB per document (denormalized with seller info and inventory), comes to roughly 500GB spread across shards.
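
A quick sketch of the arithmetic above, kept as a script so the estimates can be re-derived when assumptions change (all inputs are the figures from this section, not measurements):

    # Back-of-the-envelope capacity estimation using the figures from this section.
    PRODUCTS = 100_000_000
    METADATA_BYTES = 3_000          # ~3KB metadata per product
    IMAGES_PER_PRODUCT = 8
    IMAGE_BYTES = 200_000           # ~200KB per image
    SEARCH_DOC_BYTES = 5_000        # ~5KB denormalized search document

    primary_gb = PRODUCTS * METADATA_BYTES / 1e9                      # 300 GB
    image_tb = PRODUCTS * IMAGES_PER_PRODUCT * IMAGE_BYTES / 1e12     # 160 TB
    index_gb = PRODUCTS * SEARCH_DOC_BYTES / 1e9                      # 500 GB

    print(f"primary: {primary_gb:.0f} GB, images: {image_tb:.0f} TB, index: {index_gb:.0f} GB")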

High-Level Architecture

The Product Catalog Service follows a CQRS (Command Query Responsibility Segregation) pattern. The write path uses a PostgreSQL database as the source of truth, storing normalized product data with a flexible attribute model. When a product is created or updated, a Change Data Capture (CDC) pipeline (Debezium on PostgreSQL WAL) emits events to a Kafka topic. Downstream consumers then update the Elasticsearch search index and the Redis cache layer.

The read path has two branches. For product detail pages (by ID), requests hit a Redis cache (99% hit rate for active products) with fallback to PostgreSQL. For search and browse queries, requests go directly to Elasticsearch, which stores a denormalized product document including category path, seller info, pricing, and inventory status. An API Gateway routes requests to the appropriate path based on the query type.
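
A minimal sketch of the indexing consumer on this path, assuming kafka-python, elasticsearch-py, and redis-py; the topic name, index name, cache-key format, and simplified Debezium event shape are illustrative assumptions, not details from this design:

    # Sketch of a CDC consumer keeping Elasticsearch and Redis in sync with PostgreSQL.
    # Topic/index names, cache keys, and the (simplified) Debezium event shape are assumptions.
    import json
    from kafka import KafkaConsumer
    from elasticsearch import Elasticsearch, NotFoundError
    import redis

    es = Elasticsearch("http://localhost:9200")
    cache = redis.Redis()
    consumer = KafkaConsumer(
        "catalog.public.products",                     # hypothetical Debezium topic
        group_id="search-indexer",
        value_deserializer=lambda v: json.loads(v) if v else None,
        enable_auto_commit=False,
    )

    for msg in consumer:
        event = msg.value
        if event is None:                              # tombstone, nothing to apply
            consumer.commit()
            continue
        after, before = event.get("after"), event.get("before")
        product_id = (after or before)["product_id"]
        if after is None:                              # row deleted upstream
            try:
                es.delete(index="products", id=product_id)
            except NotFoundError:
                pass
        else:                                          # created or updated
            es.index(index="products", id=product_id, document=after)
        cache.delete(f"product:{product_id}")          # invalidate; detail reads repopulate on miss
        consumer.commit()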

Core Components

Flexible Attribute Model

The catalog supports 500+ category-specific attribute schemas using a hybrid approach. Core fields (title, description, price, brand) are stored as typed PostgreSQL columns for efficient querying. Category-specific attributes use a JSONB column with JSON Schema validation enforced at the application layer. Each category has a registered schema defining required and optional attributes with types and allowed values. This avoids the EAV (Entity-Attribute-Value) anti-pattern while maintaining schema flexibility. Elasticsearch maps these JSONB attributes to typed fields dynamically for faceted search.
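
To make the application-layer validation concrete, here is a sketch using the jsonschema library; the electronics schema shown is a made-up example of a registry entry, not the real one:

    # Illustrative application-layer validation of category-specific attributes.
    # The electronics schema is an assumed example, not the actual registry entry.
    from jsonschema import validate, ValidationError

    ELECTRONICS_SCHEMA = {
        "type": "object",
        "properties": {
            "screen_size_in": {"type": "number", "minimum": 1},
            "ram_gb": {"type": "integer", "enum": [4, 8, 16, 32, 64]},
            "color": {"type": "string"},
        },
        "required": ["screen_size_in", "ram_gb"],
        "additionalProperties": False,
    }

    def validate_attributes(category_schema: dict, attributes: dict) -> list[str]:
        """Return a list of validation errors; an empty list means the JSONB payload is acceptable."""
        try:
            validate(instance=attributes, schema=category_schema)
            return []
        except ValidationError as exc:
            return [exc.message]

    # validate_attributes(ELECTRONICS_SCHEMA, {"screen_size_in": 15.6, "ram_gb": 16})  -> []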

Search & Faceting Engine

Elasticsearch (8-node cluster, 20 shards, 2 replicas) indexes all products with a custom analyzer chain: ICU tokenizer → lowercase → synonym expansion (e.g., 'TV' → 'television') → edge n-gram for autocomplete. Faceted filtering uses Elasticsearch aggregations on keyword fields (brand, color, size). Category-specific facets are configured per category in a facet registry: electronics shows screen_size and RAM facets; clothing shows size and fabric. Search relevance scoring combines BM25 text match with boost factors for in-stock items (+20%), seller rating (log-scaled), and sales velocity.
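
A sketch of what such a query could look like in the Elasticsearch query DSL, combining the text match, the boosts described above, and two facet aggregations (field names, weights, and the index name are assumptions; the sales-velocity boost is omitted for brevity):

    # Illustrative search body: BM25 multi_match, in-stock and seller-rating boosts,
    # price post_filter, and brand/color facet aggregations.
    search_body = {
        "query": {
            "function_score": {
                "query": {
                    "multi_match": {
                        "query": "4k television",
                        "fields": ["title^3", "description", "brand"],
                    }
                },
                "functions": [
                    {"filter": {"term": {"in_stock": True}}, "weight": 1.2},   # +20% for in-stock
                    {"field_value_factor": {"field": "seller_rating", "modifier": "log1p"}},
                ],
                "score_mode": "multiply",
                "boost_mode": "multiply",
            }
        },
        "post_filter": {"range": {"price": {"gte": 10, "lte": 100}}},
        "aggs": {
            "brand": {"terms": {"field": "brand"}},
            "color": {"terms": {"field": "color"}},
        },
        "from": 0,
        "size": 20,
    }
    # es.search(index="products", body=search_body)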

Bulk Import Pipeline

Merchant onboarding requires importing thousands of products from CSV/XML feeds. The Bulk Import Service accepts file uploads to S3, then processes them via a Kafka-based pipeline: Parse Worker (validates schema, normalizes data) → Enrichment Worker (fetches category suggestions via ML classifier, resolves brand names) → Write Worker (upserts to PostgreSQL in batches of 500). Progress is tracked via a job status table; merchants can poll for completion. Failed rows are collected in an error report with actionable messages.
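
A minimal sketch of the Write Worker's batched upsert step, assuming psycopg2 and an ON CONFLICT upsert keyed on product_id; the column list is trimmed and error collection is omitted:

    # Sketch of the Write Worker's batched upsert into PostgreSQL.
    # Columns follow the schema in the Database Design section; batching is simplified.
    import json
    import psycopg2
    from psycopg2.extras import execute_values

    UPSERT_SQL = """
        INSERT INTO products (product_id, seller_id, title, brand, category_id, base_price, attributes)
        VALUES %s
        ON CONFLICT (product_id) DO UPDATE SET
            title = EXCLUDED.title,
            brand = EXCLUDED.brand,
            base_price = EXCLUDED.base_price,
            attributes = EXCLUDED.attributes,
            updated_at = now()
    """

    def write_batch(conn, rows: list[dict]) -> None:
        """Upsert one batch (up to 500 rows) of parsed, enriched product rows."""
        values = [
            (r["product_id"], r["seller_id"], r["title"], r["brand"],
             r["category_id"], r["base_price"], json.dumps(r["attributes"]))
            for r in rows
        ]
        with conn.cursor() as cur:
            execute_values(cur, UPSERT_SQL, values, page_size=500)
        conn.commit()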

Database Design

PostgreSQL schema:

  • Products: product_id UUID PK, seller_id, title, description, brand, category_id FK, base_price DECIMAL, currency, status ENUM, attributes JSONB, created_at, updated_at
  • Variants: variant_id, product_id FK, sku, price_override, attributes JSONB (e.g., {"size": "L", "color": "red"})
  • Categories: materialized path pattern with category_id, name, parent_id, path (e.g., 'Electronics > Phones > Smartphones')
  • Product_Images: product_id, image_url, display_order, alt_text

The Elasticsearch index stores a denormalized document per product combining data from Products, Variants, Categories, and inventory status. The mapping uses dynamic templates: string fields in attributes JSONB are mapped as both text (for search) and keyword (for faceting). Numeric attributes are mapped as scaled_float for range filtering.
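
In mapping terms, the dynamic templates might look roughly like this (field paths and the scaling factor are assumptions for illustration):

    # Sketch of the index mapping's dynamic templates: attribute strings become
    # text + keyword multi-fields, attribute numbers become scaled_float.
    PRODUCTS_MAPPING = {
        "dynamic_templates": [
            {
                "attribute_strings": {
                    "path_match": "attributes.*",
                    "match_mapping_type": "string",
                    "mapping": {
                        "type": "text",
                        "fields": {"raw": {"type": "keyword"}},   # .raw used for facet aggregations
                    },
                }
            },
            {
                "attribute_numbers": {
                    "path_match": "attributes.*",
                    "match_mapping_type": "double",
                    "mapping": {"type": "scaled_float", "scaling_factor": 100},
                }
            },
        ]
    }
    # es.indices.create(index="products", mappings=PRODUCTS_MAPPING)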

API Design

  • GET /api/v1/products/{product_id} — Fetch product detail with variants, images, and category; served from Redis cache
  • GET /api/v1/products/search?q={query}&category={id}&brand={name}&price_min=10&price_max=100&facets=true&page=1&size=20 — Full-text search with facets; served from Elasticsearch (see the example request after this list)
  • POST /api/v1/products — Create a product; body contains title, category_id, attributes, variants; validates against category schema
  • POST /api/v1/products/bulk-import — Upload CSV/XML for bulk product creation; returns job_id for status polling
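
For concreteness, a client call to the search endpoint might look like this; the host name and the response fields are illustrative assumptions:

    # Example client call for the search endpoint; host and response shape are assumed.
    import requests

    resp = requests.get(
        "https://catalog.example.com/api/v1/products/search",
        params={
            "q": "running shoes",
            "category": "shoes",
            "brand": "acme",
            "price_min": 10,
            "price_max": 100,
            "facets": "true",
            "page": 1,
            "size": 20,
        },
        timeout=2,
    )
    resp.raise_for_status()
    results = resp.json()
    # e.g. results["items"], results["facets"]["brand"], results["total"]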

Scaling & Bottlenecks

Elasticsearch is the primary bottleneck for search-heavy workloads. Scaling strategies: (1) Index partitioning by category — separate indices for top-level categories allow independent scaling of high-traffic categories like Electronics; (2) Read replicas via Elasticsearch replica shards distributed across availability zones; (3) Query result caching using Elasticsearch's built-in request cache for frequently repeated queries (cache hit rate ~40% for popular category browse pages).

The CDC pipeline between PostgreSQL and Elasticsearch must handle write spikes during bulk imports without overwhelming the search cluster. A Kafka consumer group with configurable concurrency (10 consumers for normal load, auto-scaling to 50 during bulk imports) processes change events. Backpressure is managed by Kafka consumer lag monitoring with alerts at 10,000 events and auto-scaling at 50,000 events.
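
A sketch of how the lag thresholds above could be measured with kafka-python; the topic and group names are assumptions, and the alerting and scaling hooks are stubs:

    # Sketch of lag-based backpressure monitoring for the indexing consumer group.
    # Thresholds follow the text; topic/group names are assumptions.
    from kafka import KafkaConsumer, TopicPartition

    ALERT_LAG = 10_000
    SCALE_OUT_LAG = 50_000

    def total_lag(topic: str, group_id: str) -> int:
        """Sum (latest offset - committed offset) across all partitions of the topic."""
        consumer = KafkaConsumer(group_id=group_id, enable_auto_commit=False)
        partitions = [TopicPartition(topic, p) for p in consumer.partitions_for_topic(topic)]
        end_offsets = consumer.end_offsets(partitions)       # latest offset per partition
        lag = 0
        for tp in partitions:
            committed = consumer.committed(tp) or 0          # last committed offset for the group
            lag += end_offsets[tp] - committed
        consumer.close()
        return lag

    lag = total_lag("catalog.public.products", "search-indexer")
    if lag >= SCALE_OUT_LAG:
        pass  # trigger auto-scaling of the consumer group (hypothetical hook)
    elif lag >= ALERT_LAG:
        pass  # emit an alert (hypothetical hook)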

Key Trade-offs

  • JSONB over EAV for flexible attributes: JSONB provides fast reads and natural document querying, but lacks the referential integrity of a normalized EAV model — schema validation at the app layer compensates
  • CQRS with separate read/write stores: Elasticsearch provides fast search while PostgreSQL ensures data integrity, but introduces eventual consistency and operational overhead of keeping stores synchronized
  • Denormalized Elasticsearch documents: Including seller info and inventory in the search document enables single-query search results, but requires re-indexing when upstream data changes
  • Materialized path for categories over nested sets: Materialized path is simpler and handles insertions well, but subtree queries require LIKE prefix matching which is slower than nested set range queries
