System Design: Pinterest
System design of Pinterest covering pin ingestion, visual search, interest graph, and the board/pin data model that serves 450 million monthly users browsing billions of images.
Requirements
Functional Requirements:
- Users create Boards and save (Pin) images to them
- Home feed of recommended Pins based on interests and boards
- Visual search: find similar images to a Pin
- Users follow other users and boards
- Search by keyword, category, or visual similarity
- Shopping: link Pins to product pages with price/availability
Non-Functional Requirements:
- 450M MAU, 250M DAU; 300M Pins saved/day
- Home feed load under 200ms; visual search under 500ms
- 99.9% availability; Pins should never be lost
- Image crawling and processing within 24 hours of discovery
Scale Estimation
250M DAU × 20 home feed loads/day = 5B feed loads/day ≈ 57,870 reads/sec. 300M Pins saved/day ≈ 3,472 Pin writes/sec. Pinterest has 200B total Pins across 5B boards. Image storage: average Pin image 200KB compressed → 200B images × 200KB = 40PB of image data. Visual embedding vectors (2048 dimensions, float32 → 8KB each) for 200B images = 200B × 8KB ≈ 1.6PB — too large to load fully into RAM, so they must be served from distributed ANN indexes.
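The estimates above are straightforward back-of-the-envelope arithmetic; a quick sketch reproducing them (all numbers come from this section):

```python
# Back-of-the-envelope arithmetic for the scale estimates above.
SECONDS_PER_DAY = 86_400

dau = 250_000_000
feed_loads = dau * 20                                   # 5B feed loads/day
feed_reads_per_sec = feed_loads / SECONDS_PER_DAY       # ~57,870/sec

pin_writes_per_sec = 300_000_000 / SECONDS_PER_DAY      # ~3,472/sec

total_pins = 200_000_000_000
image_storage_pb = total_pins * 200_000 / 1e15          # 200KB avg image
embedding_bytes = 2048 * 4                              # float32 → 8KB/vector
embedding_storage_pb = total_pins * embedding_bytes / 1e15

print(f"{feed_reads_per_sec:,.0f} feed reads/sec")
print(f"{pin_writes_per_sec:,.0f} pin writes/sec")
print(f"{image_storage_pb:.0f}PB images, {embedding_storage_pb:.1f}PB embeddings")
```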
High-Level Architecture
Pinterest's architecture centers on three systems: the Pin data model, the recommendation engine, and the visual search system. The Pin data model uses a sharded MySQL cluster (100+ shards) with Pin objects containing: pin_id, creator_id, board_id, image_url, source_url, description, tags, and embeddings. Pinterest open-sourced their sharding library (PyMySQL + custom sharding) as part of their migration from a monolith.
The Home Feed recommendation system (called Pinnability) uses a cascade of models: a candidate generator retrieves 10,000 candidates from three sources — board-based (Pins from boards with similar content), interest-based (Pins from inferred interest categories), and social (Pins from followed users). A lightweight scorer (logistic regression on sparse features) reduces to 2,000. A heavy ranker (neural network with 100+ features) produces the final ordered feed. Visual search (VisualSearch) uses a ResNet-based CNN to extract 2048-dimensional embeddings from query images, then runs ANN search (using HNSW graphs) against a distributed embedding index.
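The cascade narrows 10,000 candidates to a final feed in stages, spending the cheapest compute on the largest candidate set. A minimal sketch of the funnel shape — the retrieval and scoring functions here are random stand-ins, not Pinnability's actual models:

```python
import random

def candidate_generation(user, n=10_000):
    """Stand-in for the union of board-based, interest-based,
    and social retrieval sources."""
    return [f"pin_{i}" for i in range(n)]

def light_score(pin):
    """Stand-in for the sparse logistic-regression scorer."""
    return random.random()

def heavy_rank(pin):
    """Stand-in for the 100+-feature neural ranker."""
    return random.random()

def home_feed(user, feed_size=25):
    candidates = candidate_generation(user)                           # ~10,000
    shortlist = sorted(candidates, key=light_score, reverse=True)[:2_000]
    ranked = sorted(shortlist, key=heavy_rank, reverse=True)          # heavy model
    return ranked[:feed_size]

feed = home_feed("user_42")
print(len(feed))  # 25
```

The point of the funnel is cost control: the heavy ranker runs on 2,000 items instead of 10,000, and never on the full corpus.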
Core Components
Pin Ingestion Service
When a user saves a Pin from an external URL, the Pin Ingestion Service fetches the source page, extracts the image, resizes to 5 canonical sizes (60px thumbnail to 1200px), stores originals and resized versions in S3, and writes metadata to MySQL. A background Image Analysis pipeline (running on GPU workers) extracts visual embeddings (ResNet-50), generates text from image content (OCR + captioning), detects objects (YOLO-based), and infers product categories. These signals are stored in a separate feature store indexed by pin_id.
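The resize step maps each original image onto the canonical width ladder while preserving aspect ratio. A sketch — only the 60px and 1200px endpoints come from this design; the intermediate widths here are assumptions for illustration:

```python
# Canonical widths are illustrative: only the 60px thumbnail and 1200px
# endpoints appear in the design above; the middle sizes are assumed.
CANONICAL_WIDTHS = [60, 236, 474, 736, 1200]

def resize_targets(orig_w: int, orig_h: int) -> list[tuple[int, int]]:
    """Compute (width, height) pairs preserving aspect ratio, skipping any
    canonical width larger than the original (no upscaling)."""
    targets = []
    for w in CANONICAL_WIDTHS:
        if w <= orig_w:
            targets.append((w, round(orig_h * w / orig_w)))
    return targets

print(resize_targets(800, 1200))
# [(60, 90), (236, 354), (474, 711), (736, 1104)]
```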
Recommendation Engine (Pinnability)
Pinnability uses a multi-stage pipeline. The candidate generation phase uses three parallel retrieval systems: (1) Board2Vec: Pin embeddings clustered by board topic, ANN retrieval by query embedding; (2) Interest Taxonomy: Pinterest maintains a 6,000-node interest tree; user's inferred interest nodes drive category-based retrieval; (3) Social signals: recent Pins from followed users and boards. The ranker is a deep neural network trained on engagement (save, click, closeup) with features including Pin freshness, pinner authority score, and user-Pin affinity.
Visual Search System
Visual search extracts a CNN embedding from the query image (or a cropped region for 'object search') and performs ANN search over Pinterest's 200B-Pin embedding index. The index is distributed across hundreds of machines using hierarchical HNSW graphs, with each shard holding ~500M Pin embeddings in RAM. Pins are assigned to shards by hashing pin_id, so a query fans out to every shard in parallel; each shard returns its local top-k, and the coordinator merges and re-ranks the candidates by visual similarity score. Response latency target: <500ms for 200B-scale ANN search.
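The merge step at the coordinator only has to combine a few hundred small per-shard result lists, never the full corpus. A minimal sketch of that scatter-gather merge (shard data is illustrative):

```python
import heapq
from itertools import chain

def merge_shard_results(shard_results, k=100):
    """Merge per-shard (similarity, pin_id) candidates into a global top-k.
    Each shard returns only its local top-k, so the coordinator handles
    hundreds × k items, not 200B."""
    return heapq.nlargest(k, chain.from_iterable(shard_results))

# Illustrative per-shard results, highest similarity first.
shard_a = [(0.97, "pin_1"), (0.90, "pin_4")]
shard_b = [(0.95, "pin_2"), (0.88, "pin_7")]
print(merge_shard_results([shard_a, shard_b], k=3))
# [(0.97, 'pin_1'), (0.95, 'pin_2'), (0.9, 'pin_4')]
```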
Database Design
Pin metadata lives in a sharded MySQL cluster (Aster, Pinterest's internal MySQL sharding layer) with 100+ shards partitioned by pin_id. Board data is in a separate MySQL cluster sharded by board_id. The follower graph (user → board/user follows) is stored in a dedicated graph store backed by HBase, with adjacency lists for fast traversal. Visual embeddings live in a custom distributed ANN index (HNSW-based): uncompressed embeddings are too large for RAM at 200B scale, so shards keep product-quantized codes in memory and full-precision vectors on SSD for re-ranking.
For the Home Feed, a Redis cluster stores the top-500 precomputed recommendations per user, refreshed every 15 minutes by the Pinnability batch job. The cache is keyed by feed:{user_id} and stores sorted (pin_id, score) pairs. Cache miss rate is low (~5%) because the batch job proactively refreshes feeds before they expire. Shopping data (product price, availability, merchant) is stored in a separate MySQL catalog updated via merchant APIs.
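The cache contract is simple: the batch job writes, the serving path reads, and a miss falls back to on-demand ranking. A real implementation would use redis-py sorted sets against a Redis cluster; the in-process stand-in below just shows the key scheme and the hit/miss behavior (TTL value is an assumption — it only needs to outlive the 15-minute refresh cycle):

```python
import time

FEED_TTL_SECONDS = 30 * 60   # assumption: outlives the 15-min refresh cycle
cache: dict[str, tuple[float, list[tuple[str, float]]]] = {}

def feed_key(user_id: int) -> str:
    return f"feed:{user_id}"

def write_feed(user_id, scored_pins):
    """Batch-job path: store top-500 (pin_id, score) pairs, highest first."""
    ranked = sorted(scored_pins, key=lambda p: p[1], reverse=True)[:500]
    cache[feed_key(user_id)] = (time.time() + FEED_TTL_SECONDS, ranked)

def read_feed(user_id, page_size=25):
    """Serving path: a hit returns precomputed pins; a miss (~5% of requests)
    would fall back to a synchronous Pinnability ranking call (not shown)."""
    entry = cache.get(feed_key(user_id))
    if entry is None or entry[0] < time.time():
        return None
    return entry[1][:page_size]

write_feed(42, [("pin_a", 0.9), ("pin_b", 0.7), ("pin_c", 0.95)])
print(read_feed(42, page_size=2))  # [('pin_c', 0.95), ('pin_a', 0.9)]
```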
API Design
- GET /v3/home_feed/?page_size=25&bookmark={cursor} — Fetch personalized home feed with bookmark-based pagination
- POST /v3/pins/ — Create a Pin; body includes image_url or multipart image, board_id, description
- GET /v3/visual_search/?pin_id={id}&crop={x,y,w,h} — Find visually similar Pins to a region of an existing Pin
- GET /v3/search/pins/?query={q}&page_size=25&bookmark={cursor} — Text search over Pins
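The bookmark cursor is an opaque token encoding the last-seen position, which keeps pagination stable as new Pins arrive. A sketch of one common encoding — the scheme here (base64-wrapped JSON) is an assumption, not Pinterest's actual format:

```python
import base64
import json

def make_bookmark(last_pin_id: str, last_score: float) -> str:
    """Encode the last-seen feed position as an opaque, URL-safe cursor."""
    raw = json.dumps({"pin": last_pin_id, "score": last_score})
    return base64.urlsafe_b64encode(raw.encode()).decode()

def parse_bookmark(bookmark: str) -> dict:
    """Decode a cursor back into the position it encodes."""
    return json.loads(base64.urlsafe_b64decode(bookmark.encode()))

cursor = make_bookmark("pin_12345", 0.87)
print(parse_bookmark(cursor))  # {'pin': 'pin_12345', 'score': 0.87}
```

Because clients treat the cursor as opaque, the server can change the encoding without breaking API consumers.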
Scaling & Bottlenecks
The visual embedding ANN index is the hardest component to scale. 200B × 8KB ≈ 1.6PB of float32 vectors cannot fit in RAM. Pinterest uses product quantization (PQ) to compress embeddings from 8KB to 512 bytes (16x compression), enabling 200B embeddings in ~100TB of RAM distributed across the ANN cluster. The compression trades a small accuracy loss (recall@100 drops from 95% to 88%) for the 16x memory reduction — an acceptable trade-off.
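Product quantization works by splitting each vector into fixed-size subvectors and replacing each subvector with the one-byte index of its nearest codebook centroid: 2048 float32 dims split into 512 four-dim chunks turns 8KB into 512 bytes, the 16x figure above. A toy-scale sketch (real codebooks are trained with k-means over a sample of embeddings; the random codebooks here are just for illustration):

```python
import random

def nearest(codebook, sub):
    """Index of the codebook entry closest (squared L2) to subvector `sub`."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - s) ** 2 for c, s in zip(codebook[i], sub)))

def pq_encode(vec, codebooks, sub_dim):
    """Split `vec` into sub_dim-sized chunks and store one byte
    (a centroid id) per chunk."""
    subs = [vec[i:i + sub_dim] for i in range(0, len(vec), sub_dim)]
    return bytes(nearest(cb, s) for cb, s in zip(codebooks, subs))

# Toy scale: 8-dim vectors, 2-dim subvectors, 4 centroids per codebook.
random.seed(0)
dim, sub_dim, n_centroids = 8, 2, 4
codebooks = [[[random.random() for _ in range(sub_dim)]
              for _ in range(n_centroids)]
             for _ in range(dim // sub_dim)]
code = pq_encode([random.random() for _ in range(dim)], codebooks, sub_dim)
print(len(code))  # 4 bytes instead of 32 (8 float32 dims)
```

At production scale the same arithmetic gives 2048 × 4 = 8192 bytes compressed to 512 one-byte codes, i.e. exactly 16x.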
MySQL sharding is Pinterest's primary operational challenge. With 100+ shards, cross-shard queries (e.g., 'all Pins by user X on board Y from following Z') require scatter-gather across shards. Pinterest's solution is to denormalize: the Pin table stores creator_id directly so user-level Pin queries hit a single shard. For cross-shard aggregations, a Kafka-backed materialized view pipeline maintains pre-joined tables in a read-optimized store.
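One detail that makes single-shard routing cheap is packing the shard id into the object id itself, so any pin_id names its own shard with no lookup table. A simplified sketch following the bit layout described in Pinterest's published sharding post (16 shard bits, 10 type bits, 36 local-id bits):

```python
SHARD_BITS, TYPE_BITS, LOCAL_BITS = 16, 10, 36

def make_id(shard_id: int, type_id: int, local_id: int) -> int:
    """Pack shard, object type, and shard-local id into one 62-bit id."""
    return (shard_id << (TYPE_BITS + LOCAL_BITS)) | (type_id << LOCAL_BITS) | local_id

def shard_of(obj_id: int) -> int:
    """Recover the owning shard directly from the id — no routing table."""
    return obj_id >> (TYPE_BITS + LOCAL_BITS)

pin_id = make_id(shard_id=3429, type_id=1, local_id=7075733)
print(shard_of(pin_id))  # 3429
```

Ids constructed this way stay routable even as shards are moved between physical MySQL hosts, since only the shard→host map changes.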
Key Trade-offs
- Product quantization for embeddings: 16x memory compression with ~7% recall loss enables billion-scale ANN search in affordable RAM — exact search at 200B scale would cost 1,000x more in hardware
- MySQL over Cassandra for Pin data: Pinterest chose MySQL for strong consistency and rich querying despite horizontal scalability challenges — they built a custom sharding layer rather than adopt eventual consistency
- Pull-based feed with precomputation: Running Pinnability in batch every 15 minutes and caching results is cheaper than real-time ranking; the staleness is unnoticeable for a discovery-oriented feed
- Board2Vec embeddings for retrieval: Training embeddings on co-occurrence within boards (boards as sentences, Pins as words, Word2Vec style) captures topic coherence better than raw image similarity alone
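The Board2Vec trade-off above hinges on treating each board as a "sentence" of pin-id "words": the training corpus is nothing more than lists of pin ids per board, and the signal the embedding model learns from is within-board co-occurrence. A toy sketch of that signal (board data is illustrative; a real pipeline would feed these lists to a Word2Vec-style trainer):

```python
from collections import Counter
from itertools import combinations

# Boards as "sentences", pins as "words" — illustrative toy corpus.
boards = [
    ["pin_sofa", "pin_lamp", "pin_rug"],
    ["pin_sofa", "pin_rug", "pin_shelf"],
    ["pin_cake", "pin_frosting"],
]

# Within-board co-occurrence is what pushes co-saved pins toward
# nearby embedding vectors.
cooc = Counter()
for board in boards:
    for a, b in combinations(sorted(set(board)), 2):
        cooc[(a, b)] += 1

print(cooc[("pin_rug", "pin_sofa")])   # 2 — co-saved on two boards
print(cooc[("pin_cake", "pin_sofa")])  # 0 — never co-saved
```

This is why board-trained embeddings capture topic coherence: a sofa pin and a rug pin may look nothing alike visually, but users repeatedly save them together.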