System Design: Pinterest
System design of Pinterest covering pin ingestion, visual search, interest graph, and the board/pin data model that serves 450 million monthly users browsing billions of images.
Requirements
Functional Requirements:
- Users create Boards and save (Pin) images to them
- Home feed of recommended Pins based on interests and boards
- Visual search: find similar images to a Pin
- Users follow other users and boards
- Search by keyword, category, or visual similarity
- Shopping: link Pins to product pages with price/availability
Non-Functional Requirements:
- 450M MAU, 250M DAU; 300M Pins saved/day
- Home feed load under 200ms; visual search under 500ms
- 99.9% availability; Pins should never be lost
- Image crawling and processing within 24 hours of discovery
Scale Estimation
250M DAU × 20 home feed loads/day = 5B feed loads/day ≈ 57,870 reads/sec. 300M Pins saved/day ≈ 3,472 Pin writes/sec. Pinterest has 200B total Pins across 5B boards. Image storage: average Pin image 200KB compressed → 200B images × 200KB = 40PB of image data. Visual embedding vectors (2048 dimensions, float32 → 8KB each) for 200B images = 200B × 8KB ≈ 1.6PB — too large to load fully into RAM, so they must be served from distributed ANN indexes.
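The estimates above are straightforward back-of-the-envelope arithmetic; a quick sketch reproducing them (all numbers come from this section):

```python
# Back-of-the-envelope arithmetic for the scale estimates above.
SECONDS_PER_DAY = 86_400

dau = 250_000_000
feed_loads = dau * 20                                   # 5B feed loads/day
feed_reads_per_sec = feed_loads / SECONDS_PER_DAY       # ~57,870/sec

pin_writes_per_sec = 300_000_000 / SECONDS_PER_DAY      # ~3,472/sec

total_pins = 200_000_000_000
image_storage_pb = total_pins * 200_000 / 1e15          # 200KB avg image
embedding_bytes = 2048 * 4                              # float32 → 8KB/vector
embedding_storage_pb = total_pins * embedding_bytes / 1e15

print(f"{feed_reads_per_sec:,.0f} feed reads/sec")
print(f"{pin_writes_per_sec:,.0f} pin writes/sec")
print(f"{image_storage_pb:.0f}PB images, {embedding_storage_pb:.1f}PB embeddings")
```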
High-Level Architecture
Pinterest's architecture centers on three systems: the Pin data model, the recommendation engine, and the visual search system. The Pin data model uses a sharded MySQL cluster (100+ shards) with Pin objects containing: pin_id, creator_id, board_id, image_url, source_url, description, tags, and embeddings. Pinterest open-sourced their sharding library (PyMySQL + custom sharding) as part of their migration from a monolith.
The Home Feed recommendation system (called Pinnability) uses a cascade of models: a candidate generator retrieves 10,000 candidates from three sources — board-based (Pins from boards with similar content), interest-based (Pins from inferred interest categories), and social (Pins from followed users). A lightweight scorer (logistic regression on sparse features) reduces to 2,000. A heavy ranker (neural network with 100+ features) produces the final ordered feed. Visual search (VisualSearch) uses a ResNet-based CNN to extract 2048-dimensional embeddings from query images, then runs ANN search (using HNSW graphs) against a distributed embedding index.
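The cascade narrows 10,000 candidates to a final feed in stages, spending the cheapest compute on the largest candidate set. A minimal sketch of the funnel shape — the retrieval and scoring functions here are random stand-ins, not Pinnability's actual models:

```python
import random

def candidate_generation(user, n=10_000):
    """Stand-in for the union of board-based, interest-based,
    and social retrieval sources."""
    return [f"pin_{i}" for i in range(n)]

def light_score(pin):
    """Stand-in for the sparse logistic-regression scorer."""
    return random.random()

def heavy_rank(pin):
    """Stand-in for the 100+-feature neural ranker."""
    return random.random()

def home_feed(user, feed_size=25):
    candidates = candidate_generation(user)                           # ~10,000
    shortlist = sorted(candidates, key=light_score, reverse=True)[:2_000]
    ranked = sorted(shortlist, key=heavy_rank, reverse=True)          # heavy model
    return ranked[:feed_size]

feed = home_feed("user_42")
print(len(feed))  # 25
```

The point of the funnel is cost control: the heavy ranker runs on 2,000 items instead of 10,000, and never on the full corpus.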
Core Components
Pin Ingestion Service
When a user saves a Pin from an external URL, the Pin Ingestion Service fetches the source page, extracts the image, resizes to 5 canonical sizes (60px thumbnail to 1200px), stores originals and resized versions in S3, and writes metadata to MySQL. A background Image Analysis pipeline (running on GPU workers) extracts visual embeddings (ResNet-50), generates text from image content (OCR + captioning), detects objects (YOLO-based), and infers product categories. These signals are stored in a separate feature store indexed by pin_id.
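The resize step maps each original image onto the canonical width ladder while preserving aspect ratio. A sketch — only the 60px and 1200px endpoints come from this design; the intermediate widths here are assumptions for illustration:

```python
# Canonical widths are illustrative: only the 60px thumbnail and 1200px
# endpoints appear in the design above; the middle sizes are assumed.
CANONICAL_WIDTHS = [60, 236, 474, 736, 1200]

def resize_targets(orig_w: int, orig_h: int) -> list[tuple[int, int]]:
    """Compute (width, height) pairs preserving aspect ratio, skipping any
    canonical width larger than the original (no upscaling)."""
    targets = []
    for w in CANONICAL_WIDTHS:
        if w <= orig_w:
            targets.append((w, round(orig_h * w / orig_w)))
    return targets

print(resize_targets(800, 1200))
# [(60, 90), (236, 354), (474, 711), (736, 1104)]
```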
Recommendation Engine (Pinnability)
Pinnability uses a multi-stage pipeline. The candidate generation phase uses three parallel retrieval systems: (1) Board2Vec: Pin embeddings clustered by board topic, ANN retrieval by query embedding; (2) Interest Taxonomy: Pinterest maintains a 6,000-node interest tree; user's inferred interest nodes drive category-based retrieval; (3) Social signals: recent Pins from followed users and boards. The ranker is a deep neural network trained on engagement (save, click, closeup) with features including Pin freshness, pinner authority score, and user-Pin affinity.
Visual Search System
Visual search extracts a CNN embedding from the query image (or a cropped region for 'object search') and performs ANN search over Pinterest's 200B-Pin embedding index. The index is distributed across hundreds of machines using hierarchical HNSW graphs, with each shard holding ~500M Pin embeddings in RAM. Pins are assigned to shards by hashing pin_id, so a query fans out to every shard in parallel; each shard returns its local top-k, and the coordinator merges and re-ranks the candidates by visual similarity score. Response latency target: <500ms for 200B-scale ANN search.
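The merge step at the coordinator only has to combine a few hundred small per-shard result lists, never the full corpus. A minimal sketch of that scatter-gather merge (shard data is illustrative):

```python
import heapq
from itertools import chain

def merge_shard_results(shard_results, k=100):
    """Merge per-shard (similarity, pin_id) candidates into a global top-k.
    Each shard returns only its local top-k, so the coordinator handles
    hundreds × k items, not 200B."""
    return heapq.nlargest(k, chain.from_iterable(shard_results))

# Illustrative per-shard results, highest similarity first.
shard_a = [(0.97, "pin_1"), (0.90, "pin_4")]
shard_b = [(0.95, "pin_2"), (0.88, "pin_7")]
print(merge_shard_results([shard_a, shard_b], k=3))
# [(0.97, 'pin_1'), (0.95, 'pin_2'), (0.9, 'pin_4')]
```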
Database Design
Pin metadata lives in a sharded MySQL cluster (Aster, Pinterest's internal MySQL sharding layer) with 100+ shards partitioned by pin_id. Board data is in a separate MySQL cluster sharded by board_id. The follower graph (user → board/user follows) is stored in a dedicated graph store backed by HBase, with adjacency lists for fast traversal. Visual embeddings live in a custom distributed ANN index (HNSW-based): uncompressed embeddings are too large for RAM at 200B scale, so shards keep product-quantized codes in memory and full-precision vectors on SSD for re-ranking.
For the Home Feed, a Redis cluster stores the top-500 precomputed recommendations per user, refreshed every 15 minutes by the Pinnability batch job. The cache is keyed by feed:{user_id} and stores sorted (pin_id, score) pairs. Cache miss rate is low (~5%) because the batch job proactively refreshes feeds before they expire. Shopping data (product price, availability, merchant) is stored in a separate MySQL catalog updated via merchant APIs.
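The cache contract is simple: the batch job writes, the serving path reads, and a miss falls back to on-demand ranking. A real implementation would use redis-py sorted sets against a Redis cluster; the in-process stand-in below just shows the key scheme and the hit/miss behavior (TTL value is an assumption — it only needs to outlive the 15-minute refresh cycle):

```python
import time

FEED_TTL_SECONDS = 30 * 60   # assumption: outlives the 15-min refresh cycle
cache: dict[str, tuple[float, list[tuple[str, float]]]] = {}

def feed_key(user_id: int) -> str:
    return f"feed:{user_id}"

def write_feed(user_id, scored_pins):
    """Batch-job path: store top-500 (pin_id, score) pairs, highest first."""
    ranked = sorted(scored_pins, key=lambda p: p[1], reverse=True)[:500]
    cache[feed_key(user_id)] = (time.time() + FEED_TTL_SECONDS, ranked)

def read_feed(user_id, page_size=25):
    """Serving path: a hit returns precomputed pins; a miss (~5% of requests)
    would fall back to a synchronous Pinnability ranking call (not shown)."""
    entry = cache.get(feed_key(user_id))
    if entry is None or entry[0] < time.time():
        return None
    return entry[1][:page_size]

write_feed(42, [("pin_a", 0.9), ("pin_b", 0.7), ("pin_c", 0.95)])
print(read_feed(42, page_size=2))  # [('pin_c', 0.95), ('pin_a', 0.9)]
```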
API Design
- GET /v3/home_feed/?page_size=25&bookmark={cursor} — Fetch personalized home feed with bookmark-based pagination
- POST /v3/pins/ — Create a Pin; body includes image_url or multipart image, board_id, description
- GET /v3/visual_search/?pin_id={id}&crop={x,y,w,h} — Find visually similar Pins to a region of an existing Pin
- GET /v3/search/pins/?query={q}&page_size=25&bookmark={cursor} — Text search over Pins
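The bookmark cursor is an opaque token encoding the last-seen position, which keeps pagination stable as new Pins arrive. A sketch of one common encoding — the scheme here (base64-wrapped JSON) is an assumption, not Pinterest's actual format:

```python
import base64
import json

def make_bookmark(last_pin_id: str, last_score: float) -> str:
    """Encode the last-seen feed position as an opaque, URL-safe cursor."""
    raw = json.dumps({"pin": last_pin_id, "score": last_score})
    return base64.urlsafe_b64encode(raw.encode()).decode()

def parse_bookmark(bookmark: str) -> dict:
    """Decode a cursor back into the position it encodes."""
    return json.loads(base64.urlsafe_b64decode(bookmark.encode()))

cursor = make_bookmark("pin_12345", 0.87)
print(parse_bookmark(cursor))  # {'pin': 'pin_12345', 'score': 0.87}
```

Because clients treat the cursor as opaque, the server can change the encoding without breaking API consumers.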
Scaling & Bottlenecks
The visual embedding ANN index is the hardest component to scale. 200B × 8KB ≈ 1.6PB of float32 vectors cannot fit in RAM. Pinterest uses product quantization (PQ) to compress embeddings from 8KB to 512 bytes (16x compression), enabling 200B embeddings in ~100TB of RAM distributed across the ANN cluster. The compression trades a small accuracy loss (recall@100 drops from 95% to 88%) for the 16x memory reduction — an acceptable trade-off.
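Product quantization works by splitting each vector into fixed-size subvectors and replacing each subvector with the one-byte index of its nearest codebook centroid: 2048 float32 dims split into 512 four-dim chunks turns 8KB into 512 bytes, the 16x figure above. A toy-scale sketch (real codebooks are trained with k-means over a sample of embeddings; the random codebooks here are just for illustration):

```python
import random

def nearest(codebook, sub):
    """Index of the codebook entry closest (squared L2) to subvector `sub`."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - s) ** 2 for c, s in zip(codebook[i], sub)))

def pq_encode(vec, codebooks, sub_dim):
    """Split `vec` into sub_dim-sized chunks and store one byte
    (a centroid id) per chunk."""
    subs = [vec[i:i + sub_dim] for i in range(0, len(vec), sub_dim)]
    return bytes(nearest(cb, s) for cb, s in zip(codebooks, subs))

# Toy scale: 8-dim vectors, 2-dim subvectors, 4 centroids per codebook.
random.seed(0)
dim, sub_dim, n_centroids = 8, 2, 4
codebooks = [[[random.random() for _ in range(sub_dim)]
              for _ in range(n_centroids)]
             for _ in range(dim // sub_dim)]
code = pq_encode([random.random() for _ in range(dim)], codebooks, sub_dim)
print(len(code))  # 4 bytes instead of 32 (8 float32 dims)
```

At production scale the same arithmetic gives 2048 × 4 = 8192 bytes compressed to 512 one-byte codes, i.e. exactly 16x.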
MySQL sharding is Pinterest's primary operational challenge. With 100+ shards, cross-shard queries (e.g., 'all Pins by user X on board Y from following Z') require scatter-gather across shards. Pinterest's solution is to denormalize: the Pin table stores creator_id directly so user-level Pin queries hit a single shard. For cross-shard aggregations, a Kafka-backed materialized view pipeline maintains pre-joined tables in a read-optimized store.
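One detail that makes single-shard routing cheap is packing the shard id into the object id itself, so any pin_id names its own shard with no lookup table. A simplified sketch following the bit layout described in Pinterest's published sharding post (16 shard bits, 10 type bits, 36 local-id bits):

```python
SHARD_BITS, TYPE_BITS, LOCAL_BITS = 16, 10, 36

def make_id(shard_id: int, type_id: int, local_id: int) -> int:
    """Pack shard, object type, and shard-local id into one 62-bit id."""
    return (shard_id << (TYPE_BITS + LOCAL_BITS)) | (type_id << LOCAL_BITS) | local_id

def shard_of(obj_id: int) -> int:
    """Recover the owning shard directly from the id — no routing table."""
    return obj_id >> (TYPE_BITS + LOCAL_BITS)

pin_id = make_id(shard_id=3429, type_id=1, local_id=7075733)
print(shard_of(pin_id))  # 3429
```

Ids constructed this way stay routable even as shards are moved between physical MySQL hosts, since only the shard→host map changes.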
Key Trade-offs
- Product quantization for embeddings: 16x memory compression with ~7% recall loss enables billion-scale ANN search in affordable RAM — exact search at 200B scale would cost 1,000x more in hardware
- MySQL over Cassandra for Pin data: Pinterest chose MySQL for strong consistency and rich querying despite horizontal scalability challenges — they built a custom sharding layer rather than adopt eventual consistency
- Pull-based feed with precomputation: Running Pinnability in batch every 15 minutes and caching results is cheaper than real-time ranking; the staleness is unnoticeable for a discovery-oriented feed
- Board2Vec embeddings for retrieval: Training embeddings on co-occurrence within boards (boards as sentences, Pins as words, Word2Vec style) captures topic coherence better than raw image similarity alone
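The Board2Vec trade-off above hinges on treating each board as a "sentence" of pin-id "words": the training corpus is nothing more than lists of pin ids per board, and the signal the embedding model learns from is within-board co-occurrence. A toy sketch of that signal (board data is illustrative; a real pipeline would feed these lists to a Word2Vec-style trainer):

```python
from collections import Counter
from itertools import combinations

# Boards as "sentences", pins as "words" — illustrative toy corpus.
boards = [
    ["pin_sofa", "pin_lamp", "pin_rug"],
    ["pin_sofa", "pin_rug", "pin_shelf"],
    ["pin_cake", "pin_frosting"],
]

# Within-board co-occurrence is what pushes co-saved pins toward
# nearby embedding vectors.
cooc = Counter()
for board in boards:
    for a, b in combinations(sorted(set(board)), 2):
        cooc[(a, b)] += 1

print(cooc[("pin_rug", "pin_sofa")])   # 2 — co-saved on two boards
print(cooc[("pin_cake", "pin_sofa")])  # 0 — never co-saved
```

This is why board-trained embeddings capture topic coherence: a sofa pin and a rug pin may look nothing alike visually, but users repeatedly save them together.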