System Design: Blogging Platform (Medium-scale)
Complete system design of a Medium-scale blogging platform covering rich text editing, content distribution, recommendation feeds, and read-time estimation for millions of writers and readers.
Requirements
Functional Requirements:
- Authors create and publish articles using a rich text editor (WYSIWYG) with inline images, code blocks, and embeds
- Readers browse a personalized home feed of recommended articles based on interests and reading history
- Users follow authors and topics; followers receive notifications on new publications
- Clap/reaction system allowing readers to appreciate articles (up to 50 claps per reader per article)
- Full-text search across all published articles with faceted filtering by topic, author, and date
- Reading lists and bookmarks for saving articles to read later
Non-Functional Requirements:
- 100 million MAU, 30 million DAU; 50,000 new articles published daily
- Article page load time under 1.5 seconds globally (p95)
- 99.95% availability; published articles must never be lost
- Eventual consistency for recommendation feeds and clap counts; strong consistency for article content and publication state
- Support articles up to 50,000 words with embedded media
Scale Estimation
- Reads: 30M DAU with an average of 5 article reads per session = 150M article reads/day ≈ 1,736 reads/sec average, ~5,000/sec peak
- Writes: 50,000 new articles/day ≈ 0.6 articles/sec, giving a roughly 3,000:1 read-to-write ratio
- Storage: average article size is 15KB text + 500KB images ≈ 515KB, so 50K × 515KB ≈ 26GB/day of new content, roughly 94TB accumulated over 10 years
- Media: the image CDN serves roughly 750M image requests/day at peak
- ML: the recommendation engine processes 150M read events daily to retrain personalization models
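The estimates above can be reproduced with simple back-of-envelope arithmetic (note that 150M reads against 50K writes works out to a 3,000:1 ratio):

```python
# Back-of-envelope check of the scale estimates (all figures approximate).
DAU = 30_000_000
READS_PER_USER = 5
NEW_ARTICLES_PER_DAY = 50_000
ARTICLE_SIZE_KB = 15 + 500                # text + images

reads_per_day = DAU * READS_PER_USER                                    # 150M
reads_per_sec = reads_per_day / 86_400                                  # ~1,736
storage_per_day_gb = NEW_ARTICLES_PER_DAY * ARTICLE_SIZE_KB / 1_000_000 # ~25.8 GB
storage_10y_tb = storage_per_day_gb * 365 * 10 / 1_000                  # ~94 TB
read_write_ratio = reads_per_day / NEW_ARTICLES_PER_DAY                 # 3,000:1
```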
High-Level Architecture
The platform follows a service-oriented architecture organized into three main planes. The Authoring Plane handles content creation: the client-side rich text editor uses ProseMirror (a schema-based editor framework), which produces a structured JSON document rather than raw HTML. When an author saves a draft, the Article Service validates the document schema, sanitizes embedded HTML, and stores the JSON document in PostgreSQL. Images are uploaded separately to an Image Service that stores originals in S3, generates responsive variants (320w, 640w, 1024w, 1600w) via a Sharp-based image processing pipeline, and returns CDN URLs that are embedded in the article JSON.
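A minimal example of what such a structured document might look like, and how it supports derived computations like word counts for read-time estimation. The exact node attributes shown here are illustrative, not the platform's actual schema:

```python
# Illustrative ProseMirror-style document; node and mark names follow the
# schema described in the text, attribute details are assumptions.
article_doc = {
    "type": "doc",
    "content": [
        {"type": "heading", "attrs": {"level": 1},
         "content": [{"type": "text", "text": "Designing the Feed"}]},
        {"type": "paragraph", "content": [
            {"type": "text", "text": "Caching is "},
            {"type": "text", "text": "essential", "marks": [{"type": "bold"}]},
            {"type": "text", "text": "."},
        ]},
        {"type": "image", "attrs": {"src": "https://cdn.example.com/img-640w.jpg",
                                    "alt": "architecture diagram"}},
    ],
}

def word_count(node) -> int:
    """Count words by walking text nodes -- the basis for read-time estimation."""
    if node.get("type") == "text":
        return len(node["text"].split())
    return sum(word_count(child) for child in node.get("content", []))
```

Because the document is plain data, this kind of traversal runs on the server with no client JavaScript involved.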
The Reading Plane serves published articles. When a reader requests an article, the Article Serving Service checks a Redis cache (keyed by article_id + version) for the rendered HTML. On cache miss, it fetches the JSON document from PostgreSQL, renders it to HTML with server-side rendering, caches the result, and returns it. A CDN (CloudFront) sits in front of the serving layer, caching full article pages at edge locations with a 5-minute TTL. Cache invalidation on article edit uses CloudFront invalidation API.
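The cache-aside read path described above can be sketched as follows. This is a simplified illustration: an in-process dict stands in for Redis, and the database fetch and server-side render are stubs:

```python
# Cache-aside read path for the Article Serving Service (sketch).
cache: dict[str, str] = {}

def fetch_article_json(article_id: str) -> dict:     # stand-in for PostgreSQL
    return {"title": "Hello", "content": [{"type": "paragraph"}]}

def render_html(doc: dict) -> str:                   # stand-in for SSR
    return f"<article><h1>{doc['title']}</h1></article>"

def serve_article(article_id: str, version: int) -> str:
    key = f"article:{article_id}:v{version}"         # keyed by article_id + version
    html = cache.get(key)
    if html is None:                                 # cache miss: render and fill
        html = render_html(fetch_article_json(article_id))
        cache[key] = html
    return html
```

Keying by version means an edit naturally produces a fresh cache entry, complementing the CDN-level invalidation.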
The Feed Plane generates personalized home feeds. A Feed Generation Service runs a two-stage recommendation pipeline: (1) candidate retrieval using collaborative filtering on user-topic interaction matrices, pulling ~500 candidate articles; (2) ranking using a gradient-boosted model (LightGBM) scoring articles on predicted read probability, incorporating features like topic affinity, author follow status, article freshness, and historical engagement rate. Feed results are pre-computed hourly for active users and cached in Redis sorted sets.
Core Components
Rich Text Editor & Document Model
The editor uses ProseMirror on the client side, producing a structured JSON document conforming to a custom schema. The schema defines node types (paragraph, heading, image, code_block, blockquote, embed) and mark types (bold, italic, link, code). This structured format enables server-side rendering without executing client JavaScript, prevents XSS attacks (no raw HTML storage), and allows efficient diffs for collaborative editing. The Article Service validates incoming documents against the schema using JSON Schema validation and rejects malformed content. Draft auto-save runs every 30 seconds via a debounced PUT request.
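The server-side validation step can be sketched as an allow-list check over the node and mark types the schema defines (in practice a full JSON Schema validator would also check attributes and nesting rules; this minimal version only enforces the type allow-list):

```python
# Minimal schema validation sketch: reject any node or mark type outside
# the allow-list. Node/mark names come from the schema described above.
ALLOWED_NODES = {"doc", "paragraph", "heading", "image",
                 "code_block", "blockquote", "embed", "text"}
ALLOWED_MARKS = {"bold", "italic", "link", "code"}

def validate(node) -> bool:
    if node.get("type") not in ALLOWED_NODES:
        return False
    if any(m.get("type") not in ALLOWED_MARKS for m in node.get("marks", [])):
        return False
    return all(validate(child) for child in node.get("content", []))

ok = validate({"type": "doc", "content": [{"type": "paragraph", "content": [
    {"type": "text", "text": "hi", "marks": [{"type": "bold"}]}]}]})
bad = validate({"type": "doc", "content": [{"type": "script"}]})  # rejected
```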
Recommendation Engine
The recommendation system processes 150M read events daily through a Kafka-based streaming pipeline. Events are consumed by a Feature Engineering Service that maintains user-topic affinity vectors (128-dimensional embeddings) in a feature store (Redis + S3). The candidate generation model uses matrix factorization (ALS) trained nightly on a Spark cluster, producing user-article relevance scores. The ranking model (LightGBM) is trained daily on click-through and read-completion data. For cold-start users (no reading history), the system falls back to trending articles within the user's selected topics, scored by recency-weighted engagement rate.
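The cold-start fallback's recency-weighted engagement score might look like the following sketch. The exponential decay and the 24-hour half-life are assumptions for illustration, not parameters stated in the design:

```python
# Cold-start fallback sketch: score trending articles by engagement rate
# decayed by article age. Half-life choice is an assumption.
HALF_LIFE_HOURS = 24.0

def trending_score(claps: int, reads: int, age_hours: float) -> float:
    engagement_rate = claps / max(reads, 1)        # guard against zero reads
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)   # halves every 24 hours
    return engagement_rate * decay

fresh = trending_score(claps=500, reads=2_000, age_hours=2)
stale = trending_score(claps=500, reads=2_000, age_hours=72)  # same engagement, older
```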
Engagement & Notification Service
The clap system uses an idempotent counter pattern. Each clap event is written to a Kafka topic partitioned by article_id. A consumer aggregates claps into per-article counters stored in DynamoDB (article_id → {total_claps, unique_clappers}). Per-reader clap limits (max 50 per article) are enforced by a Redis counter (key article_id:reader_id holding the reader's clap count) with a 24-hour TTL for rate limiting. Notifications are handled via a fan-out-on-write model for authors with fewer than 100K followers; for mega-authors, notifications are generated lazily on follower feed access.
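The per-reader cap can be enforced as sketched below; a dict stands in for the Redis-backed counter (in Redis this would be an INCRBY plus a TTL), and the key format is an assumption:

```python
# Per-reader clap limit sketch (max 50 claps per article per reader).
MAX_CLAPS = 50
clap_counts: dict[str, int] = {}   # stand-in for Redis counters

def register_claps(article_id: str, reader_id: str, count: int) -> int:
    """Return the number of claps actually accepted after applying the cap."""
    key = f"claps:{article_id}:{reader_id}"
    current = clap_counts.get(key, 0)
    accepted = max(0, min(count, MAX_CLAPS - current))
    clap_counts[key] = current + accepted          # in Redis: INCRBY + EXPIRE
    return accepted

first = register_claps("a1", "r1", 40)    # 40 accepted
second = register_claps("a1", "r1", 40)   # only 10 more fit under the cap
```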
Database Design
Articles are stored in PostgreSQL with a schema: articles (article_id UUID PK, author_id, title, subtitle, content_json JSONB, status ENUM(draft, published, unlisted, deleted), published_at, updated_at, reading_time_seconds, word_count). A GIN index on content_json enables efficient querying of article structure. Full-text search uses a dedicated Elasticsearch cluster indexing article title, subtitle, plaintext content, tags, and author name. The search index is updated asynchronously via CDC (Change Data Capture) from PostgreSQL using Debezium.
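The CDC consumer's transform step might look like the sketch below, which turns a simplified change event into an Elasticsearch index action. The event shape here is a stripped-down stand-in for Debezium's actual envelope, and the field names mirror the articles table above:

```python
# Sketch: map a (simplified) CDC change event to a search index document.
def to_search_doc(change_event: dict):
    after = change_event.get("after")              # row state after the change
    if after is None or after["status"] != "published":
        return None                                # deletes/drafts are not indexed
    return {
        "_index": "articles",
        "_id": after["article_id"],
        "title": after["title"],
        "subtitle": after["subtitle"],
        "author_id": after["author_id"],
        "published_at": after["published_at"],
    }

doc = to_search_doc({"op": "u", "after": {
    "article_id": "a1", "title": "T", "subtitle": "S",
    "author_id": "u9", "status": "published", "published_at": "2024-01-01"}})
```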
User relationships (follows) are stored in a separate PostgreSQL table: follows (follower_id, followee_id, created_at) with composite primary key and indexes on both follower_id and followee_id for bidirectional lookups. Reading history is stored in Cassandra (partition key: user_id, clustering key: read_at DESC) to handle the write-heavy nature of tracking every article read. Bookmarks use a DynamoDB table (user_id PK, article_id SK) for fast lookup and pagination.
API Design
- POST /api/v1/articles — Create a draft article; body contains title, content_json; returns article_id
- PUT /api/v1/articles/{article_id}/publish — Publish a draft; triggers CDN cache population and follower notifications
- GET /api/v1/feed?cursor={cursor}&limit=20 — Fetch personalized home feed with cursor-based pagination
- POST /api/v1/articles/{article_id}/clap — Register a clap (1-50); body contains count; idempotent within the reader limit
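Cursor-based pagination for the feed endpoint can be implemented with an opaque cursor encoding the last-seen position; the base64-of-JSON scheme below is one common choice, assumed here for illustration:

```python
import base64, json

# Opaque cursor sketch for GET /api/v1/feed: encodes the last-seen
# (published_at, article_id) pair so the next page can resume after it.
def encode_cursor(published_at: str, article_id: str) -> str:
    raw = json.dumps({"published_at": published_at, "article_id": article_id})
    return base64.urlsafe_b64encode(raw.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

cursor = encode_cursor("2024-05-01T12:00:00Z", "a42")
```

Unlike offset pagination, a position cursor stays stable when new articles are inserted at the head of the feed.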
Scaling & Bottlenecks
The read path is the primary bottleneck given the roughly 3,000:1 read-to-write ratio. Multi-layer caching addresses this: CDN edge caching handles 80% of article reads, Redis caching handles 15%, and only 5% of requests hit PostgreSQL. For viral articles (sudden traffic spikes), the CDN absorbs the burst while origin servers remain protected. Cache stampede on popular article cache expiry is mitigated using probabilistic early expiration (each request has a small chance of refreshing the cache before the TTL, spreading the refresh load).
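Probabilistic early expiration is often implemented in the "XFetch" style: the chance of an early refresh grows as the key approaches expiry, scaled by how expensive the value is to recompute. A sketch, with the beta tuning parameter as an assumption:

```python
import math, random, time

# Probabilistic early expiration ("XFetch") sketch: a request occasionally
# recomputes before the TTL expires, spreading refreshes for hot keys.
def should_refresh(expiry_ts: float, recompute_cost_s: float,
                   beta: float = 1.0, now: float = None) -> bool:
    now = time.time() if now is None else now
    u = 1.0 - random.random()       # uniform in (0, 1], avoids log(0)
    # -log(u) is exponentially distributed; a costlier recompute or larger
    # beta pushes the effective expiry earlier for this request.
    return now - recompute_cost_s * beta * math.log(u) >= expiry_ts
```

Requests that return True recompute and rewrite the cache entry; everyone else keeps serving the cached value, so the thundering herd at TTL expiry never forms.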
The recommendation pipeline must process 150M events daily and retrain models within a 6-hour window. The Spark cluster auto-scales based on the event backlog in Kafka. Feature serving latency is critical — the Redis-based feature store provides sub-millisecond lookups for the ranking model. For feed pre-computation, only active users (logged in within the last 7 days) have feeds pre-computed; inactive users get feeds generated on-demand with a slightly higher latency.
Key Trade-offs
- Structured JSON (ProseMirror) over raw HTML storage: Prevents XSS, enables server-side rendering, and supports schema evolution — but requires a more complex editor implementation and migration path when the schema changes
- Pre-computed feeds vs on-demand generation: Pre-computation provides sub-100ms feed latency but consumes significant compute and storage for users who may never check their feed — mitigated by only pre-computing for recently active users
- PostgreSQL + Elasticsearch vs PostgreSQL full-text search: Elasticsearch provides superior relevance ranking, faceted search, and typo tolerance, but adds operational complexity and eventual consistency for search results (indexing lag of 1-2 seconds)
- Fan-out-on-write for notifications vs fan-out-on-read: Write-time fan-out provides instant notifications but creates write amplification for popular authors — the hybrid approach (write for small followings, lazy for large) balances latency and cost
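The hybrid fan-out decision from the last trade-off reduces to a threshold check at publish time; the 100K cutoff comes from the design above, while the function shape is illustrative:

```python
# Hybrid notification fan-out sketch: eager write-time fan-out below the
# follower threshold, lazy read-time generation above it.
FANOUT_THRESHOLD = 100_000

def delivery_strategy(follower_count: int) -> str:
    if follower_count < FANOUT_THRESHOLD:
        return "fanout_on_write"   # push notifications to each follower now
    return "fanout_on_read"        # materialize lazily on follower feed access
```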