System Design: Wiki System (Wikipedia-scale)
System design of a Wikipedia-scale wiki covering collaborative editing with operational transformation, revision history, anti-vandalism detection, and serving 18 billion page views per month.
Requirements
Functional Requirements:
- Users create and edit articles collaboratively using wikitext or a visual editor
- Full revision history with diff viewing between any two revisions
- Real-time collaborative editing with conflict resolution (OT/CRDT)
- Discussion pages for each article for editorial debate
- Cross-article linking, categories, and navigation templates
- Watchlists allowing users to monitor changes to specific articles
Non-Functional Requirements:
- 60 million articles across 300+ languages; 18 billion page views/month
- Article render latency under 200ms for cached pages, under 2 seconds for cache misses
- 99.99% read availability; 99.9% write availability
- Strong consistency for edit submissions (no lost edits); eventual consistency for rendered page caches
- Handle edit conflicts gracefully with merge or notification
Scale Estimation
- Page views: 18 billion/month ≈ 7,000/sec average, 15,000/sec peak
- Edits: ~280,000 active editors/month making ~30 million edits/month ≈ 11.6 edits/sec
- Current article text: 60M articles × 50KB average wikitext = 3TB
- Revision history: ~50 revisions per article × 60M articles = 3 billion revisions; storing full text per revision would take 150TB, reduced to ~15TB with delta storage and compression
- Rendered HTML cache: 60M articles × 200KB average rendered HTML = 12TB
- Image and media assets: ~100M files totaling ~500TB on media servers
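These figures can be sanity-checked with a few lines of arithmetic (a sketch; a 30-day month is assumed):

```python
# Back-of-the-envelope check of the figures above (30-day month assumed).
SECONDS_PER_MONTH = 30 * 24 * 3600            # ~2.6M seconds

print(18e9 / SECONDS_PER_MONTH)               # ~6,944 page views/sec average
print(30e6 / SECONDS_PER_MONTH)               # ~11.6 edits/sec

ARTICLES, WIKITEXT_KB, HTML_KB = 60e6, 50, 200
print(ARTICLES * WIKITEXT_KB / 1e9)           # 3.0  -> 3TB current text
print(ARTICLES * HTML_KB / 1e9)               # 12.0 -> 12TB rendered HTML

REVISIONS = 50 * ARTICLES                     # 3 billion revisions
print(REVISIONS * WIKITEXT_KB / 1e9)          # 150.0 -> 150TB full-text history
```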
High-Level Architecture
The architecture separates the read path (serving rendered pages) from the write path (processing edits). The Read Path uses a multi-tier caching strategy: a CDN (Varnish-based cache layer) handles 90% of requests. On CDN miss, the request hits the Application Servers (MediaWiki PHP instances behind HAProxy), which check a local APC object cache, then a distributed Memcached cluster (900+ servers, 100TB+ of cached data). On full cache miss, the application fetches the wikitext from a MariaDB replica (writes route to the primary, per the topology described below), parses it through the Parsoid engine (wikitext → HTML), caches the rendered output, and returns it.
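A minimal sketch of that lookup cascade, with dict-backed stand-ins for each cache tier and for the database and parser (all names here are illustrative, not MediaWiki's actual interfaces):

```python
from typing import Optional

class Cache:
    """Dict-backed stand-in for one cache tier (CDN edge, APC, Memcached)."""
    def __init__(self):
        self.store: dict[str, str] = {}
    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)
    def set(self, key: str, value: str) -> None:
        self.store[key] = value

cdn, apc, memcached = Cache(), Cache(), Cache()

def fetch_wikitext(title: str) -> str:
    return f"''{title}'' article source"       # stand-in for a replica read

def parse(wikitext: str) -> str:
    return f"<p>{wikitext}</p>"                # stand-in for Parsoid

def render_page(title: str) -> str:
    key = f"rendered:{title}"
    # Try each tier in order of proximity; ~90% of traffic ends at the CDN.
    for tier in (cdn, apc, memcached):
        html = tier.get(key)
        if html is not None:
            return html
    # Full miss: fetch wikitext, parse (the expensive step), fill all tiers.
    html = parse(fetch_wikitext(title))
    for tier in (memcached, apc, cdn):
        tier.set(key, html)
    return html

print(render_page("Alan Turing"))   # miss: parses and caches
print(render_page("Alan Turing"))   # hit: served from the CDN tier
```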
The Write Path handles edits through optimistic concurrency control. When a user submits an edit, the application compares the edit's base revision against the current head revision. If they match, the edit is applied directly. If they diverge, a three-way merge is attempted, using the base revision as the common ancestor (diff3-style). If the merge succeeds automatically, the edit proceeds; if conflicts remain, the user is shown a conflict resolution interface. For real-time collaborative editing (VisualEditor), the system uses Operational Transformation (OT) via the VisualEditor's collaboration server, which maintains a shared editing session per article and transforms concurrent operations to preserve intention.
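A sketch of the edit-submission check, with an in-memory revision list and a deliberately toy line-level three-way merge (MediaWiki itself uses diff3; everything here is illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class Revision:
    rev_id: int
    text: str

@dataclass
class Page:
    revisions: list[Revision] = field(default_factory=list)  # append-only

    @property
    def head(self) -> Revision:
        return self.revisions[-1]

def merge3(base: str, ours: str, theirs: str):
    """Toy line-level three-way merge: take whichever side changed a line,
    conflict if both changed the same line. Bails out on line-count changes."""
    b, o, t = base.splitlines(), ours.splitlines(), theirs.splitlines()
    if not (len(b) == len(o) == len(t)):
        return None, True
    merged = []
    for base_line, our_line, their_line in zip(b, o, t):
        if our_line == their_line or their_line == base_line:
            merged.append(our_line)        # ours (or both sides agree)
        elif our_line == base_line:
            merged.append(their_line)      # only they changed it
        else:
            return None, True              # both changed it: conflict
    return "\n".join(merged), False

def submit_edit(page: Page, base_rev_id: int, new_text: str) -> Revision:
    head = page.head
    if head.rev_id != base_rev_id:         # someone saved while we edited
        base = next(r for r in page.revisions if r.rev_id == base_rev_id)
        new_text, conflict = merge3(base.text, new_text, head.text)
        if conflict:
            raise RuntimeError("edit conflict: show resolution UI")
    rev = Revision(head.rev_id + 1, new_text)
    page.revisions.append(rev)
    return rev
```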
The Parser Pipeline (Parsoid) converts wikitext to semantic HTML5 and back. Parsoid is a Node.js service that produces a rich DOM representation including RDFa annotations for templates, references, and transclusions. Template expansion is the most expensive parsing operation — popular templates used in millions of articles (infoboxes, navigation boxes) are cached at the template level to avoid redundant expansion.
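A sketch of template-level caching; the expansion function is a toy stand-in for Parsoid's template processor, and in production the memo would live in Memcached, keyed on the template's revision plus its arguments and invalidated when the template itself is edited:

```python
from functools import lru_cache

TEMPLATE_SOURCE = {  # toy template store; real template source lives in the DB
    "Infobox": "<table class=\"infobox\"><tr><th>{{{title}}}</th></tr></table>",
}

@lru_cache(maxsize=100_000)
def expand_template(name: str, args: tuple[tuple[str, str], ...]) -> str:
    # args is a tuple (hashable), so identical invocations across millions of
    # articles hit the memo instead of re-expanding the template.
    body = TEMPLATE_SOURCE[name]
    for key, value in args:
        body = body.replace("{{{%s}}}" % key, value)
    return body

html = expand_template("Infobox", (("title", "Alan Turing"),))
```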
Core Components
Collaborative Editing Engine
The collaborative editing system uses Operational Transformation (OT) for real-time concurrent editing. When multiple users edit the same article simultaneously via VisualEditor, each user's edits are represented as operations (insert, delete, retain) on the document. The OT server maintains a canonical operation history and transforms incoming operations against concurrent ones to preserve user intention. For example, if User A inserts text at position 10 and User B simultaneously deletes one character at position 5, User A's operation is transformed to insert at position 9 (accounting for the deletion). The system also supports CRDT-based editing as an experimental alternative, using Yjs for conflict-free merging of concurrent edits without a central server.
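A minimal sketch of that transform rule for single-character deletes (real OT also handles ranges, retains, and composition of whole operation sequences):

```python
from dataclasses import dataclass

@dataclass
class Insert:
    pos: int
    text: str

@dataclass
class Delete:
    pos: int          # deletes exactly one character at pos

def transform_insert(op: Insert, concurrent: Delete) -> Insert:
    """Shift an insert left when a concurrent delete removed a character
    before it: A's insert@10 against B's delete@5 becomes insert@9."""
    return Insert(op.pos - 1, op.text) if concurrent.pos < op.pos else op

doc = "collaborative"
a = Insert(10, "X")   # User A's operation
b = Delete(5)         # User B's concurrent operation

after_b = doc[:b.pos] + doc[b.pos + 1:]       # server applies B first...
a2 = transform_insert(a, b)                   # ...then transforms A against B
print(after_b[:a2.pos] + a2.text + after_b[a2.pos:])  # "X" lands before the 'i'
```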
Revision Storage & Diffing
Revision history stores every edit ever made. To minimize storage, revisions use delta compression: only the diff between consecutive revisions is stored, with periodic full snapshots (every 100 revisions). The diff algorithm uses a customized Myers diff optimized for wikitext, producing line-level and word-level diffs. Viewing a diff between two arbitrary revisions reconstructs both versions from their nearest snapshots plus accumulated deltas. The system maintains a revision table in MariaDB (rev_id, page_id, user_id, timestamp, comment, sha1) and stores the actual text content in external blob storage (content-addressable storage keyed by SHA-1 hash), enabling deduplication of identical content across revisions.
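A sketch of snapshot-plus-delta reconstruction, using difflib opcodes as a stand-in for the customized Myers diff (the snapshot interval matches the text above; the dict-based stores are illustrative):

```python
import difflib

SNAPSHOT_INTERVAL = 100   # full text every N revisions; deltas in between

def make_delta(old: str, new: str):
    """Store only what is needed to rebuild `new` from `old`
    (stand-in for the customized Myers diff)."""
    sm = difflib.SequenceMatcher(a=old, b=new)
    # 'equal' runs only need the (i1, i2) span from old; other tags carry text.
    return [(tag, i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in sm.get_opcodes()]

def apply_delta(old: str, delta) -> str:
    out = []
    for tag, i1, i2, new_text in delta:
        out.append(old[i1:i2] if tag == "equal" else new_text)
    return "".join(out)

def reconstruct(rev_id: int, snapshots: dict, deltas: dict) -> str:
    """Rebuild a revision from the nearest earlier snapshot plus deltas."""
    base = rev_id - (rev_id % SNAPSHOT_INTERVAL)
    text = snapshots[base]
    for r in range(base + 1, rev_id + 1):
        text = apply_delta(text, deltas[r])
    return text

snaps = {0: "Alan Turing was a mathematician."}
deltas = {1: make_delta(snaps[0], "Alan Turing was a British mathematician.")}
print(reconstruct(1, snaps, deltas))   # rev 1, rebuilt from the rev-0 snapshot
```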
Anti-Vandalism System
An ML-based vandalism detection system (ORES — Objective Revision Evaluation Service) scores each edit in real-time. The model (gradient-boosted trees trained on labeled edit data) evaluates features: edit size, character entropy, presence of profanity, user reputation score, time since last edit, and whether the edit removes references. Edits scoring above a vandalism threshold are automatically reverted or flagged for human review. The system scores the full edit stream (~11.6 edits/sec) within a latency budget of 500ms per edit. A separate bot framework (automated scripts) patrols recent changes and reverts obvious vandalism using rule-based heuristics (e.g., blanking of large article sections).
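A sketch of the feature-extraction step that feeds the classifier; feature names follow the list above, while the word list, threshold value, and the classifier object are stand-ins:

```python
import math
from collections import Counter

def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution; very low entropy
    (e.g. 'aaaaaaaa' spam) is a classic vandalism signal."""
    if not text:
        return 0.0
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())

PROFANITY = {"..."}   # real word list elided

def extract_features(old: str, new: str, user_reputation: float,
                     secs_since_last_edit: float) -> list[float]:
    return [
        float(len(new) - len(old)),                       # edit size
        char_entropy(new),                                # character entropy
        float(any(w in new.lower() for w in PROFANITY)),  # profanity present
        user_reputation,                                  # editor trust score
        secs_since_last_edit,                             # edit-velocity proxy
        float(new.count("<ref") < old.count("<ref")),     # references removed
    ]

VANDALISM_THRESHOLD = 0.9   # illustrative; tuned per wiki in practice

def triage(features: list[float], model) -> str:
    # `model` is any classifier with a scikit-learn-style predict_proba,
    # standing in for the gradient-boosted trees described above.
    score = model.predict_proba([features])[0][1]
    return "revert_or_flag" if score > VANDALISM_THRESHOLD else "accept"
```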
Database Design
The primary datastore is MariaDB (MySQL fork) in a multi-datacenter replicated configuration. The core schema includes: page (page_id, namespace, title, latest_rev_id, page_len, is_redirect), revision (rev_id, page_id, user_id, timestamp, comment, parent_rev_id, sha1, content_model), text (old_id, old_text compressed with zlib, old_flags). Indexes on page title enable fast lookups; the revision table is indexed on (page_id, rev_id DESC) for efficient history traversal. User data and permissions are in separate tables.
The database uses a primary-replica topology with section-based sharding: core tables (page, revision) are on dedicated servers, while auxiliary tables (watchlist, logging, recentchanges) are on separate servers to isolate write-heavy workloads. A separate analytics database (replicated with a 30-second lag) serves data for dashboards, research, and dump generation. The Memcached layer caches rendered HTML pages, parser output, and frequently accessed metadata (page existence checks for link coloring).
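A sketch of that routing decision (section names, table assignments, and the pool structure are illustrative):

```python
# Table-group -> DB section mapping keeps write-heavy auxiliary tables off
# the core page/revision servers. All names here are illustrative.
SECTION_BY_TABLE = {
    "page": "core", "revision": "core", "text": "core",
    "watchlist": "aux", "logging": "aux", "recentchanges": "aux",
}

def connection_for(table: str, write: bool, pools: dict):
    """pools maps section -> {'primary': conn, 'replica': conn}."""
    pool = pools[SECTION_BY_TABLE[table]]
    return pool["primary"] if write else pool["replica"]
```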
API Design
- GET /w/api.php?action=parse&page={title}&format=json — Fetch rendered HTML for an article with metadata
- POST /w/api.php?action=edit — Submit an edit; body contains title, text/wikitext, summary, baserevid, token; returns success or conflict info
- GET /w/api.php?action=query&prop=revisions&titles={title}&rvlimit=50 — Fetch revision history for an article
- GET /w/api.php?action=compare&fromrev={rev1}&torev={rev2} — Compute and return diff between two revisions
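For example, the parse and history calls against the public endpoint (a sketch; the response shapes shown assume the API's default formatversion):

```python
import requests

API = "https://en.wikipedia.org/w/api.php"

# Rendered HTML plus metadata for one article.
parsed = requests.get(API, params={
    "action": "parse", "page": "Alan Turing", "format": "json",
}).json()
html = parsed["parse"]["text"]["*"]

# Last 50 revisions (ids, user, timestamp, comment) for the same article.
history = requests.get(API, params={
    "action": "query", "prop": "revisions", "titles": "Alan Turing",
    "rvlimit": 50, "format": "json",
}).json()
```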
Scaling & Bottlenecks
The read path bottleneck is the wikitext parser (Parsoid). Parsing a complex article with hundreds of templates can take 5-10 seconds. This is addressed through aggressive caching: once an article is parsed, the rendered HTML is cached in Memcached with invalidation triggered only when the article or any of its transcluded templates are edited. Template-level caching further reduces parse time — a template used in 1M articles is parsed once and the output fragment is reused. Pre-rendering of popular articles during off-peak hours ensures cache warmth.
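The invalidation side of this caching depends on tracking which templates each cached render transcluded; a sketch with an in-memory inverted index (MediaWiki persists this as the templatelinks table; the queue object is illustrative):

```python
from collections import defaultdict
from queue import Queue

# Inverted index: template -> articles whose cached render transcluded it.
templatelinks: dict[str, set[str]] = defaultdict(set)

def record_render(article: str, templates_used: set[str]) -> None:
    for t in templates_used:
        templatelinks[t].add(article)

def on_template_edit(template: str, purge_queue: Queue) -> None:
    # A popular template can invalidate millions of cached pages, so the
    # purges are enqueued for asynchronous processing, not done inline.
    for article in templatelinks[template]:
        purge_queue.put(article)
```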
The write path bottleneck is edit conflict resolution during high-activity events (e.g., breaking news articles receiving dozens of simultaneous edits). The OT system handles concurrent VisualEditor sessions, but wikitext editing falls back to three-way merge, which can fail on structural conflicts. A soft-lock mechanism (edit notices warning that another user is actively editing) reduces conflict frequency. Database write throughput is managed through write-primary routing with read-replica fan-out; the primary handles ~12 writes/sec, well within MariaDB's capacity.
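A sketch of the soft-lock mechanism, assuming a Memcached-style cache with expiring keys (the key scheme and TTL are illustrative):

```python
import time

class TTLCache:
    """Minimal stand-in for Memcached get/set-with-expiry."""
    def __init__(self):
        self._d: dict[str, tuple[str, float]] = {}
    def get(self, key: str):
        hit = self._d.get(key)
        return hit[0] if hit and hit[1] > time.time() else None
    def set(self, key: str, value: str, ttl: float) -> None:
        self._d[key] = (value, time.time() + ttl)

def begin_edit(cache: TTLCache, page_id: int, user: str, ttl: int = 600):
    """Advisory only: never blocks the save, just powers the 'another user
    is editing this page' notice that reduces conflict frequency."""
    key = f"editing:{page_id}"
    holder = cache.get(key)
    cache.set(key, user, ttl)   # claim/refresh; expires if the editor leaves
    return holder if holder and holder != user else None
```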
Key Trade-offs
- Operational Transformation vs CRDT for collaborative editing: OT requires a central server and is harder to implement correctly, but provides precise conflict resolution; CRDTs are eventually consistent and serverless but can produce surprising merge results for structured documents — Wikipedia uses OT for its deterministic behavior
- Delta compression vs full revision storage: Delta storage reduces the 150TB full-text requirement to ~15TB, but reconstructing arbitrary revisions requires applying a chain of deltas — periodic snapshots (every 100 revisions) bound the reconstruction cost
- Wikitext vs structured document format: Wikitext is human-readable and has a 20-year ecosystem of tools, but is extremely complex to parse (hundreds of edge cases); migrating to a structured format would break millions of existing articles and editor workflows
- Aggressive caching vs real-time freshness: 90% CDN cache hit rate is essential for handling 15K requests/sec, but cache invalidation on template edits can cascade (editing a navigation template invalidates millions of cached pages) — handled via asynchronous purge queues with priority ordering, as sketched below
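A sketch of priority-ordered purge processing: hot pages are purged first so high-traffic URLs go stale only briefly, while long-tail pages can wait (the popularity signal and batch size are illustrative):

```python
import heapq

class PurgeQueue:
    """Min-heap on negative popularity: the hottest pages are purged first."""
    def __init__(self):
        self._heap: list[tuple[int, str]] = []

    def enqueue(self, url: str, views_per_day: int) -> None:
        heapq.heappush(self._heap, (-views_per_day, url))

    def drain(self, purge, batch: int = 1000) -> None:
        # Run by an async worker; `purge` issues the actual CDN purge request.
        for _ in range(min(batch, len(self._heap))):
            _, url = heapq.heappop(self._heap)
            purge(url)
```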