System Design: Flashcard & Spaced Repetition App
Design a scalable flashcard and spaced repetition system like Anki or Duolingo that schedules card reviews using SM-2 or FSRS algorithms to maximize long-term retention. Covers offline-first sync, collaborative deck sharing, and personalized scheduling.
Requirements
Functional Requirements:
- Users create, edit, and organize flashcard decks; cards support rich content (text, images, audio, LaTeX)
- Spaced repetition scheduler (SM-2 or FSRS algorithm) determines the optimal next review date for each card
- Users can import/export decks in Anki (.apkg) format and share decks publicly or with specific users
- Offline-first: users study without internet; reviews sync to server when connectivity resumes
- Collaborative decks: multiple contributors can add/edit cards in a shared deck with version history
- Study statistics: retention rates, review counts, forecasted daily review load
Non-Functional Requirements:
- App must function fully offline, with eventually consistent sync
- Support 20 million users, each with an average of 5,000 cards
- Scheduling computation must handle 100M card reviews/day globally
- Sync conflicts (same card edited offline on two devices) must be resolved without data loss
- Card media (images, audio) up to 10 MB per card
Scale Estimation
20M users × 5,000 cards average = 100B card records. Each card review record (review log) is ~100 bytes; at 100M reviews/day that's 10 GB of review log writes per day, or ~3.7 TB/year. The scheduler runs per-card after each review: with 100M reviews/day, that's ~1,157 scheduling computations/second — each a simple arithmetic operation, trivially handled in-process. The heavier concern is the "due cards" query: each of 20M users fetches their due cards at session start. If 10% of users open the app at roughly the same time (2M users), each running a query like SELECT * FROM cards WHERE user_id=? AND next_review <= NOW(), that's a peak on the order of 2M queries/second. This requires aggressive caching of due-card sets.
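The arithmetic behind these estimates, as a quick sanity check:

```python
# Back-of-the-envelope check of the figures above (values approximate).
users = 20_000_000
cards_per_user = 5_000
reviews_per_day = 100_000_000
bytes_per_review_log = 100

total_cards = users * cards_per_user                            # 1e11 -> 100B card records
log_gb_per_day = reviews_per_day * bytes_per_review_log / 1e9   # 10 GB/day of review logs
log_tb_per_year = log_gb_per_day * 365 / 1e3                    # ~3.7 TB/year
scheduling_ops_per_sec = reviews_per_day / 86_400               # ~1,157 updates/second
peak_due_queries = int(users * 0.10)                            # ~2M near-simultaneous app opens
```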
High-Level Architecture
The app is designed offline-first: the mobile/desktop client maintains a full local copy of the user's card database (SQLite on device). All study sessions write to the local SQLite store. A background sync service reconciles local changes with the server using a last-write-wins conflict resolution strategy keyed on per-card Lamport clocks to order concurrent edits. The server is the source of truth for sharing and backup, but is never in the critical path for studying.
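A minimal sketch of the local write path, assuming an on-device SQLite schema with card_schedules, review_logs, and change_log tables (the names are illustrative, not a prescribed schema); every review updates the schedule row and appends to a change log that the background sync service drains later:

```python
import sqlite3
import time
import uuid

def record_review_locally(db: sqlite3.Connection, card_id: str, rating: int,
                          new_interval_days: float, next_review_at: float) -> None:
    """Apply a review to the on-device store and queue it for later sync."""
    now = time.time()
    with db:  # one transaction: schedule update + append-only review log + sync queue entry
        db.execute(
            "UPDATE card_schedules SET interval_days = ?, next_review_at = ?, "
            "review_count = review_count + 1, updated_at = ? WHERE card_id = ?",
            (new_interval_days, next_review_at, now, card_id),
        )
        db.execute(
            "INSERT INTO review_logs (log_id, card_id, rating, reviewed_at) VALUES (?, ?, ?, ?)",
            (str(uuid.uuid4()), card_id, rating, now),
        )
        # The change_log table is drained by the background sync service when online.
        db.execute(
            "INSERT INTO change_log (entity, entity_id, changed_at) VALUES ('review', ?, ?)",
            (card_id, now),
        )
```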
The backend is organized around three services: the deck service (CRUD for cards and decks, sharing permissions), the review service (accepting review logs, running the scheduling algorithm, returning due card counts), and the sync service (diffing local and remote state, resolving conflicts, applying delta patches). The review service maintains a scheduled review table in PostgreSQL, indexed on (user_id, next_review_at) for efficient due-card queries. A Redis cache layer stores each user's due-card count (updated after sync) to avoid hitting PostgreSQL on every app open.
Media assets (images, audio) are stored in S3 with per-user path prefixes. Media sync is handled separately from card metadata sync — media files are content-addressed (hash-based filenames), so identical images across shared decks are stored once and referenced by multiple cards. The client downloads media lazily (on first view of a card) and caches locally with an LRU eviction policy.
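A sketch of how a content-addressed media key could be derived; the path layout and function name are assumptions, not the app's actual scheme:

```python
import hashlib
from pathlib import Path

def media_object_key(owner_id: str, file_path: Path) -> str:
    """Derive an S3 key from the file's content hash.

    Identical files hash to the same name, so an image reused across cards or
    shared decks maps to one object, and uploads can be skipped when the key
    already exists.
    """
    digest = hashlib.sha256(file_path.read_bytes()).hexdigest()
    return f"media/{owner_id}/{digest}{file_path.suffix.lower()}"
```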
Core Components
Spaced Repetition Scheduler
The scheduler implements the SM-2 algorithm (or the newer FSRS algorithm for more accurate scheduling). After each review, the user rates their recall (1-4: Again, Hard, Good, Easy). SM-2 computes the new interval as new_interval = old_interval × ease_factor, where the ease_factor adjusts up or down based on the rating. FSRS instead models each card's memory state (difficulty, stability, retrievability) with weights fitted to aggregate review data, giving more accurate long-term scheduling. The scheduler runs entirely in-process (no DB query needed) — it reads the card's current (interval, ease_factor, due_date) from the local SQLite DB and writes the updated values back. The server re-runs the same algorithm on the review log during sync to derive the authoritative next_review_at value.
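A simplified SM-2-style update mapped onto the 1-4 rating scale above. The constants (initial 1-day and 6-day intervals, the ease deltas, the 1.3 ease floor) follow the classic algorithm but are illustrative rather than a drop-in Anki implementation:

```python
from dataclasses import dataclass

AGAIN, HARD, GOOD, EASY = 1, 2, 3, 4

@dataclass
class CardSchedule:
    interval_days: float = 0.0
    ease_factor: float = 2.5   # SM-2's default starting ease
    repetitions: int = 0

def review(card: CardSchedule, rating: int) -> CardSchedule:
    """Simplified SM-2-style update for a 1-4 (Again/Hard/Good/Easy) rating."""
    if rating == AGAIN:
        # Failed recall: relearn from scratch and penalize the ease slightly.
        card.repetitions = 0
        card.interval_days = 1.0
        card.ease_factor = max(1.3, card.ease_factor - 0.20)
        return card

    # Successful recall: grow the interval multiplicatively.
    card.repetitions += 1
    if card.repetitions == 1:
        card.interval_days = 1.0
    elif card.repetitions == 2:
        card.interval_days = 6.0
    else:
        multiplier = {HARD: 1.2, GOOD: card.ease_factor, EASY: card.ease_factor * 1.3}[rating]
        card.interval_days = round(card.interval_days * multiplier, 1)

    # Ease drifts with the rating, floored at 1.3 as in SM-2.
    card.ease_factor = max(1.3, card.ease_factor + {HARD: -0.15, GOOD: 0.0, EASY: 0.15}[rating])
    return card
```

Under these constants, a fresh card rated Good, Good, then Easy would land at intervals of roughly 1, 6, and 19.5 days.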
Offline Sync Engine
The sync engine uses a CRDT-inspired approach. Each card has a lamport_clock value incremented on every edit. Sync proceeds in three phases: (1) the client uploads its local change log (card edits and review logs since last sync) as a delta batch; (2) the server applies server-side changes since the client's last sync timestamp and computes a diff; (3) the server returns the diff to the client, which applies it to the local SQLite DB. Conflicts (same card edited on client and server since last sync) are resolved by taking the version with the higher lamport_clock; if equal, the server version wins. Review logs (immutable append-only) never conflict — they're merged by union.
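The conflict rules can be expressed as small pure functions; the field names (lamport_clock, log_id, reviewed_at) mirror the description above, everything else is illustrative:

```python
from dataclasses import dataclass

@dataclass
class CardVersion:
    card_id: str
    lamport_clock: int
    payload: dict  # front/back content, media refs, etc.

def resolve_card_conflict(client: CardVersion, server: CardVersion) -> CardVersion:
    """Last-write-wins on the Lamport clock; the server copy wins ties."""
    return client if client.lamport_clock > server.lamport_clock else server

def merge_review_logs(client_logs: list[dict], server_logs: list[dict]) -> list[dict]:
    """Review logs are immutable and append-only, so merging is a union keyed by log_id."""
    merged = {log["log_id"]: log for log in server_logs}
    for log in client_logs:
        merged.setdefault(log["log_id"], log)
    return sorted(merged.values(), key=lambda log: log["reviewed_at"])
```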
Deck Sharing and Discovery Service
Public decks are indexed in Elasticsearch for discovery by topic, language, and tag. When a user subscribes to a shared deck, they receive a read-only copy that syncs updates from the deck owner. Collaborative decks use an operational transform-lite model: each card edit is stamped with the editor's user_id and a deck-level version counter. The deck service applies edits in version order and broadcasts changes to all collaborators via a long-poll endpoint (or WebSocket for real-time collaboration). Version history is stored as a log of deltas (similar to git commits) allowing rollback to any prior state.
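A sketch of the version-ordering rule for collaborative edits, with persistence and fan-out injected as callables since those details are not specified here; all field names are illustrative:

```python
from typing import Callable

def apply_in_version_order(deck_version: int,
                           pending_edits: list[dict],
                           persist: Callable[[dict], None],
                           broadcast: Callable[[dict], None]) -> int:
    """Apply card edits strictly in deck-version order.

    persist() appends the delta to the version-history log; broadcast() fans the
    change out to collaborators over long-poll or WebSocket connections.
    """
    for edit in sorted(pending_edits, key=lambda e: e["base_version"]):
        if edit["base_version"] != deck_version:
            # The editor worked from a stale snapshot; they must pull and resubmit.
            raise ValueError(
                f"stale edit based on v{edit['base_version']}; deck is at v{deck_version}")
        persist(edit)
        deck_version += 1
        broadcast(edit)
    return deck_version
```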
Database Design
PostgreSQL holds the server-side state:
- decks (deck_id, owner_id, title, description, visibility, version)
- cards (card_id, deck_id, front_content, back_content, media_refs[], created_at, updated_at, lamport_clock)
- card_schedules (card_id, user_id, interval_days, ease_factor, next_review_at, review_count, updated_at) — the hot table, partitioned by user_id hash
- review_logs (log_id, card_id, user_id, rating, reviewed_at, interval_before, interval_after) — append-only, partitioned by reviewed_at month
- deck_subscriptions (subscription_id, deck_id, subscriber_id, subscribed_at)
The card_schedules table has a composite index on (user_id, next_review_at) for efficient due-card queries.
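An illustrative DDL for the hot table, shown as a SQL string; column types and the partition count are assumptions:

```python
# PostgreSQL DDL sketch for card_schedules; types and MODULUS are assumptions.
CARD_SCHEDULES_DDL = """
CREATE TABLE card_schedules (
    card_id        BIGINT      NOT NULL,
    user_id        BIGINT      NOT NULL,
    interval_days  REAL        NOT NULL DEFAULT 0,
    ease_factor    REAL        NOT NULL DEFAULT 2.5,
    next_review_at TIMESTAMPTZ NOT NULL,
    review_count   INTEGER     NOT NULL DEFAULT 0,
    updated_at     TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (user_id, card_id)
) PARTITION BY HASH (user_id);

-- One child table per hash bucket, e.g.:
-- CREATE TABLE card_schedules_p0 PARTITION OF card_schedules
--     FOR VALUES WITH (MODULUS 16, REMAINDER 0);

-- Composite index serving: WHERE user_id = ? AND next_review_at <= NOW()
CREATE INDEX card_schedules_due_idx ON card_schedules (user_id, next_review_at);
"""
```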
API Design
- POST /sync — body: {last_sync_at, card_deltas: [...], review_logs: [...]}; returns {server_deltas: [...], conflicts: [...], new_sync_at}. The core sync endpoint, called on app foreground.
- GET /decks/discover?query={q}&tags={t}&language={lang} — returns public decks from Elasticsearch, paginated.
- GET /users/{user_id}/due-count — returns {due_now: N, due_today: M}; served from Redis, updated after sync.
- POST /decks/{deck_id}/import — multipart upload of an .apkg file; async parse and import, returns job_id.
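Hypothetical request and response payloads for POST /sync, matching the field names above; all values are placeholders:

```python
# Hypothetical /sync payloads; field names follow the endpoint description above.
sync_request = {
    "last_sync_at": "2024-05-01T07:12:00Z",
    "card_deltas": [
        {"card_id": "c-123", "lamport_clock": 42,
         "front_content": "What is spaced repetition?", "back_content": "..."},
    ],
    "review_logs": [
        {"log_id": "r-789", "card_id": "c-123", "rating": 3,
         "reviewed_at": "2024-05-01T07:11:40Z"},
    ],
}

sync_response = {
    "server_deltas": [],   # changes from other devices / shared decks since last_sync_at
    "conflicts": [],       # cards where the server version won last-write-wins, echoed back
    "new_sync_at": "2024-05-01T07:12:03Z",
}
```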
Scaling & Bottlenecks
The due-card query bottleneck is real: SELECT COUNT(*) FROM card_schedules WHERE user_id=? AND next_review_at <= NOW() must run for 2M users simultaneously at peak app-open time (morning commute surge). PostgreSQL with the (user_id, next_review_at) index can serve ~5k such queries/second per node. A 400-node PostgreSQL cluster is impractical — instead, maintain a Redis hash per user with the precomputed due count (updated on sync). App open reads from Redis (sub-millisecond) and only queries PostgreSQL during sync to recompute. This reduces DB load by 100x.
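A sketch of the Redis-backed due-count path, assuming the redis-py client and a psycopg-style PostgreSQL connection; key and field names are illustrative:

```python
import datetime as dt
import redis  # redis-py client

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def refresh_due_count(pg_conn, user_id: int) -> None:
    """Recompute the due count from PostgreSQL; runs during sync, not on app open."""
    with pg_conn.cursor() as cur:
        cur.execute(
            "SELECT COUNT(*) FROM card_schedules "
            "WHERE user_id = %s AND next_review_at <= NOW()",
            (user_id,),
        )
        due_now = cur.fetchone()[0]
    refreshed_at = dt.datetime.now(dt.timezone.utc).isoformat()
    r.hset(f"due:{user_id}", mapping={"due_now": due_now, "refreshed_at": refreshed_at})

def read_due_count(user_id: int) -> int:
    """Sub-millisecond read path used on app open; 0 if the user has never synced."""
    value = r.hget(f"due:{user_id}", "due_now")
    return int(value) if value is not None else 0
```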
Sync endpoint throughput: if 2M users sync in a short burst (e.g., after a major app release), each uploading ~50 KB of deltas, that is roughly 100 GB of ingress concentrated into a narrow window — beyond what a single region can comfortably absorb. Use regional sync endpoints (US, EU, Asia) with eventual cross-region replication for shared deck updates. Each region handles its own users' sync traffic independently.
Key Trade-offs
- SM-2 vs. FSRS: FSRS produces 15-20% better retention rates but is more computationally complex and harder to audit; SM-2 is battle-tested and understood by users who import from Anki. Offering FSRS as an opt-in setting respects existing user habits.
- Offline-first sync vs. real-time collaboration: Offline-first with eventual sync creates conflict scenarios in collaborative decks; fully online (no local DB) eliminates conflicts but breaks the core use case of studying on a plane without Wi-Fi.
- Last-write-wins vs. CRDT: LWW is simple but can silently discard edits in edge cases; true CRDTs (e.g., YATA for text) prevent data loss but are complex to implement for rich card content including media references.
- Content-addressed media vs. per-card media paths: Content addressing (hash-based filenames) deduplicates identical images across shared decks, reducing storage cost by potentially 50%, but makes deletion (garbage collecting unreferenced media) more complex.