Courses 0%
36
Additional Database Concepts · Chapter 36 of 42

Distributed Databases

Akhil
Akhil Sharma
20 min

Distributed Databases

Databases that spread data across multiple nodes and regions — combining the scalability of NoSQL with the transactional guarantees of SQL.

Distributed Databases: When One Server Isn't Enough (Google's Spanner Revolution) 🎯 Challenge 1: The Planetary Library Problem Imagine this scenario: You're building a library system that needs to serve the entire planet - billions of users, petabytes of data, 24/7 availability.

Traditional Single Database (Centralized):

yaml

Distributed Database (Decentralized):

yaml

Pause and think: What if your database could be spread across multiple servers, multiple datacenters, even multiple continents, working as one unified system?

The Answer: Distributed databases split data and processing across multiple nodes while appearing as a single system! It's like: ✅ Data partitioned across multiple servers (horizontal scaling) ✅ Each partition replicated for availability (fault tolerance) ✅ Nodes coordinate using consensus algorithms (consistency) ✅ Queries route to appropriate nodes automatically (transparency) ✅ Scales to planetary size (Google, Amazon, Facebook)

Key Insight: Distributed databases trade simplicity for massive scale and global availability!

🎬 Interactive Exercise: Single vs Distributed Database

Single Database (Monolithic):

yaml

Distributed Database (Horizontal):

yaml

The Trade-off:

Real-world parallel: Single database is like a skyscraper (limited height, expensive). Distributed database is like a city (unlimited growth, add more buildings).

🏗️ Types of Distributed Databases

Type 1: Distributed SQL (NewSQL)

yaml

Type 2: Eventually Consistent NoSQL

yaml

Type 3: Sharded Traditional DB

yaml

Type 4: Distributed Document Stores

yaml

Real-world parallel:

  • Distributed SQL = National chain with strict policies (consistent everywhere)
  • NoSQL = Franchises with local autonomy (eventual sync)
  • Sharded traditional = Multiple branches of same bank
  • Document stores = Flexible filing system across locations

🎮 Decision Game: Which Distributed Database?

Context: You're choosing a database for different use cases.

Scenarios: A. Global e-commerce platform (need ACID for orders) B. Social media feed (billions of posts, eventual consistency OK) C. Real-time analytics (massive data ingestion) D. Financial trading system (strong consistency critical) E. IoT sensor data (millions of devices) F. Multi-tenant SaaS (need isolation) G. Content management system (flexible schema) H. Gaming leaderboard (extremely high writes)

Options:

  1. Distributed SQL (CockroachDB/Spanner)
  2. Eventually Consistent NoSQL (Cassandra/DynamoDB)
  3. Sharded PostgreSQL (Citus)
  4. MongoDB Sharded Cluster

Answers:

🚨 Common Misconception: "Distributed = Eventually Consistent... Right?"

You might think: "All distributed databases sacrifice consistency."

The Reality: Modern distributed databases offer strong consistency!

Understanding Consistency Models:

Eventual Consistency:

Strong Consistency (Linearizability):

Google Spanner Example:

sql

How It's Possible:

Real-world parallel: Strong consistency is like a global conference call (everyone hears same thing, but takes time to coordinate). Eventual consistency is like email (everyone gets message, but at different times).

⚡ Distributed Consensus: How Nodes Agree

The Challenge:

Raft Consensus Algorithm:

Handling Failures:

CockroachDB Example:

sql

Real-world parallel: Raft is like a committee vote. Majority must agree before decision is official. If some members absent, majority of present members still sufficient.

🔧 Distributed Transactions: The Hard Problem

The Two-Phase Commit (2PC):

The Problem with 2PC:

Modern Solutions:

Saga Pattern:

Spanner's Solution:

Real-world parallel: 2PC is like getting signatures from multiple people. If courier (coordinator) lost in transit, everyone waits. Saga is like a reversible process - can undo if something fails.

💡 Sharding in Distributed Databases

Automatic Sharding (CockroachDB):

Manual Sharding (MongoDB):

javascript

Real-world parallel: Automatic sharding is like a valet parking service (handles distribution automatically). Manual sharding is like parking lot sections (you decide where to park).

🌐 Multi-Region Deployment Patterns

Pattern 1: Primary in One Region (Read Replicas Everywhere)

Pattern 2: Regional Primaries (Multi-Region Primary-Primary)

Pattern 3: Global Consensus (Spanner-style)

Real-world parallel:

  • Pattern 1 = Headquarters with branch offices
  • Pattern 2 = Regional headquarters (autonomous)
  • Pattern 3 = Federation (all regions vote on decisions)

💡 Final Synthesis Challenge: The Global Corporation

Complete this comparison: "A single database is like a company in one building. A distributed database is like..."

Your answer should include:

  • Data distribution
  • Consistency guarantees
  • Scalability approach
  • Global availability

Take a moment to formulate your complete answer...

The Complete Picture: A distributed database is like a multinational corporation with offices worldwide:

Headquarters + Branches (Primary-Replica): Central office makes decisions, branches execute locally ✅ Regional Autonomy (Multi-Primary): Each region operates independently, syncs periodically ✅ Federation (Consensus): All offices vote on major decisions (Raft/Paxos) ✅ Departments (Sharding): Each office handles specific customer segments ✅ Redundancy (Replication): Multiple offices have same information ✅ Global Scale: Can serve billions worldwide from nearest location ✅ Coordination: Offices communicate to maintain consistency

Benefits:

  1. Unlimited Scale - Add more nodes = more capacity (horizontal)
  2. Global Availability - Serve users from nearest datacenter (low latency)
  3. Fault Tolerance - One datacenter fails, others continue
  4. Geo-Distribution - Data in multiple locations (compliance, disaster recovery)

Trade-offs:

  • Complexity (distributed coordination)
  • Consistency challenges (or latency cost)
  • More expensive operations (network calls)
  • Harder to reason about (distributed state)

Real-world examples:

  • Spanner (Google): Global ACID transactions for AdWords
  • DynamoDB (Amazon): Massive scale for e-commerce
  • Cassandra (Netflix): Billions of viewing events
  • CockroachDB: Cloud-native distributed SQL
  • MongoDB Atlas: Global document database

When to use:

  • Data > 1TB and growing fast
  • Need global low latency
  • Single server can't handle load
  • High availability critical
  • Can accept complexity trade-off

When NOT to use:

  • Data < 100GB
  • Single region sufficient
  • Team lacks distributed systems expertise
  • Simple application

Distributed databases transform single-server limits into planetary-scale systems!

🎯 Quick Recap: Test Your Understanding Without looking back, can you explain:

  1. How do distributed databases differ from sharded traditional databases?
  2. What is Raft consensus and why is it needed?
  3. How does Google Spanner achieve strong consistency globally?
  4. When should you choose distributed SQL vs NoSQL?

Mental check: If you can design a distributed database architecture, you understand distributed databases!

🚀 Your Next Learning Adventure Now that you understand distributed databases, explore:

Advanced Topics:

  • Vector clocks and conflict resolution
  • CRDTs (Conflict-Free Replicated Data Types)
  • Distributed tracing and observability
  • Jepsen testing (distributed systems testing)

Distributed Databases:

  • CockroachDB deep dive
  • YugabyteDB architecture
  • TiDB (distributed MySQL)
  • FoundationDB (key-value store)

Consensus Algorithms:

  • Raft visualization
  • Paxos algorithm
  • Byzantine fault tolerance
  • Blockchain consensus

Real-World Case Studies:

  • How Uber built Schemaless
  • Discord's database scaling
  • Dropbox's migration to distributed storage
  • How Stripe handles distributed transactions

Key Takeaways

  1. Distributed databases spread data across multiple nodes and regions — enabling horizontal scalability beyond a single machine
  2. CAP theorem constrains distributed database design — you must choose between consistency and availability during network partitions
  3. Consensus protocols (Raft, Paxos) keep distributed nodes in agreement — at the cost of latency for cross-node coordination
  4. CockroachDB, Spanner, and YugabyteDB offer distributed SQL — combining the scalability of NoSQL with SQL's transactional guarantees
Chapter complete!

Course Complete!

You've finished all 42 chapters of

System Design Indermediate

Browse courses
Up next Consistent Hashing
Continue