Additional Database Concepts · Chapter 34 of 42

Master Slave Architecture

Akhil Sharma

 20 min 



The simplest database scaling pattern — one master handles all writes, replicas handle reads, and automated failover keeps things running when the master goes down.

Master-Slave Architecture: The Classic Database Pattern (Now Called Primary-Replica)

🎯 Challenge 1: The Single Chef Kitchen Problem

Imagine this scenario: You run a restaurant with one chef who does everything.

Single Chef (No Delegation):


Head Chef + Sous Chefs (Master-Slave Pattern):


Pause and think: What if your database had one server for writes and many servers for reads?

The Answer: Master-Slave architecture (now called Primary-Replica) separates write and read responsibilities! It's like:

✅ Master/Primary = Single source of truth (handles all writes)
✅ Slaves/Replicas = Copies that follow the master (handle reads)
✅ Data flows in one direction (master → slaves)
✅ Automatic promotion on failure (a slave becomes the master)
✅ Read scalability (add more slaves = more read capacity)

Terminology Note: The industry is transitioning from "Master-Slave" to "Primary-Replica" for inclusivity. The two terms mean the same thing, and most modern documentation uses Primary-Replica.

Key Insight: This pattern trades write scalability for read scalability: writes are still limited to what one primary can handle, but read capacity grows as you add replicas (until replication overhead and lag become the limit)!

🎬 Interactive Exercise: Write vs Read Workloads

Understanding the 90/10 Rule:


Single Database (No Replication):

Primary-Replica (With Replication):

The Math:
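
One way to make the math concrete is a back-of-the-envelope estimate. The sketch below assumes a 90/10 read/write split and a hypothetical capacity of 10,000 queries per second per server; the numbers and the function name `max_total_qps` are illustrative, not measurements. It also assumes every replica must apply the full write stream and that all reads go to replicas once at least one exists.

```python
def max_total_qps(per_server_qps: float, read_fraction: float, replicas: int) -> float:
    """Estimate the total query rate a primary-replica cluster can sustain.

    Illustrative assumptions: all servers have equal capacity, every replica
    applies every write, and reads are spread evenly across the replicas.
    """
    write_fraction = 1.0 - read_fraction
    if replicas == 0:
        # A single server handles both reads and writes.
        return per_server_qps
    # The primary serves only writes: total * write_fraction <= capacity.
    primary_limit = per_server_qps / write_fraction if write_fraction > 0 else float("inf")
    # Each replica applies all writes plus its share of the reads:
    # total * (write_fraction + read_fraction / replicas) <= capacity.
    replica_limit = per_server_qps / (write_fraction + read_fraction / replicas)
    return min(primary_limit, replica_limit)


for n in (0, 3, 9):
    print(n, "replicas:", round(max_total_qps(10_000, 0.9, n)), "qps")
```

With these assumed numbers, three replicas give roughly 25,000 queries per second and nine give roughly 50,000. No number of replicas pushes past roughly 100,000, because the primary still has to apply every write.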

Real-world parallel: Primary-Replica is like a company with one CEO (makes decisions) and many managers (answer questions). The CEO isn't overwhelmed because the managers handle most inquiries.

🏗️ How Primary-Replica Works (The Details)

The Write Path:


The Read Path:
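
The two paths can be sketched with a small in-memory model. This is a simplification, not how any particular database implements replication: the `Primary` and `Replica` classes and the list-based change log are hypothetical stand-ins for a real server and its binary log or write-ahead log.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def catch_up(self, log: list) -> None:
        """Apply any log entries this replica has not seen yet."""
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)

    def read(self, key):
        return self.data.get(key)


@dataclass
class Primary:
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # ordered change log shipped to replicas

    def write(self, key, value) -> int:
        """Apply a write locally, record it in the log, return the log position."""
        self.data[key] = value
        self.log.append((key, value))
        return len(self.log)


primary = Primary()
replica = Replica()

primary.write("user:1", "Alice")   # write path: always the primary
before = replica.read("user:1")    # read path: the replica has not caught up yet
replica.catch_up(primary.log)      # replication ships the log to the replica
after = replica.read("user:1")     # now the replica returns the new value
```

Here `before` is `None` and `after` is `"Alice"`, which is the one-directional flow described earlier: changes originate on the primary and reach replicas only through the log.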


Replication Modes:

Asynchronous (Default):


Synchronous:


Semi-synchronous (Best of Both):


Real-world parallel:

Async = Mail drop box (drop and go, delivered later)
Synchronous = Certified mail (wait for signature)
Semi-sync = Quick signature from receptionist (fast but confirmed)
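
The difference between the three modes comes down to how many replica acknowledgements the primary waits for before telling the client the write succeeded. The sketch below models that under simplifying assumptions: semi-synchronous means "one acknowledgement", synchronous means "all replicas", and acknowledgements arrive in parallel. The function names and latency figures are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    ASYNC = "async"
    SEMI_SYNC = "semi-sync"
    SYNC = "sync"


def acks_required(mode: Mode, replica_count: int) -> int:
    """How many replica acknowledgements the primary waits for."""
    if mode is Mode.ASYNC:
        return 0
    if mode is Mode.SEMI_SYNC:
        return min(1, replica_count)
    return replica_count


def commit_latency_ms(mode: Mode, local_ms: float, replica_rtts_ms: list) -> float:
    """Local commit time plus the wait for the k-th fastest acknowledgement."""
    k = acks_required(mode, len(replica_rtts_ms))
    if k == 0:
        return local_ms
    return local_ms + sorted(replica_rtts_ms)[k - 1]


rtts = [2.0, 40.0, 120.0]  # assumed round-trip times to three replicas
for mode in Mode:
    print(mode.value, commit_latency_ms(mode, 1.0, rtts), "ms")
```

With the assumed round trips, asynchronous commits take 1 ms, semi-synchronous 3 ms (the nearest replica sets the pace), and synchronous 121 ms (the slowest replica sets the pace).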

🎮 Decision Game: Which Replication Mode?

Context: You're configuring replication. Which mode should you use?

Scenarios:

A. Social media likes counter
B. Bank account balance
C. Blog post content
D. Shopping cart contents
E. User session data
F. Financial transaction log
G. Product catalog
H. Real-time analytics dashboard

Options:

Asynchronous (fast, might lose recent writes)
Semi-synchronous (balanced)
Synchronous (slow, no data loss)

Think about: Can you afford to lose data? Need fast writes?
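
Those two questions can be written down as a tiny decision helper. This is only one reasonable way to encode the trade-off, not an answer key; the function name and the exact rule are assumptions.

```python
def pick_replication_mode(can_lose_recent_writes: bool, needs_fast_writes: bool) -> str:
    """Map the two questions above to one of the three options."""
    if can_lose_recent_writes:
        # Losing the last few writes is tolerable, so take the fastest mode.
        return "asynchronous"
    if needs_fast_writes:
        # No data loss allowed, but waiting for every replica is too slow.
        return "semi-synchronous"
    # No data loss allowed and write latency is not the main concern.
    return "synchronous"
```

Try running each scenario through it: decide for yourself whether the data can be lost and whether writes must be fast, then see which mode comes out.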

Answers:

Configuration Examples:


🚨 Common Misconception: "Replicas Are Identical to Primary... Right?"

You might think: "Replica = exact copy at all times."

The Reality: Replicas are eventually consistent!

Replication Lag Scenario:

Real-World Bug:
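
A typical version of this bug: a user updates their profile, the page reloads, and the old value shows up. The sketch below reproduces it with plain dictionaries standing in for the primary and a replica; the function names and the `pending` queue are hypothetical.

```python
primary = {}
replica = {}
pending = []  # changes written on the primary but not yet applied on the replica


def update_profile(user_id: int, name: str) -> None:
    """Write path: goes to the primary and is queued for replication."""
    primary[user_id] = name
    pending.append((user_id, name))


def apply_replication() -> None:
    """Simulates the replica catching up after some lag."""
    while pending:
        user_id, name = pending.pop(0)
        replica[user_id] = name


def get_profile(user_id: int):
    # Bug: always reads from the replica, even right after a write.
    return replica.get(user_id)


primary[1] = replica[1] = "Old Name"
update_profile(1, "New Name")
stale = get_profile(1)   # the replica has not caught up: still the old value
apply_replication()
fresh = get_profile(1)   # after replication, the new value is visible
```

Here `stale` is `"Old Name"` and `fresh` is `"New Name"`. Nothing is broken in the database; the application simply read from a copy that was behind.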


Solutions:

Solution 1: Read Your Own Writes
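
One common way to get read-your-own-writes is to remember when each user last wrote and send that user's reads to the primary for a short window afterwards. A minimal sketch, assuming a 2-second window (an arbitrary figure that should exceed your typical lag) and a hypothetical `ReadYourWritesRouter` class:

```python
import time


class ReadYourWritesRouter:
    def __init__(self, lag_window_s: float = 2.0, clock=time.monotonic):
        self.lag_window_s = lag_window_s
        self.clock = clock            # injectable so the logic can be tested
        self.last_write_at = {}       # user_id -> time of that user's latest write

    def record_write(self, user_id) -> None:
        self.last_write_at[user_id] = self.clock()

    def target_for_read(self, user_id) -> str:
        last = self.last_write_at.get(user_id)
        if last is not None and self.clock() - last < self.lag_window_s:
            return "primary"   # this user wrote recently: avoid stale replicas
        return "replica"       # everyone else can read from a replica
```

Only the user who just wrote pays the cost of hitting the primary; other users still read from replicas and may briefly see the old value.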


Solution 2: Sticky Sessions to Primary
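
A simpler variant skips the timing logic: once a session performs a write, every later query in that session goes to the primary. The sketch below assumes a hypothetical `Session` object that lives as long as the user's session.

```python
class Session:
    def __init__(self):
        self.pinned_to_primary = False


def choose_database(session: Session, is_write: bool) -> str:
    """Pin the session to the primary from its first write onwards."""
    if is_write:
        session.pinned_to_primary = True
        return "primary"
    return "primary" if session.pinned_to_primary else "replica"
```

This is easy to reason about, but sessions that write early send all of their remaining reads to the primary, so it offloads less than the time-window approach.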


Solution 3: Check Replication Position
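
The most precise option is to compare positions: remember the log position your write reached on the primary, and only read from a replica that has applied at least that position. Real systems expose such positions in their own formats (for example GTIDs in MySQL or LSNs in PostgreSQL); the sketch below uses plain integers and a hypothetical `choose_read_target` function.

```python
def choose_read_target(required_position: int, replica_positions: dict) -> str:
    """Return a replica that has applied at least `required_position`,
    or "primary" if no replica is caught up yet."""
    for name, applied in replica_positions.items():
        if applied >= required_position:
            return name
    return "primary"


# Assumed example: our write landed at position 42 on the primary.
target = choose_read_target(42, {"replica-a": 40, "replica-b": 42})
```

In this example `target` is `"replica-b"`, since `replica-a` is still two entries behind. If every replica were behind, the read would fall back to the primary.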


Real-world parallel: Replication lag is like news propagation. Breaking news (primary) takes time to reach newspapers (replicas).

⚡ Failover: Promoting a Replica to Primary

Automatic Failover Flow:
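
At its core, the flow is: detect that the primary has missed several health checks, pick the healthiest and most caught-up replica, and promote it. The sketch below shows only that decision logic; the `Node` class, the threshold of three missed checks, and the function names are assumptions, and real tools add fencing, client redirection, and re-pointing of the remaining replicas.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool
    applied_position: int  # how far this node has applied the replication log


def elect_new_primary(replicas: list) -> Node:
    """Choose the healthy replica that has lost the least data."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")
    return max(candidates, key=lambda r: r.applied_position)


def failover(primary: Node, replicas: list, missed_checks: int, threshold: int = 3) -> Node:
    """Return the node that should be primary after this health-check round."""
    if primary.healthy or missed_checks < threshold:
        return primary  # a single missed check should not trigger a promotion
    return elect_new_primary(replicas)
```

Requiring several consecutive missed checks is a guard against promoting a replica because of one dropped network packet.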

When Old Primary Returns:

Split-Brain Prevention:
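
Split-brain means two nodes both believe they are the primary and both accept writes. Two widely used safeguards are a majority vote before any promotion and a fencing token (an epoch number that increases with every promotion). The sketch below illustrates both ideas with hypothetical names; it is not the mechanism of any specific tool.

```python
def has_quorum(votes_for_failover: int, cluster_size: int) -> bool:
    """A promotion is allowed only if a strict majority agrees."""
    return votes_for_failover > cluster_size // 2


class Storage:
    """Accepts writes only from the node holding the newest epoch."""

    def __init__(self):
        self.highest_epoch = 0
        self.rows = []

    def write(self, epoch: int, row) -> bool:
        if epoch < self.highest_epoch:
            return False  # stale primary from before the failover: rejected
        self.highest_epoch = epoch
        self.rows.append(row)
        return True


storage = Storage()
storage.write(1, "from old primary")        # accepted, epoch 1 is current
storage.write(2, "from new primary")        # accepted, a failover raised the epoch
rejected = not storage.write(1, "late write from old primary")
```

Here `rejected` is `True`: when the old primary comes back with its outdated epoch, its writes are refused, and it should rejoin as a replica of the new primary rather than resume its old role.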

Tools for Automatic Failover:

MySQL with MHA (Master High Availability):


PostgreSQL with Patroni:


Real-world parallel: Automatic failover is like a monarchy with clear succession rules. When the king dies, the crown prince automatically becomes king.

💡 Read-Write Splitting in Application

Application Pattern:
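
A minimal version of the pattern inspects each statement and returns the name of the server that should run it. The sketch below is an assumption-laden illustration: the `ReadWriteSplitter` class is hypothetical, it classifies statements by their first keyword only, and it returns connection names rather than opening real connections.

```python
import itertools


class ReadWriteSplitter:
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP")

    def __init__(self, primary: str, replicas: list):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str, in_transaction: bool = False) -> str:
        """Return the name of the server that should execute `sql`."""
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        if in_transaction or verb in self.WRITE_VERBS or self._replicas is None:
            return self.primary
        return next(self._replicas)


router = ReadWriteSplitter("primary", ["replica-1", "replica-2"])
```

Everything inside a transaction goes to the primary so that reads see the transaction's own writes. First-keyword matching is deliberately naive: statements such as `SELECT ... FOR UPDATE` need the primary too, so a production router needs a more careful rule.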


Django Database Router:
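
Django supports this pattern through database routers: plain classes that implement `db_for_read`, `db_for_write`, `allow_relation`, and `allow_migrate`. The sketch below follows that interface; the aliases `"default"`, `"replica1"`, and `"replica2"` are assumptions and must match the keys in your project's `DATABASES` setting.

```python
import random


class PrimaryReplicaRouter:
    """Send reads to a random replica and writes to the primary."""

    replicas = ["replica1", "replica2"]  # assumed aliases from DATABASES

    def db_for_read(self, model, **hints):
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        return "default"  # assumed alias of the primary

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases hold the same data, so relations between them are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Migrate the primary only; replicas receive schema changes via replication.
        return db == "default"
```

The router is activated by listing its dotted path in the `DATABASE_ROUTERS` setting. Note that it does nothing about replication lag on its own, so the read-your-own-writes techniques above still apply.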


Real-world parallel: Read-write splitting is like a store with separate areas for browsing (reads) and checkout (writes). Browsing can happen anywhere in the store, but checkout happens at one specific counter.

💡 Final Synthesis Challenge: The Organization Structure

Complete this comparison: "A database without primary-replica is like a company with no delegation. A primary-replica database is like..."

Your answer should include:

Write vs read separation
Scalability benefits
High availability
Trade-offs

Take a moment to formulate your complete answer...

The Complete Picture: A primary-replica database is like a well-organized company with clear delegation:

✅ Primary (CEO): Makes all decisions (writes), source of truth
✅ Replicas (Managers): Answer questions (reads), spread across locations
✅ Delegation: CEO not overwhelmed (writes are a minority of the workload)
✅ Scalability: Add more managers (replicas) = handle more inquiries (reads)
✅ Succession: CEO leaves, VP promoted (automatic failover)
✅ Global presence: Managers in each region (low-latency reads worldwide)
✅ Eventually consistent: New policy (write) takes time to reach all managers

Benefits:

Read Scalability - Add replicas = handle more read traffic
High Availability - Primary fails, promote replica
Disaster Recovery - Data exists on multiple servers
Geographic Distribution - Serve users from nearest replica
Offload Primary - Primary focuses on writes

Trade-offs:

Replication lag (eventual consistency)
Writes don't scale (still bottleneck on primary)
More complex application code (read-write splitting)
Split-brain risk (need proper failover)

Real-world examples:

Reddit: Read replicas for comment browsing
Stack Overflow: Replicas in multiple data centers
WordPress.com: Read replicas for blog views
Netflix: Replicas for catalog browsing

Primary-replica transforms single-server databases into scalable, highly available systems!

🎯 Quick Recap: Test Your Understanding

Without looking back, can you explain:

Why is it called primary-replica instead of master-slave now?
How does primary-replica handle read-heavy workloads?
What is replication lag and when does it matter?
How does automatic failover work?

Mental check: If you can design a primary-replica system, you understand the pattern!

🚀 Your Next Learning Adventure

Now that you understand primary-replica, explore:

Advanced Topics:

Multi-primary replication
Cascading replication
Delayed replicas (for recovery from mistakes)
Logical replication

Replication Tools:

ProxySQL (MySQL read-write splitting)
PgBouncer (PostgreSQL connection pooling)
HAProxy (database load balancing)
Orchestrator (MySQL topology management)

Related Concepts:

Database sharding
Distributed databases
CQRS (Command Query Responsibility Segregation)
Event sourcing

Real-World Case Studies:

How Wikipedia handles millions of readers
GitHub's database architecture
Instagram's replica lag handling
Twitter's database evolution

Key Takeaways

Master-slave architecture routes all writes to one master and reads to replicas — simple and effective for read-heavy workloads
The master is a single point of failure — automated failover with leader election is essential for production systems
Replica lag means reads may return stale data — acceptable for most use cases but not for operations like checking account balance after a transfer
Semi-synchronous replication balances consistency and performance: the master waits for at least one replica to acknowledge the change before confirming the commit to the client
