Engineering Concepts Glossary

Core concepts every engineer should understand — from distributed systems fundamentals to architecture patterns.

CAP Theorem Explained: Consistency, Availability, and Partition Tolerance

A clear, practical explanation of the CAP theorem — what it really means, how it applies to real distributed systems, common misconceptions, and how to discuss it in system design interviews.

cap-theorem, distributed-systems, consistency

RAG Explained: Retrieval-Augmented Generation for LLM Applications

A practical guide to Retrieval-Augmented Generation — how RAG works, when to use it over fine-tuning, implementation patterns, and production pitfalls to avoid.

rag, llm, retrieval

Vector Embeddings Explained: How Machines Understand Meaning

Learn how vector embeddings work, why they power modern AI search and RAG systems, how to choose embedding models, and common pitfalls in production.

vector-embeddings, embeddings, semantic-search

Prompt Engineering Explained: The Art and Science of Guiding LLMs

Master prompt engineering techniques — from zero-shot to chain-of-thought prompting, with practical patterns, anti-patterns, and interview preparation tips.

prompt-engineering, llm, ai-engineering

Fine-Tuning vs RAG Explained: Choosing the Right LLM Customization Strategy

Compare fine-tuning and RAG for LLM customization — when each approach wins, cost analysis, implementation complexity, and decision frameworks.

fine-tuning, rag, llm

Multi-Agent Systems Explained: Orchestrating Autonomous AI Workflows

Understand multi-agent AI systems — architectures, orchestration patterns, inter-agent communication, and when agents outperform single-prompt approaches.

multi-agent, ai-agents, orchestration

MCP Explained: Model Context Protocol for LLM Tool Integration

Understand the Model Context Protocol (MCP) — how it standardizes LLM-tool interaction, server architecture, and why it matters for AI engineering.

mcp, model-context-protocol, llm-tools

LLM Serving Explained: Deploying Language Models at Scale

Learn LLM serving infrastructure — batching strategies, KV cache optimization, quantization, and choosing between self-hosted and API-based deployments.

llm-serving, inference, deployment

Token Budgeting Explained: Managing LLM Costs and Context Windows

Master token budgeting for LLM applications — context window management, cost optimization strategies, prompt compression, and production best practices.

token-budgeting, llm, cost-optimization

Semantic Search Explained: Beyond Keyword Matching with AI

Understand semantic search — how it uses embeddings to find meaning-based matches, implementation with vector databases, and when it beats keyword search.

semantic-search, vector-search, embeddings

Transformer Architecture Explained: The Engine Behind Modern AI

Understand the transformer architecture — self-attention, positional encoding, encoder-decoder structure, and why transformers revolutionized NLP and beyond.

transformer, attention, deep-learning

Attention Mechanism Explained: How LLMs Focus on What Matters

Deep dive into the attention mechanism — scaled dot-product attention, multi-head attention, self-attention vs cross-attention, and key optimizations.

attention, self-attention, transformer

Embedding Models Explained: Choosing the Right Model for Your AI Application

Compare embedding models for search, RAG, and classification — model selection criteria, benchmarks, fine-tuning strategies, and production deployment tips.

embedding-models, embeddings, sentence-transformers

Chunking Strategies for RAG Explained: How to Split Documents for Optimal Retrieval

Learn RAG chunking strategies — fixed-size, semantic, recursive, and parent-document chunking with practical guidelines for chunk size and overlap.

chunking, rag, text-splitting

Hallucination in LLMs Explained: Why AI Models Make Things Up

Understand LLM hallucination — why models fabricate facts, detection techniques, mitigation strategies with RAG and guardrails, and evaluation methods.

hallucination, llm, reliability

AI Guardrails Explained: Building Safe and Reliable LLM Applications

Learn how to implement AI guardrails — input validation, output filtering, content moderation, jailbreak prevention, and production safety patterns.

ai-guardrails, ai-safety, content-moderation

Read Replicas Explained: Scaling Database Reads Without Sharding

How read replicas work — replication lag, consistency trade-offs, routing strategies, and when to use replicas vs caching or sharding for read scaling.

read-replicas, replication, databases

Connection Pooling Explained: Why Opening a New Database Connection Is Expensive

How database connection pooling works — why connections are expensive, pool sizing, PgBouncer vs application-level pools, and common misconfigurations.

connection-pooling, databases, performance

Materialized Views Explained: Precomputed Query Results for Fast Reads

How materialized views work — when to use them over regular views, refresh strategies, and real-world use cases for dashboards, reporting, and APIs.

materialized-views, databases, query-optimization

Change Data Capture Explained: Streaming Database Changes in Real Time

How Change Data Capture (CDC) works — Debezium, WAL-based capture, event-driven architectures, and keeping derived data stores in sync with your database.

cdc, change-data-capture, event-streaming

Write-Ahead Logging Explained: How Databases Survive Crashes

How WAL (Write-Ahead Logging) works — why databases write logs before data, crash recovery, checkpointing, and performance implications for durability.

wal, write-ahead-logging, databases

MVCC Explained: Multi-Version Concurrency Control in Databases

How MVCC lets databases serve concurrent reads and writes without readers and writers blocking each other: version chains, snapshot isolation, vacuum, and performance implications.

mvcc, concurrency-control, databases

Database Transactions Explained: Commit, Rollback, and Isolation in Practice

How database transactions work — BEGIN, COMMIT, ROLLBACK, savepoints, isolation levels, and common pitfalls with practical PostgreSQL and MySQL examples.

transactions, databases, isolation-levels

Optimistic vs Pessimistic Locking Explained: Concurrency Control Strategies

When to use optimistic vs pessimistic locking — version-based conflict detection vs exclusive locks, with real-world examples and implementation patterns.

locking, concurrency, optimistic-locking

Snowflake ID vs UUID Explained: Distributed ID Generation Strategies

Comparing Snowflake IDs and UUIDs for distributed systems — sortability, collision probability, database indexing impact, and choosing the right ID strategy.

snowflake-id, uuid, distributed-systems

Time-Series Data Modeling Explained: Storage, Indexing, and Query Patterns

How to model time-series data effectively — partitioning by time, downsampling, retention policies, and choosing between TimescaleDB, InfluxDB, and Cassandra.

time-series, data-modeling, databases

DNS Resolution Explained: How Domain Names Become IP Addresses

How DNS resolution works step by step: recursive resolvers, authoritative servers, caching, TTL, and why DNS failures can take large parts of the internet offline.

dns, networking, domain-resolution

TCP Three-Way Handshake Explained: How Connections Are Established

How the TCP three-way handshake works — SYN, SYN-ACK, ACK sequence, why it exists, connection states, and how it affects application latency.

tcp, networking, handshake

TLS/SSL Handshake Explained: How HTTPS Connections Are Secured

How the TLS handshake establishes encrypted connections — certificate verification, key exchange, TLS 1.2 vs 1.3, and performance implications for HTTPS.

tls, ssl, https

HTTP/2 Multiplexing Explained: Multiple Requests Over One Connection

How HTTP/2 multiplexing removes HTTP-level head-of-line blocking: streams, frames, server push, and why HTTP/2 is usually faster than HTTP/1.1 for modern web applications.

http2, multiplexing, networking

WebSocket Protocol Explained: Full-Duplex Communication Over TCP

How WebSockets work — the upgrade handshake, frame format, when to use WebSockets vs SSE or polling, and scaling WebSocket connections in production.

websocket, real-time, networking

Server-Sent Events Explained: One-Way Real-Time Streaming Over HTTP

How Server-Sent Events (SSE) work — EventSource API, automatic reconnection, and when to choose SSE over WebSockets for real-time server-to-client updates.

sse, server-sent-events, real-time

Long Polling Explained: Real-Time Updates Without WebSockets

How long polling works — holding HTTP connections open for server-push updates, timeout handling, and when long polling beats WebSockets or SSE.

long-polling, real-time, networking

CDN and Edge Computing Explained: Serving Content From the Nearest Location

How CDNs and edge computing work — caching layers, cache invalidation, edge functions, and designing systems that leverage geographic distribution.

cdn, edge-computing, caching

CORS Explained: Cross-Origin Resource Sharing and Browser Security

How CORS works — preflight requests, Access-Control headers, why browsers block cross-origin requests, and how to configure CORS correctly for your API.

cors, security, networking

What Happens When You Type a URL Explained: The Full Request Lifecycle

The complete journey from typing a URL to seeing a web page — DNS, TCP, TLS, HTTP, rendering, and every step in between explained for system design interviews.

url-lifecycle, networking, dns

Tail Latency Explained: Why P99 Matters More Than Average Response Time

Understanding tail latency — why p99 and p999 percentiles matter, what causes latency spikes, and how to measure and reduce tail latency in production systems.

tail-latency, p99, performance

Back-of-Envelope Estimation Explained: Quick Math for System Design

How to do back-of-envelope calculations in system design interviews — latency numbers, storage estimates, throughput math, and the key numbers every engineer should know.

estimation, system-design, capacity-planning

SLOs, SLIs, and SLAs Explained: Measuring and Guaranteeing Reliability

The difference between SLOs, SLIs, and SLAs — how to define reliability targets, measure them with error budgets, and use them in system design interviews.

slo, sli, sla

Chaos Engineering Explained: Breaking Systems to Make Them Stronger

How chaos engineering works — injecting failures in production to discover weaknesses, the principles behind Netflix's Chaos Monkey, and building resilient systems.

chaos-engineering, resilience, reliability

Blue-Green vs Canary Deployments Explained: Safe Release Strategies

How blue-green and canary deployment strategies work — traffic shifting, rollback speed, infrastructure costs, and choosing the right strategy for your system.

deployment, blue-green, canary

Consistent Hashing Explained: Distributing Data Without Reshuffling Everything

Learn how consistent hashing distributes data across nodes with minimal disruption when nodes join or leave, with real examples from DynamoDB and Cassandra.

consistent-hashing, distributed-systems, load-balancing

Raft Consensus Algorithm Explained: Making Distributed Nodes Agree

Understand the Raft consensus algorithm — leader election, log replication, and safety guarantees, with implementation details and interview tips.

raft, consensus, distributed-systems

Paxos Consensus Protocol Explained: The Foundation of Distributed Agreement

Demystify the Paxos consensus protocol — proposers, acceptors, and learners, with practical examples from Google Chubby and real interview scenarios.

paxos, consensus, distributed-systems

Vector Clocks Explained: Tracking Causality in Distributed Systems

Understand vector clocks — how they capture causal ordering of events across distributed nodes, detect conflicts, and compare to Lamport timestamps.

vector-clocks, distributed-systems, causality

Gossip Protocol Explained: How Distributed Nodes Share Information Like Rumors

Learn how gossip protocols propagate information across distributed clusters with epidemic-style communication, used by Cassandra, Consul, and SWIM.

gossip-protocol, distributed-systems, membership

Circuit Breaker Pattern Explained: Preventing Cascading Failures in Distributed Systems

Master the circuit breaker pattern for distributed systems — states, transitions, implementation with real examples from Netflix Hystrix and Resilience4j.

circuit-breaker, distributed-systems, resilience

Saga Pattern Explained: Managing Distributed Transactions Without Two-Phase Commit

Learn the saga pattern for distributed transactions — choreography vs orchestration, compensating actions, and real examples from e-commerce systems.

saga-pattern, distributed-transactions, microservices

CQRS Explained: Separating Reads and Writes for Scalable Systems

Understand CQRS (Command Query Responsibility Segregation) — why separating read and write models enables scalability, with practical implementation.

cqrs, distributed-systems, event-driven

Event Sourcing Explained: Storing What Happened Instead of Current State

Learn event sourcing — storing every state change as an immutable event, with real examples from banking, e-commerce, and event-driven architectures.

event-sourcing, distributed-systems, event-driven

Two-Phase Commit Protocol Explained: Coordinating Distributed Transactions

Understand the Two-Phase Commit (2PC) protocol — how it coordinates atomic transactions across distributed nodes, its blocking problem, and alternatives.

two-phase-commit, distributed-transactions, distributed-systems

Eventual Consistency Explained: When Good Enough Consistency Beats Perfect Consistency

Learn eventual consistency — what it guarantees, how it differs from strong consistency, real-world examples from DNS and DynamoDB, and interview strategies.

eventual-consistency, distributed-systems, consistency

Database Sharding Explained: Splitting Data Across Multiple Databases

Master database sharding — partitioning strategies, shard key selection, rebalancing challenges, and real examples from Instagram, Discord, and Vitess.

sharding, distributed-systems, databases

Database Replication Explained: Keeping Data in Sync Across Nodes

How database replication works in distributed systems — synchronous vs asynchronous, leader-follower vs multi-leader, replication lag, and production trade-offs.

replication, distributed-systems, databases

Load Balancing Explained: Distributing Traffic Across Servers

How load balancing works — algorithms, health checks, Layer 4 vs Layer 7, sticky sessions, and how Netflix and Google distribute billions of requests.

load-balancing, distributed-systems, scalability

Leader Election Explained: Choosing a Coordinator in Distributed Systems

How leader election works — Raft, Bully, and ZAB algorithms, why distributed systems need leaders, failure detection, and split-brain prevention.

leader-election, distributed-systems, consensus

Quorum in Distributed Systems Explained: Majority Rules for Consistency

How quorum works in distributed systems — read/write quorums, the W+R>N formula, sloppy quorums, and how Cassandra and DynamoDB use them.

quorum, distributed-systems, consistency

Heartbeat Mechanism Explained: Detecting Failures in Distributed Systems

How heartbeat mechanisms work — failure detection, timeout tuning, phi accrual detectors, gossip protocols, and how Kafka and Kubernetes use heartbeats.

heartbeat, distributed-systems, failure-detection

Bloom Filters Explained: Probabilistic Set Membership Testing

How Bloom filters work — hash functions, false positives, sizing formulas, and how Google, Cassandra, and CDNs use them to avoid expensive lookups.

bloom-filters, data-structures, distributed-systems

Merkle Trees Explained: Verifying Data Integrity at Scale

How Merkle trees work — hash trees for efficient data verification, anti-entropy repair, and how Git, Bitcoin, Cassandra, and IPFS use them.

merkle-trees, data-structures, distributed-systems

Consistent Reads Explained: Getting Fresh Data from Replicated Systems

How to achieve consistent reads in distributed databases: read-after-write consistency, monotonic reads, and strategies for handling replication lag.

consistent-reads, distributed-systems, consistency

Partition Tolerance Explained: Surviving Network Failures in Distributed Systems

How partition tolerance works — why network partitions are inevitable, CAP theorem implications, partition handling strategies, and real-world examples.

partition-tolerance, distributed-systems, cap-theorem

Split-Brain Problem Explained: When Distributed Systems Disagree on Who Is in Charge

How split-brain occurs in distributed systems — causes, consequences, fencing tokens, STONITH, quorum-based prevention, and real-world outage examples.

split-brain, distributed-systems, fault-tolerance

Write-Ahead Log (WAL) Explained: Durability Before Performance

How write-ahead logging works — crash recovery, log structure, checkpointing, and how PostgreSQL, SQLite, Kafka, and etcd use WAL for durability.

write-ahead-log, distributed-systems, databases

Idempotency Explained: Designing Safe Retries in Distributed Systems

How idempotency works — idempotency keys, at-least-once delivery, exactly-once semantics, and how Stripe, AWS, and Kafka handle duplicate requests.

idempotency, distributed-systems, api-design

Microservices Architecture Explained: Building Systems as Independent Services

Learn how microservices architecture works, when to use it, deployment patterns, real-world trade-offs, and how to discuss it in system design interviews.

microservices, architecture, distributed-systems

Monolith vs Microservices Explained: Choosing the Right Architecture

A practical comparison of monolithic and microservices architectures — when each wins, migration strategies, and how to reason about it in interviews.

monolith, microservices, architecture

Service Mesh Explained: Infrastructure for Microservices Communication

Understand how service meshes handle traffic management, security, and observability between microservices, with real-world examples and trade-offs.

service-mesh, istio, envoy

API Gateway Pattern Explained: The Front Door to Your Microservices

Learn how API gateways route, secure, and manage traffic from external clients to microservices, with implementation patterns and interview tips.

api-gateway, microservices, architecture

Event-Driven Architecture Explained: Building Reactive Distributed Systems

Learn how event-driven architecture decouples services through asynchronous events, with patterns like event sourcing, CQRS, and real-world trade-offs.

event-driven, architecture, kafka

Hexagonal Architecture Explained: Ports and Adapters for Clean Systems

Understand hexagonal architecture (ports and adapters) — how it isolates business logic from infrastructure, with practical code examples and trade-offs.

hexagonal-architecture, ports-and-adapters, clean-architecture

Domain-Driven Design Explained: Modeling Software Around Business Domains

Learn domain-driven design (DDD) fundamentals — bounded contexts, aggregates, ubiquitous language, and how DDD shapes microservices boundaries.

domain-driven-design, ddd, architecture

Strangler Fig Pattern Explained: Safely Migrating from Monolith to Microservices

Learn the strangler fig pattern for incrementally replacing legacy systems — migration strategies, routing techniques, and avoiding big-bang rewrites.

strangler-fig, migration, monolith

Sidecar Pattern Explained: Extending Services Without Changing Code

Understand the sidecar pattern for attaching cross-cutting functionality to services — logging, networking, security — without modifying application code.

sidecar-pattern, service-mesh, microservices

Backend for Frontend (BFF) Pattern Explained: Client-Specific API Layers

Learn the BFF pattern for building client-specific backends — why one API does not fit all, implementation strategies, and system design interview tips.

bff-pattern, api-design, microservices

Clean Architecture Explained: Decoupling Business Logic from Frameworks and Infrastructure

How Clean Architecture works — dependency inversion, layers, use cases, and why your business logic should never depend on frameworks or databases.

clean-architecture, software-architecture, dependency-inversion

Serverless Architecture Explained: When Functions Replace Servers

How serverless architecture works — cold starts, event-driven execution, cost models, and when serverless saves money vs when it becomes expensive.

serverless, aws-lambda, cloud-architecture

Publish-Subscribe Pattern Explained: Decoupling Producers from Consumers at Scale

How pub/sub works — topics, subscriptions, message ordering, at-least-once delivery, and real-world patterns with Kafka, SNS, and Google Pub/Sub.

pub-sub, messaging, event-driven

Observer Pattern Explained: Reacting to State Changes Without Tight Coupling

How the observer pattern works — subjects, observers, event-driven updates, and why it underpins reactive UIs, event buses, and pub/sub systems.

observer-pattern, design-patterns, event-driven

SOLID Principles Explained: Five Rules for Maintainable Object-Oriented Design

How SOLID principles work — Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion with real examples.

solid-principles, object-oriented-design, software-architecture

The Twelve-Factor App Explained: A Methodology for Building Deployable Software

The Twelve-Factor App methodology — codebase, dependencies, config, backing services, build/release/run, and why modern cloud apps follow these rules.

twelve-factor-app, cloud-native, devops

Bulkhead Pattern Explained: Isolating Failures to Protect the Whole System

How the bulkhead pattern works — thread pool isolation, connection limits, service partitioning, and why one slow dependency should never crash everything.

bulkhead-pattern, resilience, fault-isolation

Retry with Exponential Backoff Explained: Handling Transient Failures Gracefully

How retry with exponential backoff works — jitter, max retries, idempotency requirements, and why naive retries cause thundering herd failures.

retry, exponential-backoff, resilience

Transactional Outbox Pattern Explained: Reliable Event Publishing from Database Transactions

How the transactional outbox pattern works — dual-write problem, outbox table, CDC-based publishing, and guaranteed event delivery in microservices.

outbox-pattern, event-driven, microservices

Anti-Corruption Layer Explained: Protecting Your Domain from External System Complexity

How the Anti-Corruption Layer pattern works — translating between bounded contexts, isolating legacy systems, and keeping your domain model clean.

anti-corruption-layer, domain-driven-design, software-architecture

ACID Properties Explained: The Foundation of Reliable Database Transactions

Understand ACID properties in databases — Atomicity, Consistency, Isolation, Durability — with real-world examples, isolation levels, and interview tips.

acid, databases, transactions

BASE Properties Explained: Basically Available, Soft State, Eventually Consistent

Learn BASE properties for distributed databases — how they differ from ACID, why NoSQL systems adopt them, and when eventual consistency is the right choice.

base, databases, eventual-consistency

Database Indexing Explained: B-Tree, LSM Tree, and Hash Index Internals

Understand database indexing internals — B-Tree, LSM Tree, and Hash indexes — with performance characteristics, SQL examples, and interview guidance.

database-indexing, b-tree, lsm-tree

Normalization vs Denormalization Explained: Database Schema Design Trade-offs

Learn when to normalize or denormalize your database — with normal forms, real-world examples, SQL patterns, and practical decision frameworks.

normalization, denormalization, schema-design

Database Partitioning Explained: Range, Hash, List, and Composite Strategies

Master database partitioning strategies — range, hash, list, and composite — with practical examples, partition pruning, and system design applications.

database-partitioning, sharding, horizontal-scaling