Idempotency Explained: Designing Safe Retries in Distributed Systems
How idempotency works — idempotency keys, at-least-once delivery, exactly-once semantics, and how Stripe, AWS, and Kafka handle duplicate requests.
Idempotency
An operation is idempotent if performing it multiple times produces the same result as performing it once, making it safe to retry without unintended side effects.
What It Really Means
In distributed systems, messages get lost, connections time out, and clients retry. When a client sends a payment request and the connection drops after the server processes it but before the client receives the response, the client does not know if the payment went through. It retries. Without idempotency, the customer gets charged twice.
Idempotency guarantees that retrying an operation is safe. The server recognizes the duplicate request and returns the original result without executing the operation again. This is not just a nice-to-have — it is essential for any system that handles money, inventory, or state changes over unreliable networks.
The concept comes from mathematics: a function f is idempotent if f(f(x)) = f(x) for every x. In API design, it means that calling the same endpoint with the same parameters any number of times leaves the system in the same state as calling it once. PUT /users/42 {name: "Alice"} is naturally idempotent: setting the name to Alice ten times is the same as setting it once. POST /payments {amount: 50} is not: submitting it ten times charges $500.
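To make the definition concrete, here is a small Python check of f(f(x)) = f(x). The helper `is_idempotent` and the two sample functions are illustrative names invented for this sketch. Note that it only tests the property on the samples it is given, so it can disprove idempotency but not prove it.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_idempotent(f: Callable[[T], T], samples: Iterable[T]) -> bool:
    """Return True if f(f(x)) == f(x) for every sample value."""
    return all(f(f(x)) == f(x) for x in samples)


def set_name_to_alice(user: dict) -> dict:
    # Like PUT: overwrites the field with a fixed value.
    return {**user, "name": "Alice"}


def add_charge(account: dict) -> dict:
    # Like an unguarded POST: every application adds another $50.
    return {**account, "charged": account["charged"] + 50}


users = [{"name": "Bob"}, {"name": "Alice"}]
accounts = [{"charged": 0}, {"charged": 50}]

print(is_idempotent(set_name_to_alice, users))  # True
print(is_idempotent(add_charge, accounts))      # False
```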
How It Works in Practice
Idempotency Keys
The most common pattern: the client generates a unique identifier (idempotency key) for each logical operation and includes it in the request. The server stores the key and the result. On retry, the server finds the stored key and returns the cached result without re-executing.
Stripe accepts an idempotency key on any POST request, sent in the Idempotency-Key header.
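As a rough sketch of what such a request can look like, the Python below builds (but does not send) a charge request using only the standard library. The API key, amount, and test token are placeholders, and the helper name `build_charge_request` is made up for this example. The point it illustrates is that the key is generated once per logical payment and reused unchanged on every retry.

```python
import urllib.parse
import urllib.request
import uuid

STRIPE_CHARGES_URL = "https://api.stripe.com/v1/charges"


def build_charge_request(
    api_key: str, amount_cents: int, idempotency_key: str
) -> urllib.request.Request:
    """Build (but do not send) a charge request carrying an idempotency key."""
    body = urllib.parse.urlencode(
        {"amount": amount_cents, "currency": "usd", "source": "tok_visa"}
    ).encode()
    return urllib.request.Request(
        STRIPE_CHARGES_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Idempotency-Key": idempotency_key,
        },
    )


# Generate the key once per logical payment, then reuse it for every retry.
key = str(uuid.uuid4())
first_attempt = build_charge_request("sk_test_placeholder", 5000, key)
retry_attempt = build_charge_request("sk_test_placeholder", 5000, key)
```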
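If the client retries with the same Idempotency-Key, Stripe returns the original charge response without creating a second charge. If the retry reuses the key with different parameters, Stripe returns an error instead. Keys can be pruned once they are at least 24 hours old, after which a reused key is treated as a new request.
Amazon SQS FIFO queues use message deduplication IDs. When a producer sends a message with a deduplication ID, SQS accepts later sends with the same ID but does not deliver them again within the 5-minute deduplication interval.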
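The windowed deduplication idea can be modeled in a few lines of Python. This is an in-memory toy, not the SQS API: the class name `DedupWindowQueue`, the injectable clock, and the boolean return value are assumptions made so the behavior is easy to observe.

```python
import time
from typing import Callable, Dict, List


class DedupWindowQueue:
    """In-memory model of ID-based deduplication within a time window."""

    def __init__(
        self,
        window_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.window_seconds = window_seconds
        self.clock = clock
        self.messages: List[str] = []
        self._seen: Dict[str, float] = {}  # dedup ID -> time first accepted

    def send(self, body: str, dedup_id: str) -> bool:
        """Enqueue the message unless dedup_id was accepted within the window."""
        now = self.clock()
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < self.window_seconds:
            return False  # duplicate inside the window: not enqueued again
        self._seen[dedup_id] = now
        self.messages.append(body)
        return True


# A fake clock makes the 5-minute window observable without waiting.
fake_now = [0.0]
queue = DedupWindowQueue(clock=lambda: fake_now[0])
outcomes = [queue.send("order-1 created", dedup_id="order-1")]
fake_now[0] = 120.0   # retry after 2 minutes: inside the window
outcomes.append(queue.send("order-1 created", dedup_id="order-1"))
fake_now[0] = 301.0   # retry after the window has passed
outcomes.append(queue.send("order-1 created", dedup_id="order-1"))
print(outcomes)  # [True, False, True]
```

The last send shows the limit of any time-bounded scheme: a retry that arrives after the window is treated as a new message.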
Naturally Idempotent Operations
Some HTTP methods are idempotent by design:
- GET: Reading data does not change state
- PUT: Setting a value to X is the same regardless of how many times you do it
- DELETE: Deleting a resource that is already deleted has no additional effect
Non-idempotent by default:
- POST: Creating a resource or triggering an action — each call may create a new resource
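- PATCH: Depends on the patch. A relative change such as incrementing a counter adds to the total on every call, while setting a field to a fixed value is idempotent.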
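The difference between these methods shows up clearly when each one is repeated. The sketch below uses a toy in-memory store; `UserStore` and its method names are invented for illustration and only mimic the semantics of the HTTP methods.

```python
import itertools
from typing import Dict


class UserStore:
    """Toy in-memory resource store with HTTP-like operations."""

    def __init__(self) -> None:
        self.users: Dict[int, dict] = {}
        self._next_id = itertools.count(1)

    def put(self, user_id: int, user: dict) -> None:
        self.users[user_id] = dict(user)      # overwrite: idempotent

    def delete(self, user_id: int) -> None:
        self.users.pop(user_id, None)         # already gone is fine: idempotent

    def post(self, user: dict) -> int:
        user_id = next(self._next_id)         # new ID every call: not idempotent
        self.users[user_id] = dict(user)
        return user_id

    def patch_increment(self, user_id: int, field: str, by: int) -> None:
        self.users[user_id][field] += by      # relative update: not idempotent


store = UserStore()
for _ in range(3):
    store.put(42, {"name": "Alice", "logins": 0})
    store.delete(7)
print(len(store.users))           # 1: three PUTs and DELETEs equal one of each

for _ in range(3):
    store.post({"name": "Bob"})
    store.patch_increment(42, "logins", 1)
print(len(store.users))           # 4: each POST created another user
print(store.users[42]["logins"])  # 3: each increment added one
```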
Kafka Exactly-Once Semantics
Kafka's idempotent producer removes duplicates caused by producer retries. Each producer is assigned a Producer ID (PID), and each message carries a sequence number. The broker tracks the latest sequence number per PID and partition and drops any message whose sequence number it has already seen. On its own this gives exactly-once writes per partition within a single producer session; end-to-end exactly-once processing additionally relies on Kafka transactions.
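The sequence-number check can be sketched as a small in-memory model. This is a simplification written for this article, not Kafka's broker code: the `Broker` class and its return strings are hypothetical, and real brokers also deal with batches, producer epochs, and persistence.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class Broker:
    """Simplified model of per-(PID, partition) sequence tracking."""

    def __init__(self) -> None:
        self.logs: Dict[int, List[str]] = defaultdict(list)  # partition -> records
        self._last_seq: Dict[Tuple[int, int], int] = {}      # (pid, partition) -> seq

    def append(self, pid: int, partition: int, seq: int, value: str) -> str:
        last = self._last_seq.get((pid, partition), -1)
        if seq <= last:
            return "duplicate"        # already written: drop the retry
        if seq != last + 1:
            return "out_of_order"     # a gap means an earlier message is missing
        self.logs[partition].append(value)
        self._last_seq[(pid, partition)] = seq
        return "appended"


broker = Broker()
print(broker.append(pid=7, partition=0, seq=0, value="a"))  # appended
print(broker.append(pid=7, partition=0, seq=1, value="b"))  # appended
print(broker.append(pid=7, partition=0, seq=1, value="b"))  # duplicate
print(broker.append(pid=7, partition=0, seq=3, value="d"))  # out_of_order
```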
AWS and GCP
AWS uses client tokens for idempotent API calls. RunInstances with a client token prevents launching duplicate EC2 instances on retry.
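Google Cloud Pub/Sub delivers at least once by default. Each message carries a server-assigned message ID that subscribers can use for deduplication, and pull subscriptions can opt in to exactly-once delivery.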
Implementation
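One possible shape for the server side of the idempotency-key pattern is sketched below in Python. Everything here is an assumption made for illustration: the `IdempotentHandler` class, the in-memory dictionary standing in for a real store, and the `replayed` flag in the response. It is not thread-safe and does not survive a restart; it only shows the lookup, execute, and store steps, plus a check that a reused key carries the same request body.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StoredResult:
    fingerprint: str  # hash of the original request body
    response: dict    # response returned to the first caller


class IdempotentHandler:
    """Runs an operation at most once per idempotency key."""

    def __init__(self, operation: Callable[[dict], dict]) -> None:
        self._operation = operation
        self._results: Dict[str, StoredResult] = {}

    def handle(self, idempotency_key: str, body: dict) -> dict:
        fingerprint = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        stored = self._results.get(idempotency_key)
        if stored is not None:
            if stored.fingerprint != fingerprint:
                raise ValueError("idempotency key reused with a different body")
            return {**stored.response, "replayed": True}  # cached result
        response = self._operation(body)  # first time: execute for real
        self._results[idempotency_key] = StoredResult(fingerprint, response)
        return {**response, "replayed": False}


charges = []


def create_charge(body: dict) -> dict:
    charges.append(body["amount"])
    return {"charge_id": f"ch_{len(charges)}", "amount": body["amount"]}


handler = IdempotentHandler(create_charge)
key = str(uuid.uuid4())
first = handler.handle(key, {"amount": 50})
retry = handler.handle(key, {"amount": 50})
print(first["charge_id"], retry["charge_id"], len(charges))  # ch_1 ch_1 1
```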
Trade-offs
Advantages
- Safe retries: Clients can retry freely without causing duplicate side effects
- Simplified error handling: Clients do not need to distinguish between "request failed" and "request succeeded but response was lost"
- Network resilience: Works seamlessly over unreliable connections (mobile networks, inter-service calls)
- Exactly-once semantics: Combined with at-least-once delivery, achieves effectively exactly-once processing
Disadvantages
- Storage overhead: Idempotency keys and their results must be stored (typically 24-48 hours)
- Key management complexity: Clients must generate and manage unique keys correctly
- Concurrency challenges: Two identical requests arriving simultaneously must be handled (use locking or compare-and-swap)
- TTL decisions: Too short and late retries fail. Too long and storage grows. 24 hours is a common compromise.
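The concurrency problem in particular benefits from an example. The sketch below shows one way to let only the first of several simultaneous requests execute while the others wait for its result. `SingleFlightStore` is a hypothetical name, and the lock-protected dictionary stands in for whatever atomic claim the real store offers (a unique constraint, or a compare-and-swap). Failure handling is deliberately simplified: if the operation raises, waiters are released and see None.

```python
import threading
import time
from typing import Callable, Dict, List


class _Entry:
    def __init__(self) -> None:
        self.done = threading.Event()
        self.result: object = None


class SingleFlightStore:
    """Lets exactly one caller per key run the operation; others wait."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[str, _Entry] = {}

    def run_once(self, key: str, operation: Callable[[], object]) -> object:
        with self._lock:
            entry = self._entries.get(key)
            owner = entry is None
            if owner:
                entry = _Entry()
                self._entries[key] = entry  # atomic claim of the key
        if owner:
            try:
                entry.result = operation()
            finally:
                entry.done.set()            # release waiters even on failure
        else:
            entry.done.wait()               # duplicate: wait for the owner
        return entry.result


calls: List[int] = []


def charge() -> str:
    calls.append(1)
    time.sleep(0.05)  # widen the race window
    return "ch_1"


store = SingleFlightStore()
results: List[object] = []
threads = [
    threading.Thread(target=lambda: results.append(store.run_once("key-1", charge)))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(calls), set(results))  # 1 {'ch_1'}
```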
Storage Strategies for Idempotency Keys
| Strategy | Pros | Cons |
|---|---|---|
| In-memory (Redis) | Fast, TTL support built-in | Keys can be lost on restart or failover unless persistence is configured, which reopens the duplicate window |
| Database table | Durable, transactional with the main operation | Slower, schema migration needed |
| Distributed cache | Scales horizontally | Eventual consistency risk |
Common Misconceptions
- "GET requests do not need idempotency consideration" — True for pure reads, but some "GET" endpoints trigger side effects (logging, analytics events, rate limit counters). If your GET has side effects, you have a design problem.
- "Idempotency keys should be derived from the request body" — If you hash the request body to generate the key, two different intentional requests with identical bodies (e.g., two $50 charges to the same user) would be deduplicated incorrectly. The client should generate a unique key for each intended operation.
- "Idempotency means the response is always the same" — The side effect is the same (payment charged once), but the response might differ (first call returns 201 Created, retry returns 200 OK with the cached result).
- "At-most-once delivery is the same as idempotency" — At-most-once never redelivers, so a lost message stays lost. Idempotency pairs with retries: the first request is processed, and duplicates get the cached result.
- "You only need idempotency for payment systems" — Any mutating operation benefits: sending emails, creating accounts, provisioning infrastructure, publishing messages. If a retry can cause harm, you need idempotency.
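The body-hash misconception is easy to demonstrate. In the sketch below, `body_hash_key` and `process` are hypothetical helpers, and `process` is a bare-bones deduplicating server. Two intentional $50 purchases with identical bodies collapse into one when the key is derived from the body, while client-generated keys keep them apart and still absorb a retry.

```python
import hashlib
import json
import uuid
from typing import List, Tuple


def body_hash_key(body: dict) -> str:
    """Derive a key from the request body (the approach being warned against)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def process(requests: List[Tuple[str, dict]]) -> List[int]:
    """Charge once per distinct key and return the amounts charged."""
    seen = set()
    charged = []
    for key, body in requests:
        if key in seen:
            continue  # treated as a retry
        seen.add(key)
        charged.append(body["amount"])
    return charged


body = {"user": "u_42", "amount": 50}

# Two intentional purchases, keys derived from the body.
derived = [(body_hash_key(body), body), (body_hash_key(body), body)]

# Two intentional purchases with client-generated keys; the second is retried once.
k1, k2 = str(uuid.uuid4()), str(uuid.uuid4())
client_keys = [(k1, body), (k2, body), (k2, body)]

print(process(derived))      # [50]      second purchase wrongly dropped
print(process(client_keys))  # [50, 50]  both purchases charged, retry absorbed
```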
How This Appears in Interviews
Idempotency comes up in API design and system design interviews:
- "A customer was charged twice. How do you prevent this?" — Idempotency keys on the payment API. The client generates a unique key per payment intent and includes it in every request (including retries).
- "Design an API for a banking transfer" — POST with an idempotency key. Store the key in the transactions table with a unique constraint. Use database transactions to make the debit, credit, and key storage atomic.
- "How does Kafka achieve exactly-once processing?" — Idempotent producers (PID + sequence number) prevent duplicate messages. Combined with transactions, this extends to exactly-once stream processing.
- "Your microservice calls another service that times out. What do you do?" — Retry with the same idempotency key. The downstream service recognizes the duplicate and returns the cached result.
- "How do you handle duplicate webhook deliveries?" — Store processed webhook IDs. Before processing, check if the ID has been seen. This is idempotent consumption.
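The banking transfer answer can be sketched with SQLite from the Python standard library. The schema, account names, and the `transfer` helper are assumptions for illustration. The key's primary-key constraint rejects duplicates, and a single transaction keeps the key insert, the debit, and the credit atomic.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (
            id TEXT PRIMARY KEY,
            balance INTEGER NOT NULL CHECK (balance >= 0)
        );
        CREATE TABLE transfers (
            idempotency_key TEXT PRIMARY KEY,
            src TEXT NOT NULL,
            dst TEXT NOT NULL,
            amount INTEGER NOT NULL
        );
        INSERT INTO accounts VALUES ('alice', 100), ('bob', 0);
        """
    )
    return conn


def transfer(conn: sqlite3.Connection, key: str, src: str, dst: str, amount: int) -> bool:
    """Apply the transfer once per key. Returns False for a duplicate key."""
    with conn:  # one transaction: commit on success, roll back on error
        try:
            conn.execute(
                "INSERT INTO transfers VALUES (?, ?, ?, ?)", (key, src, dst, amount)
            )
        except sqlite3.IntegrityError:
            return False  # key already recorded: this is a retry
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
    return True


def balance(conn: sqlite3.Connection, account: str) -> int:
    return conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account,)
    ).fetchone()[0]


conn = open_bank()
print(transfer(conn, "key-1", "alice", "bob", 30))  # True: applied
print(transfer(conn, "key-1", "alice", "bob", 30))  # False: duplicate ignored
print(balance(conn, "alice"), balance(conn, "bob"))  # 70 30
```

If the debit violates the balance check, the exception rolls back the whole transaction, including the key insert, so the same key can be retried later.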
Related Concepts
- Write-Ahead Log — WAL replay must be idempotent for correct crash recovery
- Replication — idempotent replication prevents duplicate data on replicas
- Quorum — quorum writes with retries require idempotency at the storage level
- Consistent Reads — idempotency ensures writes are applied once, reads reflect them consistently