System Design: Read Receipts System
System design for a read receipts system covering delivery status tracking (sent, delivered, read), efficient fan-out in group conversations, and handling offline scenarios.
Requirements
Functional Requirements:
- Track three message states: sent (server received), delivered (recipient device received), read (recipient viewed)
- Display status indicators to the sender in real time (single check = sent, double check = delivered, blue checks = read)
- Support read receipts for group conversations (show who has read the message)
- User privacy control: option to disable sending read receipts
- Efficient bulk read receipt marking (scrolling through a conversation marks all visible messages as read)
Non-Functional Requirements:
- Process 100 billion status updates per day (each message generates ~3 status events)
- Status update propagation to the sender within 2 seconds
- 99.9% receipt accuracy, and never show a false 'read' indicator: a message must not appear read before the recipient has actually viewed it
- Minimal storage overhead: receipts should not exceed 10% of message storage
- Handle offline scenarios gracefully: deliver batched receipts upon reconnection
Scale Estimation
With 100 billion status updates per day (sent + delivered + read events for ~33 billion messages), the system processes 1.16 million status events per second. Each status event is compact: ~48 bytes (message_id + user_id + status + timestamp). Total status data: ~4.8TB per day. In group conversations (average 20 members), a single message generates nearly 60 events: 1 sent + 19 delivered + 19 read status events, plus the fan-out pushes that relay those statuses back to the sender, creating significant write amplification. Fan-out of status updates to senders requires approximately 33 billion sender-push events per day (~382K pushes/sec).
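The arithmetic behind these figures, as a quick sanity check:

```python
# Back-of-envelope check of the estimates above.
SECONDS_PER_DAY = 86_400

status_events_per_day = 100e9                                # sent + delivered + read
events_per_sec = status_events_per_day / SECONDS_PER_DAY     # ~1.16M events/sec

EVENT_BYTES = 48                                             # message_id + user_id + status + timestamp
status_bytes_per_day = status_events_per_day * EVENT_BYTES   # ~4.8 TB/day

sender_pushes_per_day = 33e9                                 # roughly one push stream per message
pushes_per_sec = sender_pushes_per_day / SECONDS_PER_DAY     # ~382K pushes/sec
```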
High-Level Architecture
The read receipt system is a lightweight event pipeline layered on top of the messaging infrastructure. When the server receives a message from the sender, it immediately returns a 'sent' acknowledgment. When the message is delivered to the recipient's device, the recipient's client sends a 'delivered' event. When the recipient views the message (the message becomes visible in the viewport), the client sends a 'read' event. Each event flows: Client → WebSocket Gateway → Receipt Service → Status Store (Redis) → Fan-out to sender via WebSocket.
The Receipt Service is stateless and horizontally scalable. It receives status events, writes to a Redis hash for the message (receipt:{message_id} → {user_id: status_timestamp}), and emits a fan-out event to the sender's WebSocket Gateway. For group conversations, the service aggregates read receipts: rather than pushing every individual 'User X read your message' event immediately, it batches group receipt updates and pushes a summary ('15 of 20 members have read') every 2 seconds.
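The hot path fits in a few lines. The following is a minimal sketch, assuming a redis-py client and a hypothetical publish_to_gateway hook into the sender's WebSocket Gateway:

```python
import time
import redis

r = redis.Redis()  # hot-path status store

def publish_to_gateway(user_id: str, payload: dict) -> None:
    ...  # hypothetical hook: route the payload to the user's WebSocket gateway

def handle_status_event(message_id: str, sender_id: str,
                        user_id: str, status: str) -> None:
    """Record a 'D' (delivered) or 'R' (read) event and notify the sender."""
    # Compact per-message hash: receipt:{message_id} -> {user_id: "R:<unix_ts>"}
    r.hset(f"receipt:{message_id}", user_id, f"{status}:{int(time.time())}")
    # Fan out toward the sender (immediate for one-on-one, batched for groups).
    publish_to_gateway(sender_id, {
        "type": "receipt_update",
        "message_id": message_id,
        "status": "read" if status == "R" else "delivered",
        "by": user_id,
    })
```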
Bulk read marking is critical for performance. When a user scrolls through a conversation and 50 messages become visible, the client doesn't send 50 individual read events. Instead, it sends a single 'read up to message_id X' event. The Receipt Service marks all messages in that conversation with ID ≤ X as read for that user in a single batch operation. This reduces the event volume by 10-50x compared to per-message receipts.
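Client-side, the bulk event could look like this sketch, where send_event is a hypothetical wrapper over the existing WebSocket connection:

```python
def send_event(payload: dict) -> None:
    ...  # hypothetical: push the event over the client's WebSocket

def on_scroll(conversation_id: str, visible_message_ids: list[int]) -> None:
    """One bulk 'read up to' event instead of N per-message read events."""
    if not visible_message_ids:
        return
    send_event({
        "type": "read_up_to",
        "conversation_id": conversation_id,
        "up_to_message_id": max(visible_message_ids),
    })
```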
Core Components
Receipt Service
The Receipt Service processes incoming status events from clients. For each event, it: (1) validates the event (does the user have access to this message?), (2) writes the status to the Status Store, (3) determines whether the sender should be notified (e.g., the recipient may have disabled read receipts), and (4) emits a notification event. For 'read up to' bulk events, the service resolves the message range by querying the message store for all message IDs in the conversation with ID ≤ the marker, then batch-updates the status store. The service is partitioned by conversation_id for ordering guarantees.
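The bulk path can be sketched as follows, assuming redis-py, the watermark key described under Database Design, and two hypothetical helpers (user_in_conversation for the ACL check and message_ids_between for the message-store range query):

```python
import time
import redis

r = redis.Redis()

def user_in_conversation(user_id: str, conversation_id: str) -> bool:
    return True   # hypothetical ACL check against the membership store

def message_ids_between(conversation_id: str, lo: int, hi: int) -> list[int]:
    return []     # hypothetical message-store query for IDs in (lo, hi]

def handle_read_up_to(conversation_id: str, user_id: str, up_to: int) -> None:
    """Process one bulk 'read up to' event from a scrolling client."""
    if not user_in_conversation(user_id, conversation_id):   # (1) validate
        return
    key = f"watermark:{conversation_id}:{user_id}"
    current = int(r.get(key) or 0)
    if up_to <= current:                                     # stale or duplicate event
        return
    r.set(key, up_to)                                        # (2) advance the watermark
    now = int(time.time())
    with r.pipeline() as pipe:                               # (2) backfill per-message hashes
        for mid in message_ids_between(conversation_id, current, up_to):
            pipe.hset(f"receipt:{mid}", user_id, f"R:{now}")
        pipe.execute()
    # (3)/(4): privacy check and sender notification happen downstream.
```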
Status Store
The Status Store uses a Redis hash per message for fast reads: receipt:{message_id} with fields {user_id: 'D:1706000000'} where 'D' = delivered and the number is the Unix timestamp, or {user_id: 'R:1706000100'} where 'R' = read. This compact encoding keeps per-message receipt data under 1KB even for groups of 100 members. For persistence, a Cassandra table mirrors the data with partition key message_id, clustering key user_id, and columns status and timestamp (see Database Design below). The Cassandra write is asynchronous and does not block the real-time path.
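Reading the hash back, for example to serve the receipt-details API, is a straightforward decode. A sketch using redis-py:

```python
import redis

r = redis.Redis()

def get_receipt_details(message_id: str) -> dict:
    """Decode receipt:{message_id} into the delivered/read lists the API returns."""
    raw = r.hgetall(f"receipt:{message_id}")   # e.g. {b"42": b"R:1706000100"}
    details = {"delivered": [], "read": []}
    for uid, value in raw.items():
        status, ts = value.decode().split(":")
        bucket = "read" if status == "R" else "delivered"
        details[bucket].append({"user_id": uid.decode(), "at": int(ts)})
    return details
```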
Sender Notification Publisher
The Publisher pushes status updates to the original message sender. For one-on-one conversations, each delivered/read event triggers an immediate push. For group conversations, the publisher aggregates: it maintains a 2-second tumbling window per (sender, conversation) pair, collecting all read receipts in the window and pushing a single summary. This reduces the push volume by 90% in active group conversations. The push payload is minimal: {message_id, status: 'read', read_count: 15, total: 20} for groups, or {message_id, status: 'read', by: user_id} for one-on-one.
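A minimal sketch of the aggregation loop, assuming an asyncio process per publisher shard and the hypothetical publish_to_gateway hook from earlier:

```python
import asyncio

def publish_to_gateway(user_id: str, payload: dict) -> None:
    ...  # hypothetical hook to the sender's WebSocket gateway

# (sender_id, conversation_id) -> latest summary collected in the current window
pending: dict[tuple[str, str], dict] = {}

def collect_read_receipt(sender_id: str, conversation_id: str,
                         message_id: str, read_count: int, total: int) -> None:
    # Overwrite within the window; only the latest counts matter to the sender.
    pending[(sender_id, conversation_id)] = {
        "message_id": message_id, "status": "read",
        "read_count": read_count, "total": total,
    }

async def flush_loop() -> None:
    # 2-second tumbling window: drain and push one summary per (sender, conversation).
    while True:
        await asyncio.sleep(2)
        for (sender_id, _conv), payload in list(pending.items()):
            publish_to_gateway(sender_id, payload)
        pending.clear()
```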
Database Design
The primary status store is Redis with two data structures per conversation. First, a hash per message: receipt:{msg_id} → {uid1: 'R:ts', uid2: 'D:ts'}. Second, a per-user read watermark per conversation: watermark:{conv_id}:{user_id} → last_read_msg_id. The watermark enables the 'read up to' optimization: to check if user U has read message M, simply compare M's ID against U's watermark — if M ≤ watermark, it's read, without looking up individual receipt records.
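Assuming message IDs increase monotonically within a conversation (which the 'ID ≤ X' semantics above already require), the check reduces to a single comparison. A sketch with redis-py:

```python
import redis

r = redis.Redis()

def is_read(conversation_id: str, user_id: str, message_id: int) -> bool:
    """O(1) read check: compare the message ID against the user's watermark."""
    wm = r.get(f"watermark:{conversation_id}:{user_id}")
    return wm is not None and message_id <= int(wm)
```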
The persistent store in Cassandra uses a denormalized design for read-heavy queries. The receipts table has partition key message_id, clustering key user_id, columns status and timestamp. The watermarks table has partition key (conversation_id, user_id), column last_read_msg_id and last_read_at. Read watermarks are updated in Cassandra every 60 seconds (not on every read event) to reduce write pressure. The TTL on receipt rows matches the message retention policy (e.g., 90 days).
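Expressed through the DataStax Python driver, the schema might look as follows. This is a sketch: the keyspace name, contact point, and ID types are assumptions, and ts holds the status timestamp:

```python
from cassandra.cluster import Cluster  # DataStax Python driver

session = Cluster(["127.0.0.1"]).connect("receipts_ks")  # placeholder contact point/keyspace

session.execute("""
    CREATE TABLE IF NOT EXISTS receipts (
        message_id bigint,
        user_id    bigint,
        status     text,
        ts         timestamp,
        PRIMARY KEY (message_id, user_id)
    ) WITH default_time_to_live = 7776000  -- 90-day TTL, matching message retention
""")

session.execute("""
    CREATE TABLE IF NOT EXISTS watermarks (
        conversation_id  bigint,
        user_id          bigint,
        last_read_msg_id bigint,
        last_read_at     timestamp,
        PRIMARY KEY ((conversation_id, user_id))
    )
""")
```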
API Design
- POST /api/receipts/delivered: batch mark messages as delivered. Body: {message_ids: [id1, id2, id3]}; sent by the client on message receipt.
- POST /api/receipts/read: mark a conversation as read up to a message. Body: {conversation_id, up_to_message_id}; marks all prior messages as read.
- GET /api/receipts/{message_id}: fetch read receipt details for a message. Returns {delivered: [{user_id, at}], read: [{user_id, at}]}.
- WebSocket event {type: 'receipt_update', message_id, status, read_count?, by?}: server push to the sender when recipients read their message.
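For illustration, marking a conversation read from a client might look like this; the host, token, and IDs are placeholders, and the requests library stands in for the app's HTTP stack:

```python
import requests

# Mark conversation c42 as read up to message 1088 in one call.
resp = requests.post(
    "https://chat.example.com/api/receipts/read",   # placeholder host
    json={"conversation_id": "c42", "up_to_message_id": 1088},
    headers={"Authorization": "Bearer <token>"},    # placeholder auth
    timeout=5,
)
resp.raise_for_status()
```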
Scaling & Bottlenecks
The highest-volume operation is the bulk read marking ('read up to') which must update potentially hundreds of messages atomically. The watermark-based approach avoids this entirely: instead of marking each message individually, only the watermark pointer is updated. Queries check message_id <= watermark rather than looking up per-message receipt records. This reduces write amplification from O(messages_per_conversation) to O(1) per scroll event.
Redis memory is the capacity constraint. With 33 billion messages per day and receipt hashes consuming ~200 bytes per message (for a 20-member group), the hot receipt data would require ~6.6TB of Redis memory. The solution is aggressive TTL management: receipt hashes are set to expire 24 hours after the last update (most receipts are checked within hours of message delivery). Older receipt data is served from Cassandra on demand. This reduces the Redis working set to approximately 500GB, manageable across a 20-node Redis cluster.
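A sketch of the TTL discipline, assuming redis-py and a hypothetical Cassandra fallback for cold reads:

```python
import redis

r = redis.Redis()
RECEIPT_TTL_SECONDS = 24 * 3600  # expire 24h after the last update

def fetch_receipts_from_cassandra(message_id: str) -> dict:
    return {}  # hypothetical: query the persistent receipts table by message_id

def record_receipt(message_id: str, user_id: str, encoded: str) -> None:
    key = f"receipt:{message_id}"
    r.hset(key, user_id, encoded)
    r.expire(key, RECEIPT_TTL_SECONDS)  # refresh the sliding 24h window

def load_receipts(message_id: str) -> dict:
    """Serve hot receipts from Redis; fall back to Cassandra for older data."""
    cached = r.hgetall(f"receipt:{message_id}")
    if cached:
        return cached
    return fetch_receipts_from_cassandra(message_id)
```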
Key Trade-offs
- Watermark-based over per-message receipts: Watermarks reduce write amplification from O(N) to O(1) per read event, but lose per-message granularity — you can tell if a message is read, but not exactly when each individual message was read (only that it was read before the watermark advanced)
- Aggregated group receipt pushes over individual pushes: Batching receipt updates every 2 seconds reduces fan-out by 90% in active groups, but means the sender sees 'read by 15' jumping in batches rather than smoothly incrementing
- Asynchronous Cassandra persistence over synchronous writes: Writing receipts to Redis first and Cassandra async eliminates the Cassandra write from the latency path, but creates a risk window where receipt data exists only in Redis — mitigated by Redis replication
- Privacy toggle at cost of consistency: Allowing users to disable read receipts adds a check on every receipt event and means the sender's view may be incomplete, but is essential for user trust and platform adoption