HTTP/2 Multiplexing Explained: Multiple Requests Over One Connection
How HTTP/2 multiplexing solves head-of-line blocking — streams, frames, server push, and why HTTP/2 is faster than HTTP/1.1 for modern web applications.
HTTP/2 Multiplexing
HTTP/2 multiplexing allows multiple HTTP requests and responses to be sent simultaneously over a single TCP connection using interleaved binary frames, eliminating the head-of-line blocking problem in HTTP/1.1.
What It Really Means
In HTTP/1.1, each TCP connection handles only one request at a time. If your web page needs 50 resources (HTML, CSS, JavaScript, images), the browser opens up to 6 parallel TCP connections per host (the typical browser limit) and processes requests sequentially on each; request 7 waits until one of the first 6 finishes. This is head-of-line blocking.
HTTP/2 solves this by multiplexing: all 50 requests are sent over a single TCP connection, interleaved as binary frames. The server responds to them concurrently, sending pieces of different responses as they become available. No request blocks another. No extra TCP connections are needed.
This single change dramatically improves page load times, especially on high-latency networks where the cost of establishing multiple TCP connections and TLS handshakes is significant.
How It Works in Practice
HTTP/1.1 vs HTTP/2
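To make the difference concrete, here is a minimal Go sketch (example.com is a placeholder; any HTTP/2-enabled host works): the first client disables HTTP/2 and falls back to parallel HTTP/1.1 connections, while the second negotiates HTTP/2 via ALPN and multiplexes every request over a single connection.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"sync"
	"time"
)

// fetchAll fires n concurrent GETs and reports the negotiated protocol
// and total wall-clock time.
func fetchAll(client *http.Client, url string, n int) {
	start := time.Now()
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		proto string
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get(url)
			if err != nil {
				return
			}
			resp.Body.Close()
			mu.Lock()
			proto = resp.Proto // "HTTP/1.1" or "HTTP/2.0"
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Printf("%-9s %d requests in %v\n", proto, n, time.Since(start))
}

func main() {
	url := "https://example.com/" // placeholder: any HTTP/2-enabled host

	// HTTP/1.1 only: a non-nil, empty TLSNextProto map disables the
	// automatic HTTP/2 upgrade, so requests spread across parallel
	// HTTP/1.1 connections instead.
	h1 := &http.Client{Transport: &http.Transport{
		TLSNextProto: map[string]func(string, *tls.Conn) http.RoundTripper{},
	}}

	// Default client: negotiates HTTP/2 via ALPN and multiplexes all
	// requests as streams over a single TCP connection.
	h2 := &http.Client{}

	fetchAll(h1, url, 50)
	fetchAll(h2, url, 50)
}
```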
Binary Framing
HTTP/2 splits each request/response into binary frames. Multiple streams (request-response pairs) are multiplexed on the connection:
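As a rough illustration, this sketch uses the low-level Framer from golang.org/x/net/http2 (an assumed dependency) to interleave DATA frames from two streams onto one byte stream and read them back; the stream ID in each frame header is what lets the receiver demultiplex.

```go
package main

import (
	"bytes"
	"fmt"

	"golang.org/x/net/http2"
)

func main() {
	var buf bytes.Buffer
	fr := http2.NewFramer(&buf, &buf) // write frames in, read frames out

	// Interleave DATA frames from two streams (client-initiated stream
	// IDs are odd) on the same connection byte stream.
	fr.WriteData(1, false, []byte("stream 1, chunk 1"))
	fr.WriteData(3, false, []byte("stream 3, chunk 1"))
	fr.WriteData(1, true, []byte("stream 1, chunk 2")) // END_STREAM
	fr.WriteData(3, true, []byte("stream 3, chunk 2")) // END_STREAM

	// Each frame header carries its stream ID, which is how the
	// receiver reassembles interleaved frames into separate responses.
	for {
		f, err := fr.ReadFrame()
		if err != nil {
			break // io.EOF once the buffer is drained
		}
		h := f.Header()
		fmt.Printf("type=%v stream=%d len=%d\n", h.Type, h.StreamID, h.Length)
	}
}
```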
Key Features Beyond Multiplexing
Header compression (HPACK): HTTP headers are compressed using static and dynamic tables shared between client and server. Repeated headers (cookies, user-agent) are sent as small table references instead of full strings. This can reduce header size by 85-95% on subsequent requests.
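A small sketch of the effect, using the hpack encoder from golang.org/x/net/http2/hpack (an assumed dependency): encoding the same header set twice shows the repeat collapsing into a few bytes of table references.

```go
package main

import (
	"bytes"
	"fmt"

	"golang.org/x/net/http2/hpack"
)

func main() {
	var buf bytes.Buffer
	enc := hpack.NewEncoder(&buf)

	headers := []hpack.HeaderField{
		{Name: ":method", Value: "GET"},
		{Name: "user-agent", Value: "Mozilla/5.0 (X11; Linux x86_64)"},
		{Name: "cookie", Value: "session=abc123; theme=dark"},
	}

	// First request: literal headers enter the shared dynamic table.
	for _, h := range headers {
		enc.WriteField(h)
	}
	first := buf.Len()

	// Second request: identical headers encode as tiny references into
	// the static and dynamic tables instead of full strings.
	buf.Reset()
	for _, h := range headers {
		enc.WriteField(h)
	}
	fmt.Printf("first: %d bytes, repeat: %d bytes\n", first, buf.Len())
}
```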
Stream prioritization: The client can signal which resources are more important. CSS and JavaScript critical for rendering get higher priority than below-the-fold images.
Server push: The server can proactively send resources the client will need. When the client requests index.html, the server can push style.css and app.js without waiting for the client to request them.
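In Go's standard library, for example, push is exposed through the http.Pusher interface, which is only available on HTTP/2 connections. A minimal sketch (asset paths and certificate files are placeholders):

```go
package main

import "net/http"

func handler(w http.ResponseWriter, r *http.Request) {
	// The Pusher interface is only implemented when the request
	// arrived over HTTP/2; the type assertion fails on HTTP/1.1.
	if pusher, ok := w.(http.Pusher); ok {
		// Hypothetical asset paths: push them before writing the HTML
		// so the client never has to ask for them.
		pusher.Push("/static/style.css", nil)
		pusher.Push("/static/app.js", nil)
	}
	w.Write([]byte("<html>...</html>"))
}

func main() {
	http.HandleFunc("/", handler)
	// ListenAndServeTLS enables HTTP/2 automatically via ALPN;
	// cert.pem and key.pem are placeholder certificate paths.
	http.ListenAndServeTLS(":443", "cert.pem", "key.pem", nil)
}
```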
Implementation
Nginx HTTP/2 configuration:
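A minimal sketch (certificate paths and server_name are placeholders; on nginx 1.25.1+ the standalone `http2 on;` directive is preferred over the listen flag shown here):

```nginx
server {
    listen 443 ssl http2;    # nginx 1.25.1+: "listen 443 ssl;" plus "http2 on;"
    server_name example.com;

    ssl_certificate     /etc/ssl/certs/example.com.pem;   # placeholder paths
    ssl_certificate_key /etc/ssl/private/example.com.key;

    root /var/www/html;
}
```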
Verifying HTTP/2:
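One way to check from the command line, assuming a curl build with HTTP/2 support:

```sh
# Request HTTP/2 and print the negotiated protocol ("2" means HTTP/2)
curl -sI --http2 -o /dev/null -w '%{http_version}\n' https://example.com

# Or just look at the status line: "HTTP/2 200" confirms it
curl -I --http2 https://example.com
```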
gRPC uses HTTP/2 natively:
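A sketch of what this buys, assuming the usual grpc-go setup; pb.NewGreeterClient, SayHello, and the server address are hypothetical generated/placeholder names. Fifty concurrent RPCs share one ClientConn, each mapped to its own HTTP/2 stream:

```go
package main

import (
	"context"
	"fmt"
	"sync"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	pb "example.com/hello/gen" // hypothetical generated protobuf package
)

func main() {
	// One connection, many streams: gRPC maps each in-flight RPC to
	// its own HTTP/2 stream on this single ClientConn.
	conn, err := grpc.Dial("localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	client := pb.NewGreeterClient(conn)

	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// 50 concurrent RPCs, no extra TCP connections or queueing.
			if _, err := client.SayHello(context.Background(),
				&pb.HelloRequest{Name: fmt.Sprintf("req-%d", i)}); err != nil {
				fmt.Println(err)
			}
		}(i)
	}
	wg.Wait()
}
```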
Trade-offs
HTTP/2 advantages:
- Single connection per domain (fewer TCP handshakes and TLS negotiations)
- No head-of-line blocking at the HTTP level
- Header compression reduces bandwidth
- Server push eliminates round trips for critical resources
HTTP/2 limitations:
- TCP-level head-of-line blocking persists (a lost packet blocks all streams)
- Server push is poorly adopted (often disabled by CDNs)
- Domain sharding optimization from HTTP/1.1 becomes counterproductive
- Requires HTTPS in browsers (technically optional in the spec, but browsers enforce it)
HTTP/2 vs HTTP/3 (QUIC):
- HTTP/3 uses QUIC (UDP-based) instead of TCP
- Eliminates TCP-level head-of-line blocking
- Built-in connection migration (survives network changes on mobile)
- 0-RTT connection establishment for resumed sessions (a brand-new connection still needs one round trip, versus two or more for TCP+TLS)
Common Misconceptions
- "HTTP/2 requires code changes" — HTTP/2 is a transport-level protocol. Your application code, API endpoints, and headers work unchanged. Only the server and client negotiate the protocol.
- "Domain sharding still helps with HTTP/2" — Domain sharding (distributing resources across multiple domains) was an HTTP/1.1 optimization. With HTTP/2, it prevents multiplexing and forces additional TCP connections.
- "HTTP/2 is always faster" — For single-request APIs, HTTP/2 has negligible benefit over HTTP/1.1. The gains are largest for web pages loading many resources in parallel.
- "Server push is essential for HTTP/2" — Server push is rarely used in practice. 103 Early Hints is a simpler alternative that achieves similar results.
- "HTTP/2 eliminates all head-of-line blocking" — It eliminates HTTP-level blocking but TCP-level blocking remains. A single lost TCP packet stalls all multiplexed streams until it is retransmitted.
How This Appears in Interviews
- "How does HTTP/2 improve performance over HTTP/1.1?" — Multiplexing (concurrent requests on one connection), header compression, stream prioritization, server push.
- "Why does gRPC use HTTP/2?" — Multiplexing enables concurrent RPCs on one connection. Binary framing matches protobuf's binary format. Stream support enables bidirectional streaming.
- "Design a real-time dashboard" — HTTP/2 server push or Server-Sent Events over HTTP/2 for live updates.
- "What is head-of-line blocking?" — HTTP/1.1: one request blocks the next on the same connection. HTTP/2 solves it at HTTP level but TCP-level blocking remains.
Related Concepts
- TCP Three-Way Handshake — HTTP/2 reduces the number of TCP connections needed
- TLS/SSL Handshake — HTTP/2 requires HTTPS in browsers
- WebSocket Protocol — bidirectional alternative to HTTP/2 server push
- Server-Sent Events — unidirectional server-to-client streaming
- Long Polling — HTTP/1.1-era technique replaced by HTTP/2 and SSE
- System Design Interview Guide