HTTP/HTTPS Interview Questions for Senior Engineers (2026)
Master HTTP/HTTPS interview questions covering HTTP methods, status codes, HTTP/2, HTTP/3, REST APIs, caching headers, and security for senior engineering interviews.
HTTP is the foundation of the modern web, and deep understanding of the protocol separates senior engineers from junior ones. Interviews at top tech companies expect you to reason about HTTP semantics, performance optimizations, security implications, and protocol evolution from HTTP/1.1 through HTTP/3.
These questions frequently appear in API design discussions, system design interviews, and backend engineering rounds. A senior engineer should be able to explain not just what HTTP features exist, but why they were designed that way and how they affect real-world systems.
This guide covers 15 critical HTTP/HTTPS interview questions with structured answer frameworks, follow-up questions, and practical examples drawn from production systems.
Questions
1. Explain the evolution from HTTP/1.1 to HTTP/2 to HTTP/3 and the problems each version solved.
What the interviewer is really asking: Do you understand why the protocol evolved and the architectural trade-offs at each step?
Answer framework:
HTTP/1.1 (1997):
- Text-based protocol with request-response pairs.
- Keep-alive connections to avoid TCP handshake per request.
- Problems: Head-of-line (HOL) blocking (one slow response blocks subsequent requests on the same connection), no multiplexing, verbose headers repeated on every request, no server push.
- Workarounds: Domain sharding (multiple connections to different hostnames), sprite sheets, concatenated JS/CSS, inlining small resources.
HTTP/2 (2015):
- Binary framing layer: requests and responses are split into frames.
- Multiplexing: multiple concurrent streams over a single TCP connection. Eliminates HTTP-level HOL blocking.
- Header compression (HPACK): headers are compressed with a shared dictionary, reducing overhead by 85-90%.
- Server push: server can proactively send resources before the client requests them.
- Stream prioritization: client can indicate which resources are more important.
- Problems solved: HTTP-level HOL blocking, header overhead, multiple connections.
- Remaining problem: TCP-level HOL blocking. A single lost TCP packet blocks all multiplexed streams until retransmission.
HTTP/3 (2022):
- Replaces TCP with QUIC (UDP-based transport).
- Eliminates TCP-level HOL blocking: each stream has independent loss recovery.
- Faster connection establishment: 1-RTT handshake (combines transport and TLS), 0-RTT for resumption.
- Connection migration: QUIC connections are identified by connection ID, not IP:port tuple, so connections survive network changes (WiFi to cellular).
- Improved loss recovery: QUIC has more precise RTT measurements and better congestion control.
When to use each:
- HTTP/1.1: Legacy systems, simple APIs, environments where HTTP/2 proxies are unavailable.
- HTTP/2: Default for most web applications. Eliminates need for concatenation and domain sharding.
- HTTP/3: Mobile users (connection migration), high-latency networks, lossy networks (independent stream recovery).
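In practice, browsers discover HTTP/3 support through the Alt-Svc response header sent over an existing HTTP/1.1 or HTTP/2 connection. As an illustrative sketch (the helper name `parse_alt_svc` is my own, and this ignores some edge cases in the grammar), here is how such an advertisement can be parsed:

```python
def parse_alt_svc(header):
    """Parse an Alt-Svc header value into {protocol: {"authority": ..., params...}}."""
    services = {}
    for entry in header.split(","):
        entry = entry.strip()
        if entry == "clear":  # the origin is withdrawing all alternative services
            return {}
        parts = [p.strip() for p in entry.split(";")]
        proto, _, authority = parts[0].partition("=")
        params = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
        services[proto] = {"authority": authority.strip('"'), **params}
    return services

# A server advertising HTTP/3 on port 443, cached for one day:
advert = parse_alt_svc('h3=":443"; ma=86400, h2=":443"')
# advert["h3"] -> {"authority": ":443", "ma": "86400"}
```

On the next request the browser can attempt QUIC to the advertised authority, falling back to TCP if it fails.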
Follow-up questions:
- Why did HTTP/2 server push fail in practice and get removed from Chrome?
- How does HPACK header compression work and what is QPACK?
- What are the deployment challenges of HTTP/3 since it runs over UDP?
2. What are the differences between PUT, PATCH, and POST, and when should you use each?
What the interviewer is really asking: Do you understand HTTP method semantics for API design?
Answer framework:
POST: Creates a new resource. Not idempotent — calling it twice creates two resources. The server determines the resource URL. POST /users creates a new user and returns 201 Created with a Location header pointing to /users/123.
PUT: Replaces a resource entirely. Idempotent — calling it twice with the same data produces the same result. The client specifies the resource URL. PUT /users/123 replaces the entire user object. If the resource does not exist, PUT may create it (depending on API design).
PATCH: Partially updates a resource. May or may not be idempotent (depends on patch format). PATCH /users/123 with {"email": "new@example.com"} updates only the email, leaving other fields unchanged.
Key distinctions:
- Idempotency: PUT is idempotent by specification. POST is not. PATCH may be (JSON Merge Patch is idempotent, JSON Patch is not necessarily). This matters for retry safety — retrying a PUT is always safe, retrying a POST may create duplicates.
- Complete vs partial: PUT requires the complete resource representation. Omitting a field in PUT means setting it to null (or default). PATCH only affects specified fields.
- Practical implications for APIs: For updating a user profile where users change one field at a time, PATCH is appropriate. For uploading a configuration file that must be completely replaced, PUT is appropriate. For creating orders, POST is appropriate because the server assigns the order ID.
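The idempotency point about PATCH can be made concrete with RFC 7386 JSON Merge Patch, whose algorithm fits in a few lines. A minimal sketch (the function name is my own):

```python
def json_merge_patch(target, patch):
    """RFC 7386 semantics: null deletes a key, nested objects merge
    recursively, any other value replaces the old one wholesale."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this field"
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result

user = {"name": "Ada", "email": "old@example.com", "role": "admin"}
patch = {"email": "new@example.com"}
updated = json_merge_patch(user, patch)
# Applying the same patch twice changes nothing -- the operation is idempotent.
assert json_merge_patch(updated, patch) == updated
```

Note that because null means "delete," JSON Merge Patch cannot set a field to null, which is one reason some APIs prefer the more verbose JSON Patch (RFC 6902).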
Common API design mistake: Using POST for everything. This loses the semantic benefits of HTTP methods — caches, proxies, and intermediaries understand method semantics and can make decisions accordingly (e.g., caches know GET responses are cacheable, PUT is idempotent for retry).
For API design best practices in system design interviews, method selection is a signal of design maturity.
Follow-up questions:
- What is the difference between JSON Merge Patch and JSON Patch?
- How does method idempotency affect retry strategies in distributed systems?
- When would you use DELETE with a request body?
3. Explain HTTP caching headers and how browsers and CDNs use them.
What the interviewer is really asking: Can you design an effective caching strategy using HTTP semantics?
Answer framework:
Cache-Control (most important):
- public: Any cache (browser, CDN, proxy) can cache.
- private: Only the browser can cache (not CDN). Use for user-specific responses.
- no-cache: Cache may store but must revalidate with origin before serving. Misleading name — it does not mean "don't cache."
- no-store: Do not cache at all. Use for sensitive data (banking, healthcare).
- max-age=N: Cache for N seconds from response time.
- s-maxage=N: Like max-age but only for shared caches (CDNs). Overrides max-age for CDNs while allowing different browser caching.
- stale-while-revalidate=N: Serve stale for N seconds while revalidating in background.
- stale-if-error=N: Serve stale for N seconds if origin returns an error.
- immutable: Content will never change. Browser skips revalidation even on reload.
ETag and Last-Modified (conditional requests):
- Server sends ETag: "abc123" or Last-Modified: Wed, 09 Apr 2026 10:00:00 GMT.
- On revalidation, browser sends If-None-Match: "abc123" or If-Modified-Since.
- Server responds with 304 Not Modified (no body) if unchanged, or 200 OK with new content.
- Saves bandwidth but still requires an origin round trip.
Vary header:
- Tells caches which request headers affect the response.
- Vary: Accept-Encoding means the response differs based on whether the client accepts gzip.
- Vary: Cookie is almost always wrong — it creates a cache entry per unique cookie string.
Practical caching strategy:
| Content Type | Cache-Control | Why |
|---|---|---|
| Versioned static assets | public, max-age=31536000, immutable | URL changes when content changes |
| HTML pages | public, max-age=0, must-revalidate or no-cache | Must always check for updates |
| API responses | private, max-age=60 or no-store | User-specific or time-sensitive |
| Images (not versioned) | public, max-age=86400, stale-while-revalidate=604800 | Balance freshness with performance |
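The max-age and s-maxage semantics above can be sketched as a simple freshness check. This is a deliberately simplified model (it ignores must-revalidate, heuristic freshness, and Age header arithmetic), and the function names are my own:

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: arg-or-True} dict."""
    directives = {}
    for token in value.split(","):
        name, _, arg = token.strip().partition("=")
        directives[name.lower()] = int(arg) if arg.isdigit() else True
    return directives

def is_fresh(cache_control, age_seconds, shared_cache=False):
    """Decide whether a cached response may be served without revalidation."""
    d = parse_cache_control(cache_control)
    if "no-store" in d or "no-cache" in d:
        return False  # no-cache: a stored copy must be revalidated first
    if shared_cache and "private" in d:
        return False  # CDNs must not serve private responses
    if shared_cache and "s-maxage" in d:
        return age_seconds < d["s-maxage"]  # s-maxage wins for shared caches
    return age_seconds < d.get("max-age", 0)

is_fresh("public, max-age=60", 30)                                  # True
is_fresh("public, max-age=60, s-maxage=10", 30, shared_cache=True)  # False
```

The second call shows the table's split-TTL pattern: the browser keeps the response for 60 seconds while the CDN re-fetches after 10.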
For CDN-specific caching, see our CDN interview questions.
Follow-up questions:
- What is the difference between no-cache and no-store?
- How does the Vary header affect CDN cache hit ratios?
- How would you implement cache busting for a single-page application?
4. What are CORS issues and how does the CORS mechanism work?
What the interviewer is really asking: Do you understand browser security models and how they affect API design?
Answer framework:
CORS (Cross-Origin Resource Sharing) is a browser security mechanism that controls which origins can make requests to your API. It extends the Same-Origin Policy, which by default blocks cross-origin requests.
Same-Origin Policy: Two URLs have the same origin if they share protocol, hostname, and port. https://app.example.com and https://api.example.com are different origins.
Simple requests (GET, HEAD, POST with simple headers and content types) are sent directly. The browser checks the response for Access-Control-Allow-Origin and blocks the response if it does not match.
Preflight requests (for non-simple requests — PUT, DELETE, custom headers, JSON content type):
- Browser sends an OPTIONS request with Origin, Access-Control-Request-Method, and Access-Control-Request-Headers.
- Server responds with Access-Control-Allow-Origin, Access-Control-Allow-Methods, Access-Control-Allow-Headers, and Access-Control-Max-Age.
- If the preflight passes, the browser sends the actual request.
- Access-Control-Max-Age caches the preflight result, avoiding repeated OPTIONS requests.
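The server side of the preflight handshake can be sketched as a small function. This is a hedged example built around a hypothetical allowlist (`ALLOWED_ORIGINS` and the function name are my own); the key point is to validate the Origin against an allowlist rather than reflecting it blindly:

```python
ALLOWED_ORIGINS = {"https://app.example.com"}          # hypothetical allowlist
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}

def preflight_response(origin, method, request_headers=""):
    """Answer an OPTIONS preflight. Echo the origin only if allowlisted --
    never reflect it blindly -- and cache the verdict for 10 minutes."""
    if origin not in ALLOWED_ORIGINS or method not in ALLOWED_METHODS:
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Allow-Headers": request_headers or "Content-Type, Authorization",
        "Access-Control-Max-Age": "600",   # skip preflights for 10 minutes
        "Vary": "Origin",                  # per-origin responses must not be cross-cached
    }
```

Because the Allow-Origin value varies per caller, the Vary: Origin header is what keeps intermediate caches from serving one origin's grant to another.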
Common CORS headers:
- Access-Control-Allow-Origin: https://app.example.com (specific origin) or * (any origin, but cannot be used with credentials).
- Access-Control-Allow-Credentials: true (allow cookies and auth headers).
- Access-Control-Allow-Methods: GET, POST, PUT, DELETE.
- Access-Control-Allow-Headers: Content-Type, Authorization.
- Access-Control-Expose-Headers: X-Request-ID (which response headers the browser can access).
Common mistakes:
- Setting Access-Control-Allow-Origin: * with Access-Control-Allow-Credentials: true — this is explicitly forbidden by the spec.
- Reflecting the Origin header value without validation — this makes CORS useless as a security mechanism.
- Not handling OPTIONS requests, causing 405 errors for preflight requests.
- Not setting Access-Control-Max-Age, causing a preflight for every non-simple request.
Follow-up questions:
- Why can you not use * for Allow-Origin when credentials are included?
- How do CORS policies interact with CDN caching?
- What is the security risk of reflecting the Origin header in Allow-Origin?
5. Explain HTTP status codes and common misuses in API design.
What the interviewer is really asking: Do you design APIs with proper HTTP semantics?
Answer framework:
1xx Informational:
- 100 Continue: Server acknowledges the request headers. Used for large uploads — client sends Expect: 100-continue, server responds with 100 before the client sends the body.
- 101 Switching Protocols: Used for WebSocket upgrade.
2xx Success:
- 200 OK: Standard success response.
- 201 Created: Resource created. Include a Location header with the new resource URL.
- 202 Accepted: Request accepted for async processing (not yet completed). Return a URL to check status.
- 204 No Content: Success with no response body. Use for DELETE or PUT that doesn't return the updated resource.
3xx Redirection:
- 301 Moved Permanently: Permanent redirect. Browsers cache this aggressively. Search engines transfer SEO ranking.
- 302 Found / 307 Temporary Redirect: Temporary redirect. 307 preserves the HTTP method (POST stays POST); 302 may change it to GET.
- 304 Not Modified: Conditional request — content has not changed since the cached version.
4xx Client Error:
- 400 Bad Request: Malformed request (invalid JSON, missing required fields).
- 401 Unauthorized: Authentication required or failed. Misleadingly named — it means "unauthenticated."
- 403 Forbidden: Authenticated but not authorized. The server understood the request but refuses.
- 404 Not Found: Resource does not exist.
- 409 Conflict: Request conflicts with current state (e.g., trying to create a resource that already exists).
- 422 Unprocessable Entity: Request is syntactically valid but semantically wrong (validation errors).
- 429 Too Many Requests: Rate limited. Include a Retry-After header.
5xx Server Error:
- 500 Internal Server Error: Unhandled server error.
- 502 Bad Gateway: Upstream server returned an invalid response.
- 503 Service Unavailable: Server temporarily overloaded. Include Retry-After.
- 504 Gateway Timeout: Upstream server did not respond in time.
Common misuses:
- Returning 200 OK with {"error": "not found"} in the body. Use proper status codes.
- Using 404 for authorization failures (should be 403).
- Using 500 for all errors including validation failures (should be 400 or 422).
- Not distinguishing 401 (need to log in) from 403 (logged in but not allowed).
For API design in system design interviews, proper status code usage demonstrates protocol literacy.
Follow-up questions:
- What is the difference between 400 and 422?
- When should you use 202 Accepted and what should the response body contain?
- How do load balancers and CDNs interpret different status codes for retry logic?
6. How do WebSockets work and when should you use them instead of HTTP?
What the interviewer is really asking: Do you understand bidirectional communication protocols and their trade-offs?
Answer framework:
WebSocket provides full-duplex, bidirectional communication over a single TCP connection. Unlike HTTP's request-response model, either side can send messages at any time.
Connection establishment:
- Client sends an HTTP/1.1 GET request with Upgrade: websocket and Connection: Upgrade headers, plus Sec-WebSocket-Key (random base64 value).
- Server responds with 101 Switching Protocols and Sec-WebSocket-Accept (SHA-1 hash of the client's key + magic GUID).
- The TCP connection is now a WebSocket connection. Both sides can send frames.
Frame format:
- Lightweight binary framing: 2-14 bytes overhead per frame (compared to HTTP headers of hundreds of bytes).
- Supports text (UTF-8) and binary frames.
- Ping/pong frames for connection health checking.
- Close frames for graceful shutdown.
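The Sec-WebSocket-Accept computation from the handshake above is small enough to show in full. The magic GUID and the worked key/accept pair below come from RFC 6455:

```python
import base64
import hashlib

# This GUID is fixed by RFC 6455, section 1.3.
MAGIC_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key):
    """Sec-WebSocket-Accept = base64(SHA-1(client key + magic GUID))."""
    digest = hashlib.sha1((sec_websocket_key + MAGIC_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

# The worked example from RFC 6455:
assert websocket_accept("dGhlIHNhbXBsZSBub25jZQ==") == "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The derivation exists to prove the server actually speaks WebSocket (and is not a confused HTTP server echoing headers); it provides no security by itself.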
When to use WebSockets:
- Real-time, bidirectional communication: chat applications, collaborative editing, multiplayer games.
- High-frequency updates: stock tickers, live sports scores, monitoring dashboards.
- Server-initiated messages: push notifications, live comments, typing indicators.
When NOT to use WebSockets:
- Simple request-response APIs — use HTTP.
- Infrequent updates — use long polling or Server-Sent Events (SSE).
- When you need HTTP caching, CDN support, or standard load balancing.
Alternatives comparison:
- SSE (Server-Sent Events): Server-to-client only, over standard HTTP. Simple, auto-reconnect, works with CDNs. Best for unidirectional real-time (news feeds, notifications).
- Long polling: Client makes an HTTP request; server holds it open until data is available. Simpler infrastructure but higher overhead. Best for compatibility with existing HTTP infrastructure.
- HTTP/2 Server Push: Deprecated in Chrome. Was designed for assets, not real-time data.
Infrastructure challenges with WebSockets:
- Load balancers must support WebSocket upgrade (L7 load balancers with WebSocket awareness).
- Sticky sessions may be needed if WebSocket state is server-local.
- Connection limits: each WebSocket is a persistent TCP connection consuming server resources.
- Scaling: a server handling 100K WebSocket connections needs different architecture than one handling 100K HTTP requests.
For designing chat systems and real-time features, WebSocket knowledge is essential.
Follow-up questions:
- How does WebSocket work with HTTP/2 (RFC 8441)?
- How would you scale a WebSocket server to handle 1 million concurrent connections?
- What is the reconnection strategy for WebSocket clients?
7. Explain how HTTPS works and the role of certificates.
What the interviewer is really asking: Do you understand the trust model of the web PKI?
Answer framework:
HTTPS = HTTP over TLS. TLS provides three security properties:
- Confidentiality: Data is encrypted. Eavesdroppers cannot read the content.
- Integrity: Data cannot be modified in transit without detection.
- Authentication: The server is verified to be who it claims to be.
Certificate chain of trust:
- Root CAs (Certificate Authorities) are pre-installed in browsers and operating systems (~100-150 roots).
- Root CAs sign intermediate CA certificates.
- Intermediate CAs sign server certificates.
- When a browser connects to a server, it verifies: the server certificate is signed by a trusted intermediate, which is signed by a trusted root, forming a chain of trust.
Certificate contents:
- Subject: domain name(s) the certificate covers (CN and SAN fields).
- Issuer: the CA that signed the certificate.
- Validity period: not-before and not-after dates.
- Public key: the server's RSA or ECDSA public key.
- Signature: the issuer's signature over the certificate.
Certificate validation:
- Chain validation: verify each certificate's signature up to a trusted root.
- Hostname matching: verify the requested hostname matches the certificate's SAN.
- Expiration check: verify the certificate is within its validity period.
- Revocation check: check CRL or OCSP to ensure the certificate has not been revoked.
- Certificate Transparency: verify the certificate is logged in public CT logs.
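The hostname-matching step can be sketched as a label-by-label comparison. This is a simplified model of RFC 6125 rules (the function name is my own; real validators handle more cases, such as IDN encoding and rejecting wildcards in public suffixes):

```python
def hostname_matches(hostname, pattern):
    """Simplified SAN check: a wildcard must be the entire leftmost label,
    covers exactly one label, and never matches the bare parent domain."""
    host = hostname.lower().rstrip(".").split(".")
    pat = pattern.lower().rstrip(".").split(".")
    if len(host) != len(pat):
        return False
    if pat[0] == "*":
        return host[1:] == pat[1:]  # wildcard replaces one leftmost label only
    return host == pat

hostname_matches("api.example.com", "*.example.com")    # True
hostname_matches("a.b.example.com", "*.example.com")    # False: wildcard spans one label
hostname_matches("example.com", "*.example.com")        # False: bare domain not covered
```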
Modern certificate management:
- Let's Encrypt: free, automated certificates with 90-day validity.
- ACME protocol: automated certificate issuance and renewal.
- Short-lived certificates reduce the window of compromise.
- Certificate pinning (HPKP) is deprecated — too risky (can permanently lock users out if you lose the pinned key).
For TLS performance considerations, see our networking interview questions. For authentication beyond TLS, see OAuth interview questions.
Follow-up questions:
- What happens if a root CA is compromised?
- How does Certificate Transparency prevent misissued certificates?
- What is the difference between DV, OV, and EV certificates?
8. How do cookies work and what are the security implications?
What the interviewer is really asking: Do you understand web state management and its security model?
Answer framework:
Cookies are small pieces of data (4KB limit) that the server sets via Set-Cookie headers and the browser automatically includes in subsequent requests to the same domain.
Cookie attributes:
- Domain: Which domain the cookie is sent to. Domain=.example.com includes all subdomains.
- Path: URL path prefix for which the cookie is sent.
- Expires / Max-Age: When the cookie expires. Without this, it is a session cookie (deleted when the browser closes).
- Secure: Only sent over HTTPS. Always use for sensitive cookies.
- HttpOnly: Not accessible to JavaScript (document.cookie). Prevents XSS from stealing cookies.
- SameSite: Controls cross-site sending.
  - Strict: Never sent cross-site. Best security but breaks legitimate cross-site links.
  - Lax (default in modern browsers): Sent with top-level navigations (clicking a link) but not with cross-site subrequests (images, iframes, AJAX). Good balance.
  - None: Sent cross-site. Requires Secure. Used for third-party cookies (tracking, embedded widgets).
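Python's standard library can assemble a Set-Cookie value carrying the hardening attributes described above. A minimal sketch (the cookie name `session` and the one-hour lifetime are illustrative choices, not requirements):

```python
from http import cookies

def session_cookie(session_id):
    """Build a hardened session Set-Cookie header value using the stdlib."""
    jar = cookies.SimpleCookie()
    jar["session"] = session_id
    morsel = jar["session"]
    morsel["httponly"] = True            # hidden from document.cookie (XSS)
    morsel["secure"] = True              # only sent over HTTPS
    morsel["samesite"] = "Lax"           # blocks most CSRF vectors
    morsel["path"] = "/"
    morsel["max-age"] = 3600             # one-hour session
    return morsel.OutputString()

# Produces something like (attribute order may vary):
# session=abc123; Path=/; Max-Age=3600; Secure; HttpOnly; SameSite=Lax
```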
Security implications:
- Session hijacking: If a session cookie is stolen (via XSS, network sniffing, or access to the browser), the attacker can impersonate the user. Mitigations: HttpOnly, Secure, short session lifetimes, session binding to IP/user-agent.
- CSRF (Cross-Site Request Forgery): A malicious site triggers a request to your site, and the browser automatically includes cookies. With SameSite=Lax, most CSRF attacks are mitigated. Additional protection: CSRF tokens, checking Origin/Referer headers.
- Third-party cookie deprecation: Browsers are phasing out third-party cookies for privacy. This affects analytics, advertising, and SSO implementations. Alternatives: first-party cookies with server-side tracking, Privacy Sandbox APIs.
Cookies vs alternatives:
- localStorage/sessionStorage: Larger (5-10MB), JavaScript-only, no automatic HTTP sending. Vulnerable to XSS.
- Bearer tokens in headers: More explicit, works for APIs, avoids CSRF entirely. But requires JavaScript to attach, so no protection from XSS.
- Server-side sessions: Store session data server-side with only a session ID in the cookie. More secure but requires session storage infrastructure.
Follow-up questions:
- How does SameSite=Lax differ from SameSite=Strict in practice?
- What is the impact of third-party cookie deprecation on SSO?
- How would you implement a secure session management system?
9. What is content negotiation and how does it work in HTTP?
What the interviewer is really asking: Do you understand HTTP's built-in mechanisms for serving different representations of the same resource?
Answer framework:
Content negotiation allows the server to serve different representations of a resource based on client capabilities and preferences.
Proactive (server-driven) negotiation: The client sends preferences in request headers, and the server selects the best representation:
- Accept: application/json, text/html;q=0.9 — prefers JSON, accepts HTML with lower priority.
- Accept-Language: en-US, en;q=0.9, fr;q=0.5 — prefers US English.
- Accept-Encoding: gzip, br, zstd — supported compression algorithms.
- Accept: image/avif, image/webp, image/png — image format preferences.
The q (quality) value ranges from 0 to 1 (default 1) indicating preference weight.
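A minimal q-value parser shows how a server might rank Accept preferences. This sketch handles only exact media-type matches (no */* wildcards or parameters beyond q), and the function name is my own:

```python
def negotiate(accept, available):
    """Pick the best available media type for an Accept header.
    Simplified: exact matches only, ranked by q-value (default q=1)."""
    prefs = []
    for item in accept.split(","):
        media, *params = [p.strip() for p in item.split(";")]
        q = 1.0
        for p in params:
            if p.startswith("q="):
                q = float(p[2:])
        prefs.append((media, q))
    prefs.sort(key=lambda mq: mq[1], reverse=True)  # highest preference first
    for media, q in prefs:
        if q > 0 and media in available:
            return media
    return None  # 406 Not Acceptable territory

negotiate("application/json, text/html;q=0.9", ["text/html", "application/json"])
# -> "application/json"
```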
Server response:
- Serves the best matching representation.
- Sets the Vary header to indicate which request headers influenced the response (critical for caching).
- May include Content-Type, Content-Language, Content-Encoding to indicate the chosen representation.
Reactive (agent-driven) negotiation:
Server responds with 300 Multiple Choices and a list of available representations. The client selects one. Less common in practice.
Practical applications:
- API versioning via Accept header: Accept: application/vnd.api+json;version=2. Cleaner than URL versioning but harder to cache and test.
- Image optimization: Serve AVIF to supporting browsers, WebP as fallback, JPEG for legacy. CDNs often handle this automatically.
- Internationalization: Serve content in the user's preferred language.
- Compression: Serve Brotli-compressed content to supporting browsers (typically 15-25% smaller than gzip for text).
Caching implications:
Content negotiation interacts with caching through the Vary header. Vary: Accept-Encoding means caches must store separate entries for each encoding. Too many Vary headers fragment the cache.
Follow-up questions:
- How does content negotiation interact with CDN caching?
- What are the trade-offs of API versioning via headers vs URL paths?
- How do browsers determine which image formats they support?
10. What is HTTP/2 server push and why was it largely abandoned?
What the interviewer is really asking: Can you analyze a protocol feature critically, understanding why a good idea in theory failed in practice?
Answer framework:
The concept: HTTP/2 server push allowed servers to proactively send resources to the client before the client requested them. When the server knew the client would need style.css after receiving index.html, it could push style.css immediately, eliminating the round trip.
Why it failed:
- Cache invalidation problem: The server does not know what the browser already has cached. Pushing a resource the browser already has wastes bandwidth. The CACHE_DIGEST proposal (letting browsers inform servers of their cache contents) never gained adoption.
- Priority conflicts: Pushed resources competed for bandwidth with the resources the browser actually requested. A pushed low-priority resource could delay a high-priority resource the browser was actively waiting for.
- Complexity for marginal benefit: Implementing push correctly required understanding the client's cache state, resource priorities, and network conditions. The benefit (saving one round trip) was often negated by the cost (wasted bandwidth, priority inversion).
- 103 Early Hints as an alternative: 103 Early Hints lets the server send Link: rel=preload headers before the final response. The browser then fetches the resources itself, respecting its own cache and priorities. This achieves similar latency benefits without the downsides of push.
- CDN implementation challenges: Intermediate proxies and CDNs had difficulty implementing push correctly. Many stripped push promises.
Chrome removed HTTP/2 push support in 2022. The industry converged on 103 Early Hints + <link rel=preload> as the preferred approach.
Lesson for interviews: This is a great example of how a theoretically sound optimization can fail due to practical considerations. It demonstrates the importance of understanding the full system (browser caches, proxy behavior, real network conditions) rather than optimizing in isolation.
Follow-up questions:
- How does 103 Early Hints work and how is it different from server push?
- What is the preload scanner in browsers and how does it relate to these optimizations?
- Can you think of a scenario where server push would still be beneficial?
11. How would you design rate limiting for an HTTP API?
What the interviewer is really asking: Can you implement a practical, production-ready rate limiting system using HTTP semantics?
Answer framework:
Rate limiting protects APIs from abuse and ensures fair resource allocation. A well-designed rate limiter uses HTTP standards to communicate limits clearly.
HTTP response headers for rate limiting:
- X-RateLimit-Limit: 100 — maximum requests per window.
- X-RateLimit-Remaining: 42 — requests remaining in current window.
- X-RateLimit-Reset: 1713484800 — Unix timestamp when the window resets.
- Retry-After: 30 — seconds to wait before retrying (sent with 429 responses).
Rate limiting strategies:
- Fixed window: Count requests per calendar window (per minute/hour). Simple but allows bursts at window boundaries (up to 2x at the seam of two windows).
- Sliding window log: Store the timestamp of each request. Count requests in the past N seconds. Precise but memory-intensive.
- Sliding window counter: Weighted combination of current and previous window counts. Good balance of precision and efficiency.
- Token bucket: Bucket holds N tokens, refilled at rate R. Each request consumes a token. Allows bursts up to bucket size while maintaining the average rate. Industry standard (used by Stripe, AWS).
- Leaky bucket: Requests enter a FIFO queue processed at a fixed rate. Smooths traffic completely but adds queuing latency.
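The token bucket strategy can be sketched in a few lines. This is an in-process, single-node sketch (production systems typically keep the counters in Redis so all instances share one bucket):

```python
import time

class TokenBucket:
    """Token bucket: at most `capacity` tokens, refilled at `rate` tokens/sec.
    Bursts up to capacity are allowed while the long-run rate stays bounded."""

    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity          # start full: a fresh client may burst
        self.last = time.monotonic()

    def allow(self, cost=1.0):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                    # caller should respond 429 + Retry-After

bucket = TokenBucket(capacity=5, rate=1)      # 5-request burst, 1 req/s sustained
results = [bucket.allow() for _ in range(6)]  # the sixth burst request is rejected
```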
Implementation considerations:
- What to rate limit by: API key, user ID, IP address, or a combination. IP-based is problematic behind NAT (many users share an IP).
- Distributed rate limiting: Use Redis with atomic INCR + EXPIRE. For multi-region, either centralized Redis (adds latency) or local counters with eventual sync (allows temporary overshoot).
- Differentiated limits: Different limits for different API endpoints, user tiers, or request types.
- Graceful degradation: Return 429 with Retry-After rather than silently dropping requests.
See our rate limiter system design for a complete architecture discussion.
Follow-up questions:
- How do you handle rate limiting in a multi-region deployment?
- What is the trade-off between token bucket and sliding window approaches?
- How should clients handle 429 responses? (exponential backoff with jitter)
12. Explain HTTP request smuggling and how to prevent it.
What the interviewer is really asking: Do you understand subtle HTTP parsing vulnerabilities that affect production systems?
Answer framework:
HTTP request smuggling exploits discrepancies in how front-end servers (reverse proxies, load balancers, CDNs) and back-end servers parse HTTP request boundaries.
The vulnerability: HTTP/1.1 has two ways to determine request length:
- Content-Length: 42 — body is exactly 42 bytes.
- Transfer-Encoding: chunked — body is sent in chunks, terminated by a zero-length chunk.
When a request contains both headers and the front-end and back-end servers disagree on which to use, the boundary between requests shifts. An attacker can embed a second request inside the first request's body.
Attack variants:
- CL.TE (front-end uses Content-Length, back-end uses Transfer-Encoding): Front-end forwards based on Content-Length, but the back-end parses chunks and sees part of the body as the start of a new request.
- TE.CL (front-end uses Transfer-Encoding, back-end uses Content-Length): Front-end forwards based on chunks, but the back-end uses Content-Length and treats remaining data as a new request.
- TE.TE (different Transfer-Encoding parsing): Both use Transfer-Encoding but disagree on whether obfuscated values (e.g., Transfer-Encoding: xchunked) are valid.
Impact:
- Bypass security controls (WAF, authentication) applied by the front-end.
- Poison web caches (cache a malicious response for another user's request).
- Hijack other users' requests (prepend attacker-controlled data).
Prevention:
- Use HTTP/2 end-to-end (binary framing eliminates ambiguity in request boundaries).
- Normalize requests at the front-end: reject requests with both Content-Length and Transfer-Encoding.
- Ensure front-end and back-end agree on request parsing (use the same HTTP library if possible).
- Disable HTTP/1.1 connection reuse between front-end and back-end (eliminates smuggling but costs performance).
- Use the same server software for both tiers, or test for parsing discrepancies.
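The "reject ambiguous framing" rule from the list above can be sketched as a front-end guard. Illustrative only (the function name is my own, and real servers must also handle repeated headers, header folding, and case tricks):

```python
def is_ambiguous(headers):
    """Front-end guard: flag requests whose framing two parsers could
    disagree on. `headers` is a {name: value} dict of request headers."""
    named = {k.lower().strip(): v for k, v in headers.items()}
    if "content-length" in named and "transfer-encoding" in named:
        return True  # the classic CL.TE / TE.CL setup
    te = named.get("transfer-encoding")
    if te is not None and te.strip().lower() != "chunked":
        return True  # obfuscated values like "xchunked" (TE.TE)
    return False
```

A front-end that responds 400 to anything this flags, instead of guessing, removes the parser disagreement the attack depends on.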
Follow-up questions:
- How does HTTP/2 eliminate request smuggling?
- What is HTTP/2 downgrade smuggling and how does it work?
- How would you test for request smuggling vulnerabilities in your infrastructure?
13. How do you design an API for backward compatibility?
What the interviewer is really asking: Do you think about API evolution and the long-term maintainability of HTTP APIs?
Answer framework:
Backward compatibility ensures existing clients continue to work when the API evolves. Breaking changes force all clients to update simultaneously — impractical for public APIs.
Versioning strategies:
- URL path versioning (/v1/users, /v2/users): Most common. Easy to understand, route, and cache. Downside: implies a completely new API for each version.
- Header versioning (Accept: application/vnd.api+json;version=2): Clean URLs, supports content negotiation. Harder to test (cannot just change the URL in a browser).
- Query parameter versioning (/users?version=2): Simple to use. Can complicate caching.
- No explicit versioning (evolution): Add new fields and endpoints without removing old ones. Most flexible but requires discipline.
Compatibility rules (additive changes are safe):
- Adding new fields to response objects: safe (clients should ignore unknown fields).
- Adding new optional query parameters: safe.
- Adding new endpoints: safe.
- Adding new enum values: potentially breaking if clients use strict parsing.
Breaking changes (require versioning):
- Removing or renaming fields.
- Changing field types (string to integer).
- Changing the meaning of existing fields.
- Adding required parameters.
- Changing error response format.
Migration strategies:
- Run multiple versions simultaneously during transition period.
- Set a deprecation timeline with sunset headers (Sunset: Sat, 01 Jan 2027 00:00:00 GMT).
- Monitor old version usage and communicate with consumers.
- Provide migration guides and tooling.
Follow-up questions:
- How does GraphQL handle API versioning differently from REST?
- What is the Robustness Principle and how does it apply to API design?
- How would you deprecate an API endpoint used by thousands of clients?
14. What is HTTP streaming and when should you use it?
What the interviewer is really asking: Do you know techniques for sending large or real-time data over HTTP?
Answer framework:
HTTP streaming allows sending data incrementally rather than waiting for the complete response. Several mechanisms exist:
Transfer-Encoding: chunked:
- Response is sent in chunks without knowing the total size upfront.
- Each chunk is prefixed with its size in hexadecimal.
- A zero-length chunk terminates the response.
- Use case: dynamically generated content (database query results, log streams).
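The chunked framing described above is simple to produce: a hex size line, the data, a CRLF, and a terminating zero-length chunk. A minimal encoder sketch (the helper name is my own):

```python
def encode_chunked(chunks):
    """Encode body pieces as Transfer-Encoding: chunked -- each chunk is
    its size in hex, CRLF, the bytes, CRLF; a zero-size chunk ends it."""
    out = b""
    for chunk in chunks:
        out += f"{len(chunk):x}\r\n".encode() + chunk + b"\r\n"
    return out + b"0\r\n\r\n"

encode_chunked([b"Hello, ", b"world"])
# -> b'7\r\nHello, \r\n5\r\nworld\r\n0\r\n\r\n'
```

Because the total length never appears, the server can start sending the first database rows while the query is still running.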
Server-Sent Events (SSE):
- Standard HTTP response with Content-Type: text/event-stream.
- Server sends events in a simple text format: data: {"msg": "hello"}\n\n.
- Browser auto-reconnects on disconnect (with Last-Event-ID for resumption).
- One-directional: server to client only.
- Use cases: live notifications, stock tickers, real-time dashboards.
- Advantages over WebSocket: works through HTTP proxies and CDNs, simpler protocol, auto-reconnect.
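A minimal sketch of the SSE wire format (the field names id, event, and data come from the spec; the helper itself is hypothetical):

```python
def format_sse(data, event=None, event_id=None):
    """Serialize one Server-Sent Events message. The blank line
    terminates the event; the id field is what the browser echoes
    back in Last-Event-ID when it reconnects."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    lines.append(f"data: {data}")
    return "\n".join(lines) + "\n\n"

msg = format_sse('{"msg": "hello"}', event="chat", event_id=42)
assert msg == 'id: 42\nevent: chat\ndata: {"msg": "hello"}\n\n'
```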
HTTP/2 streaming:
- Response body frames can be sent incrementally on a stream.
- Multiplexed with other streams on the same connection.
- Better for structured streaming APIs.
gRPC streaming:
- Built on HTTP/2 streams.
- Supports unary, server-streaming, client-streaming, and bidirectional streaming.
- Strongly typed with Protocol Buffers.
- Use case: microservice communication with streaming data.
When to use HTTP streaming:
- Large responses that take time to generate (export CSV of millions of rows).
- Real-time server-to-client updates (SSE is simpler than WebSocket for this).
- AI/LLM responses (stream tokens as they are generated, as used by ChatGPT).
- Log tailing and monitoring dashboards.
When NOT to use:
- When clients need random access to the data (use pagination instead).
- When intermediaries (proxies, CDNs) buffer the entire response.
- When response size is small and latency to generate is negligible.
Follow-up questions:
- How do load balancers handle long-lived HTTP streaming connections?
- What is the difference between HTTP streaming and WebSocket for real-time data?
- How would you implement resumable streaming for a large data export?
15. How does HTTP/3 (QUIC) change the networking landscape?
What the interviewer is really asking: Are you current with protocol evolution and can you reason about its implications?
Answer framework:
HTTP/3 replaces TCP with QUIC, a new transport protocol built on UDP (the name was originally an acronym for Quick UDP Internet Connections; RFC 9000 now treats QUIC simply as a name). This is the most significant change to the web transport stack since TCP.
Key architectural changes:
- UDP-based: QUIC runs over UDP. This is not raw UDP — QUIC implements reliability, ordering, flow control, and congestion control. Using UDP allows deployment without waiting for middlebox (NAT, firewall) updates that would block a new IP protocol.
- Integrated TLS 1.3: Encryption is not optional — it is built into QUIC. The handshake combines transport and TLS setup in a single round trip. This means all QUIC traffic is encrypted, including headers that were plaintext in TCP.
- Independent streams: Unlike TCP where a single lost packet blocks all multiplexed HTTP/2 streams, QUIC provides independent stream loss recovery. Stream A's lost packet does not block Stream B.
- Connection migration: QUIC connections are identified by a Connection ID, not the IP/port 4-tuple. When a mobile user switches from WiFi to cellular, the connection survives — the Connection ID remains the same even though the IP address changes.
- Improved loss recovery: QUIC uses monotonically increasing packet numbers (never reused, unlike TCP sequence numbers), making loss detection unambiguous. Better RTT estimation leads to more responsive congestion control.
Practical impact:
- Mobile users: fewer connection interruptions, faster recovery from network changes.
- Lossy networks: independent stream recovery means a single packet loss does not stall the entire page load.
- High-latency networks: 1-RTT connection setup vs 2-RTT for TCP + TLS 1.3 (and 0-RTT for resumed QUIC connections).
Deployment challenges:
- UDP is sometimes blocked or rate-limited by enterprise firewalls and ISPs.
- CPU cost: QUIC is currently more CPU-intensive than TCP (kernel vs userspace processing).
- Observability: traditional network tools (tcpdump, Wireshark) have limited QUIC support since most of the packet is encrypted.
- Middlebox interference: NATs, firewalls, and load balancers may need updates to handle QUIC properly.
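One piece of the graceful-fallback story is the Alt-Svc header (RFC 7838): servers advertise HTTP/3 over an existing TCP connection, and clients attempt QUIC but retry over TCP if UDP is blocked. A deliberately simplified parser sketch (ignores ma= and persist= parameters; real clients also cache and expire these advertisements):

```python
import re

def parse_alt_svc(header):
    """Extract (protocol, authority) pairs from an Alt-Svc header,
    e.g. 'h3=":443"; ma=86400, h2=":443"' -> [('h3', ':443'), ('h2', ':443')]."""
    return [(m.group(1), m.group(2))
            for m in re.finditer(r'([\w.-]+)="([^"]*)"', header)]

# A client seeing h3 here may race a QUIC connection against TCP
# and silently fall back to HTTP/2 if the UDP path fails.
svc = parse_alt_svc('h3=":443"; ma=86400, h2=":443"')
assert ("h3", ":443") in svc and ("h2", ":443") in svc
```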
Adoption:
- Google has been using QUIC since 2013. Chrome and Firefox support HTTP/3.
- Major CDNs (Cloudflare, Fastly, Akamai) support HTTP/3.
- curl, nginx, and Caddy support HTTP/3.
For networking fundamentals, see our networking interview questions. For CDN implications, see CDN interview questions.
Follow-up questions:
- Why is QUIC implemented in userspace rather than the kernel?
- How do you fall back gracefully when UDP is blocked?
- What is the CPU overhead of QUIC compared to TCP and how is it being addressed?
Common Mistakes
- Treating HTTP as just GET and POST. Senior engineers should use the full vocabulary of methods, headers, and status codes to design expressive, cache-friendly APIs.
- Not understanding caching headers. Improper Cache-Control headers either serve stale content or defeat caching entirely. Both are expensive mistakes at scale.
- Ignoring HTTP security headers. CSP, HSTS, X-Content-Type-Options, and SameSite cookies are not optional for production applications.
- Over-abstracting HTTP. Using only a framework's abstractions (routes, middleware) without understanding the underlying protocol limits your debugging ability and design judgment.
- Confusing authentication with authorization. 401 Unauthorized actually means unauthenticated (who are you?), while 403 Forbidden means unauthorized (you cannot do this). Getting this wrong confuses API consumers.
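The 401-versus-403 distinction in the last point can be pinned down in a few lines (a hypothetical helper; the user/role shape is an assumption):

```python
def auth_status(user, required_role):
    """401 when we do not know who the caller is; 403 when we do
    know but they lack permission (authentication vs authorization)."""
    if user is None:
        return 401  # "Unauthorized" in the spec, but really: unauthenticated
    if required_role not in user.get("roles", []):
        return 403  # Forbidden: identity is known, access is denied
    return 200

assert auth_status(None, "admin") == 401
assert auth_status({"roles": ["viewer"]}, "admin") == 403
assert auth_status({"roles": ["admin"]}, "admin") == 200
```

A practical corollary: a 401 response should include a WWW-Authenticate header inviting the client to authenticate, whereas retrying a 403 with the same credentials is pointless.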
How to Prepare
Week 1: Read the key sections of the HTTP/1.1 and HTTP/2 RFCs (RFC 9110-9113). Understand method semantics, status codes, and caching headers.
Week 2: Study HTTP/3 and QUIC. Understand the problems they solve and deployment challenges.
Week 3: Practice API design — design RESTful APIs for real systems with proper methods, status codes, caching, and versioning.
Week 4: Study security — CORS, CSP, cookie security, request smuggling, and TLS.
For comprehensive preparation, see our system design interview guide and explore learning paths.
Related Resources
- System Design Interview Guide — Complete preparation framework
- Networking Interview Questions — TCP/UDP/DNS deep dive
- CDN Interview Questions — Caching and content delivery
- OAuth/Authentication Interview Questions — Identity and access
- Security Interview Questions — Application security
- Rate Limiter System Design — Rate limiting architecture
- Chat System Design — Real-time communication
- Distributed Systems Guide — Core distributed concepts
- Start Learning — Structured learning paths
- Pricing — Premium preparation resources
GO DEEPER
Master this topic in our 12-week cohort
Our Advanced System Design cohort covers this and 11 other deep-dive topics with live sessions, assignments, and expert feedback.