Blog 9 — Cloudflare: How We Handle 100B DNS Queries/Day

Qubits of DPK

March 21, 2026

Core Case Studies
Core Concept: Anycast routing, DDoS prevention, Edge computing, Rate limiting
Why SDE-2 Critical: Rate limiter design is one of the most commonly asked system design questions
Status: Draft notes ready

Quick Revision

  • Problem: Serve global traffic fast while absorbing abuse and spikes.
  • Core pattern: Anycast plus edge rate limiting and layered DDoS defense.
  • Interview one-liner: Put decision-making close to the user and spread attack traffic across the network.

Architecture Overview

    User's browser: "What is the IP of google.com?"
            ↓
    Anycast Network → routes to nearest Cloudflare PoP
    (250+ Points of Presence globally)
            ↓
    Cloudflare Edge Server
      ├── Check cache → return if hit
      ├── Rate limit check → block if exceeded
      └── Forward to origin if cache miss
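
The three steps on the edge server can be sketched in a few lines of Python. This is a toy model only: `EdgeServer`, `limit_per_client`, and the dictionary standing in for the origin are hypothetical names, and the simple counter is a placeholder for a real rate limiter.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeServer:
    """Toy model of one PoP: cache first, then rate limit, then origin."""
    limit_per_client: int                      # hypothetical per-client request budget
    cache: dict = field(default_factory=dict)  # query name -> cached answer
    seen: dict = field(default_factory=dict)   # client IP -> uncached requests so far

    def handle(self, client_ip: str, name: str, origin: dict) -> str:
        # Step 1: a cache hit is answered immediately.
        if name in self.cache:
            return f"HIT {self.cache[name]}"
        # Step 2: count the request and block clients over their budget.
        self.seen[client_ip] = self.seen.get(client_ip, 0) + 1
        if self.seen[client_ip] > self.limit_per_client:
            return "BLOCKED"
        # Step 3: cache miss, so ask the origin and remember the answer.
        answer = origin.get(name, "NXDOMAIN")
        self.cache[name] = answer
        return f"MISS {answer}"


origin = {"example.com": "192.0.2.10", "example.org": "192.0.2.20"}
edge = EdgeServer(limit_per_client=1)
first = edge.handle("198.51.100.7", "example.com", origin)   # miss, goes to origin
second = edge.handle("198.51.100.7", "example.com", origin)  # served from cache
third = edge.handle("198.51.100.7", "example.org", origin)   # over budget
```

Only cache misses count against the budget here, which mirrors the order of the steps in the diagram.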

Core Concepts

Anycast Routing

    Unicast (normal): Each server has a unique IP
      User in Mumbai → DNS query → routes to a specific server in the US
      → High latency 👎

    Anycast: Multiple servers share the SAME IP
      1.1.1.1 is announced by 250+ Cloudflare PoPs
      User in Mumbai → DNS query to 1.1.1.1
      → BGP routing → nearest PoP responds (Mumbai PoP)
      → Low latency 👍

    Also: DDoS attack on 1.1.1.1 → traffic spread across ALL PoPs
      → No single server overwhelmed
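
A rough way to picture the routing decision: every PoP announces the same address, and the user's network picks whichever announcement has the lowest routing cost. The sketch below assumes a single integer cost per PoP (for example, AS-path length); real BGP path selection has more tie-breakers, and the PoP names and costs are made up for illustration.

```python
def pick_pop(path_cost: dict[str, int], healthy: set[str]) -> str:
    """Return the healthy PoP with the lowest routing cost from one user's network."""
    candidates = {pop: cost for pop, cost in path_cost.items() if pop in healthy}
    if not candidates:
        raise RuntimeError("no PoP is announcing the address")
    return min(candidates, key=candidates.get)


# Hypothetical routing costs as seen from a network in Mumbai.
costs_from_mumbai = {"Mumbai": 1, "Singapore": 3, "Frankfurt": 5}
all_pops = set(costs_from_mumbai)

nearest = pick_pop(costs_from_mumbai, all_pops)
# If the Mumbai PoP stops announcing 1.1.1.1, traffic shifts to the next best path.
after_failure = pick_pop(costs_from_mumbai, all_pops - {"Mumbai"})
```

The same mechanism explains failover: a PoP that withdraws its announcement simply drops out of the candidate set.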

DDoS Prevention at Scale

    100B DNS queries/day = 1.16M queries/second (average)
    DDoS spikes: 10-50x normal traffic

    Defense layers:
      Layer 1: Anycast absorbs flood across 250+ PoPs
      Layer 2: Rate limiting per IP/subnet
      Layer 3: Pattern detection (amplification attacks)
      Layer 4: BGP blackholing (null-route attack source)
      Layer 5: Traffic scrubbing (clean traffic forwarded, junk dropped)
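
The headline numbers are easy to sanity-check. The sketch below redoes the queries-per-second arithmetic and then estimates per-PoP load, assuming traffic spreads evenly across 250 PoPs. That even spread is a simplifying assumption for illustration; real traffic is skewed by geography.

```python
QUERIES_PER_DAY = 100_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60
POPS = 250  # lower bound quoted in the notes

average_qps = QUERIES_PER_DAY / SECONDS_PER_DAY


def per_pop_qps(spike_multiplier: float) -> float:
    """Average load on one PoP if traffic were spread evenly (assumption)."""
    return average_qps * spike_multiplier / POPS


normal_per_pop = per_pop_qps(1)   # about 4.6K queries/second per PoP
spike_per_pop = per_pop_qps(50)   # about 231K queries/second per PoP at a 50x spike
```

Even at a 50x spike, the per-PoP figure stays far below the global total, which is the point of Layer 1.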

Rate Limiting — Token Bucket Algorithm

    Token Bucket:
      Bucket capacity: 100 tokens
      Refill rate: 10 tokens/second

      Request arrives:
      If tokens ≥ 1 → consume 1 token, allow request
      If tokens = 0 → reject request (rate limited)

    Example:
      Normal user: 5 req/sec → bucket never empties → allowed
      DDoS bot:    10000 req/sec → bucket empties in ~10 ms → throttled to the 10 req/sec refill rate
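
A minimal in-memory sketch of the same bucket, using the capacity and refill rate from the notes. Timestamps are passed in explicitly so the behaviour is easy to reason about; the class and method names are illustrative, not taken from any library.

```python
class TokenBucket:
    def __init__(self, capacity: float = 100, refill_rate: float = 10, now: float = 0.0):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full
        self.last_refill = now

    def allow(self, now: float) -> bool:
        # Refill lazily based on elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        # Spend one token if available, otherwise reject.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Normal user: 5 requests/second for one minute.
user = TokenBucket()
user_allowed = sum(user.allow(i * 0.2) for i in range(300))

# Bot: 10,000 requests in the same instant, then 10,000 more one second later.
bot = TokenBucket()
bot_first_burst = sum(bot.allow(0.0) for _ in range(10_000))
bot_second_burst = sum(bot.allow(1.0) for _ in range(10_000))
```

The bot gets its initial 100-token burst and after that only what the refill rate provides, while the normal user never notices the limiter.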

Sliding Window Rate Limiting (More Accurate)

    Fixed Window problem:
      Window: 10:00:00 - 10:01:00, limit = 100
      99 requests at 10:00:59
      99 requests at 10:01:01
      → 198 requests in 2 seconds: window boundary exploit!

    Sliding Window fix:
      At any point in time, count requests in the last 60 seconds
      If > 100 → rate limited
      No boundary exploit possible
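
The boundary exploit can be reproduced with two small limiters. The sketch below is in-memory and uses seconds since 10:00:00 as timestamps; the sliding version keeps a log of request times, which is one common way to implement the idea (class names are illustrative).

```python
from collections import deque


class FixedWindowLimiter:
    def __init__(self, limit: int, window: int):
        self.limit, self.window = limit, window
        self.current_window = None
        self.count = 0

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if window_id != self.current_window:  # new window: counter resets
            self.current_window, self.count = window_id, 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


class SlidingWindowLimiter:
    def __init__(self, limit: int, window: int):
        self.limit, self.window = limit, window
        self.timestamps = deque()  # accepted request times, oldest first

    def allow(self, now: float) -> bool:
        # Forget requests that fell out of the last `window` seconds.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False
        self.timestamps.append(now)
        return True


def burst(limiter) -> int:
    # 99 requests at second 59 and 99 more at second 61.
    times = [59.0] * 99 + [61.0] * 99
    return sum(limiter.allow(t) for t in times)


fixed_allowed = burst(FixedWindowLimiter(limit=100, window=60))
sliding_allowed = burst(SlidingWindowLimiter(limit=100, window=60))
```

The fixed window lets all 198 requests through, while the sliding window stops at 100. The cost is memory: the log stores one timestamp per accepted request.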

Edge Computing

    Traditional:
      User → [internet] → Origin Server (US) → response
      Latency: 200ms+ for users far from origin

    Edge Computing:
      User → Nearest PoP (runs computation) → response
      Latency: 10-20ms

    Cloudflare Workers: Run JavaScript AT the edge PoP
    → Authentication, A/B testing, personalization
    → All without hitting origin server
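
As an illustration of logic that can run at the edge without an origin round trip, here is a sketch of deterministic A/B assignment. Workers themselves run JavaScript; the Python below only shows the idea, and the function name, experiment name, and user IDs are hypothetical.

```python
import hashlib


def ab_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Assign a user to a variant using only data available in the request."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < treatment_share else "control"


# The same user always lands in the same variant, at any PoP, with no shared state.
first_visit = ab_variant("user-42", "new-checkout")
second_visit = ab_variant("user-42", "new-checkout")
```

Because the assignment is a pure function of the user ID and experiment name, every PoP reaches the same answer independently.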

Scale Achieved

5 Interview Questions This Blog Unlocks

Q1. Design a rate limiter

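Answer: Use Token Bucket or a sliding window. Store state in Redis (fast, shared across servers), keyed by userId or IP. Token Bucket: Redis stores tokens + last_refill_time; each request atomically refills, then decrements tokens, and is rejected if none remain. Sliding Window Log: a Redis sorted set of request timestamps; drop entries older than N seconds and count the rest. Sliding Window Counter: a cheaper approximation that keeps only per-window counts and weights the previous window by how much of it still overlaps the last N seconds.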
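
The Sliding Window Counter variant is worth seeing once, because it trades accuracy for memory. The sketch below is in-memory and keeps only two counts per key; in a Redis-backed version these would presumably live in two keys (an assumption, not something the notes specify). It estimates the rolling count as `previous * overlap + current`.

```python
class SlidingWindowCounter:
    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.window_id = 0
        self.current = 0   # requests accepted in the current fixed window
        self.previous = 0  # requests accepted in the window before it

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if window_id != self.window_id:
            # Carry the old count forward only if it belongs to the adjacent window.
            self.previous = self.current if window_id == self.window_id + 1 else 0
            self.current = 0
            self.window_id = window_id
        # Fraction of the previous window still inside the last `window` seconds.
        overlap = 1.0 - (now % self.window) / self.window
        estimated = self.previous * overlap + self.current
        if estimated >= self.limit:
            return False
        self.current += 1
        return True


limiter = SlidingWindowCounter(limit=100, window=60)
times = [59.0] * 99 + [61.0] * 99   # the window-boundary burst
allowed = sum(limiter.allow(t) for t in times)
```

With this burst the estimate admits 102 requests instead of exactly 100: close to the limit, at the cost of two integers instead of a full timestamp log.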

Q2. What is Anycast routing and why does Cloudflare use it?

Answer: Multiple servers announce the same IP. BGP routing sends each user to the nearest server by network path, which is usually (but not always) the geographically closest one. Benefits: low latency (nearest server responds), DDoS absorption (attack spread across all PoPs), automatic failover (if one PoP dies, users route to next nearest).

Q3. What is the difference between Token Bucket and Leaky Bucket?

Answer: Token Bucket allows bursts up to the bucket capacity: tokens accumulate while traffic is low, and a burst is served until the tokens run out. Leaky Bucket processes requests at a constant rate regardless of how they arrive: a queue absorbs the burst and drains steadily, and requests that find the queue full are dropped. Token Bucket = bursty OK. Leaky Bucket = smooth output.
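
To make the "smooth output" half concrete, here is a sketch of a leaky bucket modelled as a queue. It returns the time at which each accepted request would be processed. The class name, the capacity of 5, and the drain rate of 10/second are illustrative assumptions.

```python
from collections import deque


class LeakyBucket:
    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity           # maximum requests waiting in the queue
        self.interval = 1.0 / leak_rate    # seconds between processed requests
        self.departures = deque()          # scheduled processing times of queued requests
        self.last_departure = float("-inf")

    def offer(self, now: float):
        """Return the time the request will be processed, or None if it is dropped."""
        # Requests whose slot has passed are no longer waiting.
        while self.departures and self.departures[0] <= now:
            self.departures.popleft()
        if len(self.departures) >= self.capacity:
            return None  # queue full: overflow is dropped
        # Output is spaced at least one interval apart, no matter how fast input arrives.
        departure = max(now, self.last_departure + self.interval)
        self.last_departure = departure
        self.departures.append(departure)
        return departure


bucket = LeakyBucket(capacity=5, leak_rate=10)
results = [bucket.offer(0.0) for _ in range(20)]   # burst of 20 at t=0
accepted = [t for t in results if t is not None]
dropped = results.count(None)
```

A token bucket of similar size would pass the accepted part of the burst immediately; here the accepted requests leave roughly 0.1 seconds apart.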

Q4. How would you implement a distributed rate limiter across multiple servers?

Answer: Centralized Redis with atomic Lua scripts. All app servers check Redis for the rate limit state. Redis INCR with TTL for fixed window. Redis sorted sets for sliding window. Lua script ensures check-and-update is atomic (no race conditions). Redis Cluster for high availability.
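
Why the atomic check-and-update matters can be shown without a Redis server. In the sketch below, `SharedCounterStore` is an in-memory stand-in for the central store (an assumption, not a Redis client), the lock plays the role the Lua script plays in Redis, and four threads act as app servers hitting the same fixed-window key.

```python
import threading


class SharedCounterStore:
    """In-memory stand-in for a central store such as Redis (not a real client)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}  # key -> (count, expires_at)

    def check_and_increment(self, key: str, limit: int, ttl: float, now: float) -> bool:
        # Check and update happen as one step, like INCR plus TTL inside a Lua script.
        with self._lock:
            count, expires_at = self._counters.get(key, (0, now + ttl))
            if now >= expires_at:          # window expired: start a fresh one
                count, expires_at = 0, now + ttl
            if count >= limit:
                return False
            self._counters[key] = (count + 1, expires_at)
            return True


def app_server(store: SharedCounterStore, results: list, requests: int) -> None:
    for _ in range(requests):
        results.append(store.check_and_increment("ip:203.0.113.9", 50, 60.0, 0.0))


store, results = SharedCounterStore(), []
servers = [threading.Thread(target=app_server, args=(store, results, 100)) for _ in range(4)]
for s in servers:
    s.start()
for s in servers:
    s.join()
allowed = sum(results)
```

All four servers together admit exactly 50 of the 400 requests. Without the atomic step, two servers could both read a count of 49 and both allow a request.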

Q5. What is a DNS amplification attack and how does Cloudflare prevent it?

Answer: Attacker sends a small DNS query with a spoofed source IP (the victim's IP). The DNS server sends a much larger response to the victim. For example, a ~60-byte query that triggers a ~3,000-byte response gives roughly 50x amplification. Cloudflare prevents this with: response rate limiting, rejecting queries for non-existent domains, BCP38 filtering (source-address validation) at network edges.
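
The amplification factor is just the ratio of response size to query size, and it scales the attacker's bandwidth directly. The sizes and bandwidth below are illustrative assumptions, not measurements.

```python
def amplification_factor(query_bytes: int, response_bytes: int) -> float:
    """How many bytes reach the victim for each byte the attacker sends."""
    return response_bytes / query_bytes


def victim_bandwidth_mbps(attacker_mbps: float, factor: float) -> float:
    """Traffic arriving at the victim for a given attacker uplink."""
    return attacker_mbps * factor


factor = amplification_factor(query_bytes=60, response_bytes=3000)
flood = victim_bandwidth_mbps(attacker_mbps=100, factor=factor)  # 100 Mbps uplink
```

Under these assumptions a 100 Mbps uplink turns into 5 Gbps at the victim, which is why limiting large responses and blocking spoofed sources both matter.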

Key Engineering Lessons