SQL 10 : Window Functions

D

Qubits of DPK

March 31, 2026

Core DBMS

Topics Covered

  • CTEs (Common Table Expressions)
  • Window Functions — Problem Statement
  • Window Functions Intro + OVER()
  • OVER + PARTITION BY (Deep Dive)
  • ROW_NUMBER()
  • RANK() & DENSE_RANK()
  • LEAD() and LAG()

Topic 1 — CTEs (Common Table Expressions)

What is a CTE?

A named temporary result set that exists only for the duration of a single query. Defined using the WITH clause.

Basic Syntax

sql
QUBITS OF DPK
1WITH cte_name AS (
2    SELECT ...
3)
4SELECT * FROM cte_name;

Multiple CTEs

sql
QUBITS OF DPK
1WITH
2frequent_customers AS (
3    SELECT customer_id, COUNT(*) AS total_orders
4    FROM orders
5    GROUP BY customer_id
6    HAVING COUNT(*) >= 3
7),
8high_spenders AS (
9    SELECT customer_id, SUM(total_amount) AS total_spent
10    FROM orders
11    GROUP BY customer_id
12    HAVING SUM(total_amount) >= 1000
13)
14SELECT c.customer_name, c.city, fc.total_orders, hs.total_spent
15FROM customers c
16JOIN frequent_customers fc ON c.customer_id = fc.customer_id
17JOIN high_spenders hs ON c.customer_id = hs.customer_id;

When You MUST Use CTE

sql
QUBITS OF DPK
1-- FAILS ❌
2SELECT *, DENSE_RANK() OVER (ORDER BY total_amount DESC) AS rnk
3FROM orders
4WHERE rnk <= 3;
5
6-- CORRECT ✅
7WITH ranked_orders AS (
8    SELECT *, DENSE_RANK() OVER (ORDER BY total_amount DESC) AS rnk
9    FROM orders
10)
11SELECT * FROM ranked_orders WHERE rnk <= 3;

CTE vs Subquery vs View

Recursive CTE

sql
QUBITS OF DPK
1WITH RECURSIVE employee_chain AS (
2    SELECT employee_id, employee_name, manager_id, 1 AS level
3    FROM employees
4    WHERE manager_id IS NULL
5
6    UNION ALL
7
8    SELECT e.employee_id, e.employee_name, e.manager_id, ec.level + 1
9    FROM employees e
10    JOIN employee_chain ec ON e.manager_id = ec.employee_id
11)
12SELECT * FROM employee_chain ORDER BY level;

Interview Traps ️

  • CTE is temporary — NOT a permanent object like a View
  • CTE can be referenced multiple times — subquery cannot
  • Window function results need CTE wrapper before WHERE filter
  • Recursive CTE requires UNION ALL not UNION

Topic 2 — Window Functions: The Problem Statement

The Core Problem

sql
QUBITS OF DPK
1GROUP BY (collapses rows):
2customer_id | total_revenue
31           | 1070   ← individual orders GONE
42           | 500
5
6Window Function (keeps all rows):
7order_id | customer_id | total_amount | customer_total
81        | 1           | 450          | 1070row preserved
96        | 1           | 620          | 1070row preserved
102        | 2           | 320          | 500
117        | 2           | 180          | 500

Categories of Window Functions

sql
QUBITS OF DPK
11. Aggregate → SUM(), AVG(), COUNT(), MAX(), MIN()
22. Ranking   → ROW_NUMBER(), RANK(), DENSE_RANK()
33. Value     → LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()

GROUP BY vs Window Functions

Topic 3 — Window Functions Intro + OVER()

The Magic Keyword: OVER()

sql
QUBITS OF DPK
1-- Regular aggregate (collapses rows)
2SELECT SUM(total_amount) FROM orders;
3Returns ONE number
4
5-- Window function (keeps all rows)
6SELECT SUM(total_amount) OVER() FROM orders;
7Returns same number next to EVERY row

OVER() with No Arguments

sql
QUBITS OF DPK
1SELECT order_id, total_amount,
2    SUM(total_amount) OVER() AS grand_total
3FROM orders;
4-- grand_total = 2790 on every row

OVER(ORDER BY) — Running Total

sql
QUBITS OF DPK
1SELECT order_id, total_amount,
2    SUM(total_amount) OVER(ORDER BY order_id) AS running_total
3FROM orders;
Result:
sql
QUBITS OF DPK
1order_id | total_amount | running_total
21        | 450          | 450
32        | 320          | 770
43        | 550          | 1320
54        | 280          | 1600
65        | 390          | 1990
76        | 620          | 2610
87        | 180          | 2790

Interview Traps ️

  • OVER() alone = entire table as one window
  • OVER(ORDER BY) = cumulative calculation
  • Window function result CANNOT go in WHERE → must use CTE
  • PARTITION BY ≠ GROUP BY → PARTITION BY never collapses rows

Topic 4 — OVER + PARTITION BY (Deep Dive)

What PARTITION BY Does

Divides table into groups, calculates separately per group. All rows preserved.
sql
QUBITS OF DPK
1SELECT order_id, customer_id, total_amount,
2    SUM(total_amount) OVER(PARTITION BY customer_id) AS customer_total
3FROM orders;
Result:
sql
QUBITS OF DPK
1order_id | customer_id | total_amount | customer_total
21        | 1           | 450          | 1070
36        | 1           | 620          | 1070  ← same partition
42        | 2           | 320          | 500
57        | 2           | 180          | 500

PARTITION BY + ORDER BY = Running Total Per Group

sql
QUBITS OF DPK
1SELECT order_id, customer_id, total_amount,
2    SUM(total_amount) OVER(
3        PARTITION BY customer_id
4        ORDER BY order_id
5    ) AS customer_running_total
6FROM orders;
Result:
sql
QUBITS OF DPK
1order_id | customer_id | total_amount | running_total
21        | 1           | 450          | 450   ← customer 1 starts
36        | 1           | 620          | 1070  ← customer 1 cumulative
42        | 2           | 320          | 320   ← customer 2 resets
57        | 2           | 180          | 500

The 3 Window Sizes

sql
QUBITS OF DPK
1OVER()                          → entire table
2OVER(PARTITION BY city)         → one per city, rows preserved
3OVER(PARTITION BY city          → cumulative per city
4     ORDER BY order_id)

Key Rule

sql
QUBITS OF DPK
1Without ORDER BY inside OVERfull partition aggregate (same for all rows)
2With ORDER BY inside OVER    → cumulative (grows row by row)

Interview Traps ️

  • PARTITION BY ≠ GROUP BY → never collapses rows
  • Multiple PARTITION BY columns → window per unique combination
  • ORDER BY inside OVER changes behavior completely

Topic 5 — ROW_NUMBER()

What is ROW_NUMBER()?

Assigns a unique sequential number to each row within a partition.
sql
QUBITS OF DPK
1ROW_NUMBER() OVER (
2    PARTITION BY column   -- restart numbering per group
3    ORDER BY column       -- defines the order (MANDATORY)
4)

Basic Example

sql
QUBITS OF DPK
1SELECT order_id, customer_id, total_amount,
2    ROW_NUMBER() OVER (
3        PARTITION BY customer_id
4        ORDER BY total_amount DESC
5    ) AS row_num
6FROM orders;
Result:
sql
QUBITS OF DPK
1order_id | customer_id | total_amount | row_num
26        | 1           | 620          | 1  ← customer 1 starts
31        | 1           | 450          | 2
42        | 2           | 320          | 1  ← customer 2 resets
57        | 2           | 180          | 2

Most Asked Pattern — Latest Record Per Group

sql
QUBITS OF DPK
1WITH ranked_orders AS (
2    SELECT order_id, customer_id, total_amount, order_date,
3        ROW_NUMBER() OVER (
4            PARTITION BY customer_id
5            ORDER BY order_date DESC
6        ) AS row_num
7    FROM orders
8)
9SELECT order_id, customer_id, total_amount, order_date
10FROM ranked_orders
11WHERE row_num = 1;

De-duplication Pattern — LeetCode #196

sql
QUBITS OF DPK
1WITH ranked AS (
2    SELECT id, email,
3        ROW_NUMBER() OVER (
4            PARTITION BY email
5            ORDER BY id ASC
6        ) AS row_num
7    FROM person
8)
9DELETE FROM person
10WHERE id IN (
11    SELECT id FROM ranked WHERE row_num > 1
12);

Key Properties

sql
QUBITS OF DPK
1✅ Always generates UNIQUE numbers — no ties ever
2✅ Numbering starts at 1 within each partition
3ORDER BY inside OVER() is MANDATORY
4❌ Does NOT handle ties — use RANK or DENSE_RANK for that

Interview Traps ️

  • ROW_NUMBER() requires ORDER BY inside OVER()
  • ROW_NUMBER() NEVER produces ties — even same values get different numbers
  • Cannot filter result in WHERE → wrap in CTE first
  • "Latest record per group" = ROW_NUMBER + PARTITION BY + ORDER BY DESC + WHERE row_num = 1

Topic 6 — RANK() & DENSE_RANK()

The Problem with ROW_NUMBER() for Rankings

Two employees with same salary get different numbers arbitrarily. Unfair. RANK and DENSE_RANK handle ties properly.

ROW_NUMBER vs RANK vs DENSE_RANK

sql
QUBITS OF DPK
1ROW_NUMBER  → always unique, ties broken arbitrarily
2RANK        → ties get same rank, GAP after tie (1,1,3)
3DENSE_RANK  → ties get same rank, NO gap (1,1,2)

RANK() Example

sql
QUBITS OF DPK
1SELECT customer_id, total_amount,
2    RANK() OVER (ORDER BY total_amount DESC) AS rnk
3FROM orders;

DENSE_RANK() Example

sql
QUBITS OF DPK
1SELECT customer_id, total_amount,
2    DENSE_RANK() OVER (ORDER BY total_amount DESC) AS dense_rnk
3FROM orders;

Top N Per Group — Most Asked MAANG Pattern

sql
QUBITS OF DPK
1WITH ranked AS (
2    SELECT order_id, customer_id, total_amount,
3        DENSE_RANK() OVER (
4            PARTITION BY customer_id
5            ORDER BY total_amount DESC
6        ) AS rnk
7    FROM orders
8)
9SELECT order_id, customer_id, total_amount
10FROM ranked
11WHERE rnk <= 2;

When to Use Which

sql
QUBITS OF DPK
1ROW_NUMBER  → De-duplication, pagination, latest record per group
2RANK        → Competitions where gaps matter (1st, 1st, 3rd)
3DENSE_RANK  → Top-N problems, leaderboards without gaps

Interview Traps ️

  • RANK leaves gaps after ties, DENSE_RANK never does
  • For Top-N always use DENSE_RANK — RANK may skip ranks at boundary
  • All three require ORDER BY inside OVER()
  • All three results need CTE wrapper before WHERE filter

Topic 7 — LEAD() and LAG()

What Are LEAD and LAG?

Access values from other rows relative to current row — without a self join.
sql
QUBITS OF DPK
1LAG(col, n)value from n rows BEFORE current row
2LEAD(col, n)value from n rows AFTER current row
3Default n = 1

LAG() — Compare with Previous Row

sql
QUBITS OF DPK
1SELECT order_id, total_amount, order_date,
2    LAG(total_amount, 1) OVER (ORDER BY order_date) AS prev_amount,
3    total_amount - LAG(total_amount, 1) OVER (ORDER BY order_date) AS diff
4FROM orders;
Result:
sql
QUBITS OF DPK
1order_id | total_amount | prev_amount | diff
21        | 450          | NULL        | NULLno previous row
32        | 320          | 450         | -130
43        | 550          | 320         | +230
54        | 280          | 550         | -270

LEAD() — Compare with Next Row

sql
QUBITS OF DPK
1SELECT order_id, total_amount,
2    LEAD(total_amount, 1) OVER (ORDER BY order_date) AS next_amount
3FROM orders;

LAG with Default Value — Avoid NULL

sql
QUBITS OF DPK
1SELECT order_id, total_amount,
2    LAG(total_amount, 1, 0) OVER (ORDER BY order_id) AS prev_amount
3    -- 0 returned when no previous row exists
4FROM orders;

LAG with PARTITION BY — Per Customer

sql
QUBITS OF DPK
1SELECT order_id, customer_id, total_amount,
2    LAG(total_amount) OVER (
3        PARTITION BY customer_id
4        ORDER BY order_date
5    ) AS prev_customer_order
6FROM orders;
7-- Each customer's LAG resets independently

Rising Temperature — LeetCode #197 (LAG approach)

sql
QUBITS OF DPK
1WITH weather_with_prev AS (
2    SELECT id, recordDate, temperature,
3        LAG(temperature) OVER (ORDER BY recordDate) AS prev_temp,
4        LAG(recordDate) OVER (ORDER BY recordDate) AS prev_date
5    FROM Weather
6)
7SELECT id
8FROM weather_with_prev
9WHERE temperature > prev_temp
10AND DATEDIFF(recordDate, prev_date) = 1;

Interview Traps ️

  • LAG/LEAD return NULL for first/last rows by default → provide default value
  • LAG with PARTITION BY resets per partition
  • ORDER BY inside OVER() is MANDATORY
  • LAG/LEAD result cannot be in WHERE → use CTE

Quick Reference — All Window Functions

Window Frame Reference

sql
QUBITS OF DPK
1ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW  -- start to current
2ROWS BETWEEN 2 PRECEDING AND CURRENT ROW          -- last 3 rows
3ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING          -- sliding window
4ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING  -- current to end

LeetCode Problems — Window Functions