SQL 4 : Joins

D

Qubits of DPK

April 11, 2026

Core DBMS

1. Recap of SQL 3

Previously we learned how to filter data with WHERE, remove duplicates with DISTINCT, search patterns with LIKE, sort results with ORDER BY, paginate with LIMIT / OFFSET, and update or delete data. Now we move to one of the most important concepts in relational databases — combining data from multiple tables using Joins.

2. Why Joins Are Needed

In real systems, data is deliberately spread across multiple tables to avoid redundancy. A customer's name is stored once in a customers table, and their orders are stored separately in an orders table — linked by a shared key. This design is clean and efficient, but it means that whenever you need information from both tables together, you have to join them.
Consider this example. You have a customers table and an orders table:
Customers
Orders
If you want to see each customer's name alongside their order amount, neither table alone can answer that question. You need to bring them together — and that is exactly what a Join does.

3. What is a Join?

A Join combines rows from two or more tables based on a related column they share. The relationship here is orders.customer_id → customers.customer_id. The general syntax follows this pattern:
sql
QUBITS OF DPK
1SELECT columns
2FROM table1
3JOIN table2
4ON table1.column = table2.column;
The ON clause is where you define how the two tables relate to each other. Getting this right is everything — a wrong join condition produces wrong results, often silently.

4. INNER JOIN

An INNER JOIN returns only the rows that have a match in both tables. Rows that exist in one table but not the other are completely excluded from the result.
sql
QUBITS OF DPK
1SELECT customers.name, orders.order_id
2FROM customers
3INNER JOIN orders ON customers.id = orders.customer_id;
Notice that order 103 (belonging to customer_id 3) is absent from the result because there is no customer with id 3 in the customers table. INNER JOIN quietly drops unmatched rows — which is sometimes exactly what you want, but can be a trap if you are not expecting it.

5. LEFT JOIN

A LEFT JOIN returns all rows from the left table, and fills in the matching data from the right table where it exists. When there is no match, the right-table columns return NULL instead of disappearing entirely.
sql
QUBITS OF DPK
1SELECT customers.name, orders.order_id
2FROM customers
3LEFT JOIN orders ON customers.id = orders.customer_id;
Bob appears in the result even though he has no orders — his order_id is simply NULL. This is the join you reach for whenever you need a complete list from one side, regardless of whether matching rows exist on the other side. A classic use case is finding customers who have never placed an order: just filter WHERE orders.order_id IS NULL after the LEFT JOIN.

6. RIGHT JOIN

A RIGHT JOIN is the mirror image of a LEFT JOIN — it returns all rows from the right table, and fills in matching data from the left table where available.
sql
QUBITS OF DPK
1SELECT customers.name, orders.order_id
2FROM customers
3RIGHT JOIN orders ON customers.id = orders.customer_id;
Order 103 appears even though no matching customer exists. In practice, RIGHT JOIN is rarely used because you can always rewrite it as a LEFT JOIN by swapping the table order — and most engineers find LEFT JOIN more intuitive to read.

7. SELF JOIN

A Self Join is when a table joins with itself. This sounds strange at first, but it is the natural solution when rows within the same table have a relationship with each other. The classic example is an employee-manager hierarchy, where both the employee and their manager live in the same employees table.
sql
QUBITS OF DPK
1SELECT
2    e.name AS Employee,
3    m.name AS Manager
4FROM employees e
5JOIN employees m ON e.manager_id = m.employee_id;
The trick is using aliases (e and m) to treat the same table as if it were two separate tables. Without aliases, the database would have no way to distinguish which copy of the table you are referring to in each part of the query.

8. CROSS JOIN

A Cross Join returns every possible combination of rows from two tables — also called a Cartesian Product. If Table A has 2 rows and Table B has 2 rows, the result has 4 rows.
sql
QUBITS OF DPK
1SELECT *
2FROM colors
3CROSS JOIN sizes;
Cross joins have legitimate uses — generating all product variant combinations (size × color), or building test datasets — but they are dangerous when used accidentally, since a cross join between two large tables can produce billions of rows and crash a query.

9. Joining Multiple Tables

Real production queries often join three, four, or more tables in a single query. The principle is the same — just chain additional JOIN clauses, each with its own ON condition.
sql
QUBITS OF DPK
1-- Netflix: show what each user watched and when
2SELECT
3    users.name,
4    movies.title,
5    watch_history.watch_date
6FROM users
7JOIN watch_history ON users.id = watch_history.user_id
8JOIN movies ON watch_history.movie_id = movies.id;
The key is following the chain of relationships: users connects to watch_history via user_id, and watch_history connects to movies via movie_id. Always map out the relationship chain before writing the query.

10. How to Approach Any Join Problem

When you face a join problem in an interview or in production, work through these four steps before touching the keyboard.
Step 1 — Identify the tables you need data from (e.g. customers, orders).
Step 2 — Identify the relationship between them (e.g. orders.customer_id → customers.id).
Step 3 — Write the join condition in the ON clause.
Step 4 — Select only the columns you actually need in SELECT.
This methodical approach prevents the most common mistake in join queries — joining on the wrong column.

11. Common Mistakes to Avoid

Accidentally producing a cross join by listing tables in FROM without a join condition:
sql
QUBITS OF DPK
1-- Wrong: produces a Cartesian Product
2SELECT * FROM users, orders;
3
4-- Correct: explicit join with condition
5SELECT * FROM users JOIN orders ON users.id = orders.user_id;
Joining on the wrong column — a mistake that produces results that look correct but are silently wrong:
sql
QUBITS OF DPK
1-- Wrong: joins on the PK of both tables, not the FK relationship
2ON users.id = orders.id
3
4-- Correct
5ON users.id = orders.user_id

12. Interview Questions

What is the difference between INNER JOIN and LEFT JOIN? INNER JOIN returns only rows that have a match in both tables — unmatched rows are excluded entirely. LEFT JOIN returns all rows from the left table regardless, filling unmatched right-table columns with NULL. Use LEFT JOIN when you need a complete list from one side and optional data from the other.
What is a Self Join? Joining a table with itself using aliases. Used when rows in the same table relate to each other — the most common example being employee-manager relationships stored in a single employees table.
What is a Cartesian Product? The result of combining every row from one table with every row from another, producing M × N rows total. It occurs explicitly with CROSS JOIN and accidentally when you forget to write a join condition.

13. Quick Revision

sql
QUBITS OF DPK
1INNER JOIN  → matching rows from both tables only
2LEFT JOINall left rows + matching right rows (NULL if no match)
3RIGHT JOINall right rows + matching left rows (NULL if no match)
4SELF JOINtable joined with itself using aliases
5CROSS JOIN  → every combination of rows (Cartesian Product)
6
7ON          → defines the join relationship
8WHERE       → filters the result after joining