SQL 8 : Indexing
D
Qubits of DPK
April 11, 2026
Core DBMS
1. Recap of SQL 7
Previously we learned subqueries, correlated subqueries, EXISTS / NOT EXISTS, IN / ANY / ALL, and Views. Now we move to something completely different. Instead of writing queries, we learn how to make queries faster. That concept is called Indexing.
2. Why Indexing Exists
Imagine a table with 10 rows — finding a record is easy. But what about 50 million rows?
sql
QUBITS OF DPK
Without indexing, the database must check every single row from 1 to 50,000,000. This is called a Full Table Scan, and it is very slow.
3. Real World Analogy
Think about a book. If you want to find the word "Database", you could read every page from the beginning — or you could check the index page:
sql
QUBITS OF DPK
You immediately jump to the right page. Database indexes work exactly like this.
4. What is an Index?
An index is a data structure that helps the database locate rows faster. If we frequently search by email, we create an index on that column:
sql
QUBITS OF DPK
Now the database can locate email values without scanning the entire table.
5. Full Table Scan vs Index Lookup
Without an index, a query like this:
sql
QUBITS OF DPK
forces the database to scan every row — time complexity O(N).
With an index, the database looks into the index structure first and jumps directly to the matching row — time complexity O(log N). That is a massive performance improvement at scale.
6. Data Structures Used for Indexing
Most relational databases use B+ Trees as their index structure. Some databases also support hash indexes, but B+ Trees are by far the most common.
Why not use Hash Maps? Hash maps allow O(1) lookups, which sounds faster. However, they cannot support range queries, sorting, or partial scans. A query like:
sql
QUBITS OF DPK
cannot be served efficiently by a hash index. B+ Trees handle this naturally because values are stored in sorted order.
7. How B+ Trees Work
A B+ Tree organizes values in a sorted tree structure:
sql
QUBITS OF DPK
Searching for 50 takes only a few steps — go right at 40, go left at 60, found at 50. This is why B+ Trees scale well even to billions of rows.
8. Clustered vs Non-Clustered Index
A clustered index determines the physical order of rows on disk. Most databases automatically create a clustered index on the primary key, so rows are stored in primary key order. There can only be one clustered index per table.
A non-clustered index is a separate structure that stores pointers back to the actual rows:
sql
QUBITS OF DPK
You can have many non-clustered indexes on a single table.
9. Composite Index
When queries filter on multiple columns together, a composite index covers them in one structure:
sql
QUBITS OF DPK
The Left Prefix Rule — this is critical to understand. The index (customer_id, order_date) works for queries filtering on customer_id alone, or customer_id + order_date together. It does not work efficiently for order_date alone, because the left prefix is missing.
sql
QUBITS OF DPK
10. Multi-Column Index Example
sql
QUBITS OF DPK
This optimizes all of the following queries because they all begin from the leftmost column:
sql
QUBITS OF DPK
11. Downsides of Indexing
Indexes are powerful but come with real trade-offs that matter in production.
Extra storage — indexes consume significant disk space. A 100 GB table may have 40 GB of associated indexes on top of it.
Slower writes — every INSERT, UPDATE, or DELETE must also update all relevant indexes. The more indexes you have, the heavier each write operation becomes.
Maintenance overhead — too many indexes slow down writes noticeably and increase the complexity of database maintenance.
12. Indexing Strings
Indexing very long text columns is inefficient. For a column like product_description TEXT, prefix indexing is a better approach — index only the first N characters:
sql
QUBITS OF DPK
This indexes only the first 20 characters of the title, which is usually enough to narrow down results significantly.
13. When Indexes Are Not Useful
Not every column is a good index candidate. Consider this query:
sql
QUBITS OF DPK
If 95% of rows are CREDIT, the database still has to scan most of the table even with an index — so the index provides no real benefit.
The key concept here is selectivity — the ratio of unique values in a column. High selectivity columns like email or user_id are excellent candidates. Low selectivity columns like gender, status, or boolean flags generally are not.
14. How to Identify Good Index Candidates
Columns that frequently appear in WHERE, JOIN, ORDER BY, or GROUP BY clauses are strong candidates. For example:
sql
QUBITS OF DPK
sql
QUBITS OF DPK
15. Interview Questions
Why not index every column? Because indexes increase storage usage, write latency, and maintenance cost. Each write must update every index on the table.
When is indexing not useful? When the column has low selectivity — meaning most rows share the same value, so the database ends up scanning most of the table anyway.
What is the Left Prefix Rule? A composite index (A, B, C) can only be used by queries that filter starting from the leftmost column. A query filtering only on B or C will not benefit from this index.
16. Quick Revision
sql
QUBITS OF DPK