Domain Understanding
An e-commerce platform like Amazon or Flipkart connects buyers to products and handles the complete purchase lifecycle: browsing a catalogue, placing an order, payment processing, and multi-item fulfilment with individual shipment tracking. The schema is complex because a single order can contain items from different sellers, each item can be shipped separately, and products can have variants like size and color that affect price and inventory independently.
The three hardest design challenges are variant modelling (how do you represent a shirt that comes in S/M/L and Red/Blue without creating a separate table per product?), inventory management at the variant level (each size/color combination has its own stock count), and the price snapshot problem (an order placed last year must show last year's prices, not today's).
The read-to-write ratio is extremely read-heavy for the catalogue. Writes are concentrated during checkout and shipment tracking updates.
Entity Map
Full Schema with Comments
1-- Category tree: categories can be nested arbitrarily deep
2-- Electronics > Phones > Android > Budget Android
3CREATE TABLE categories (
4 id BIGINT AUTO_INCREMENT PRIMARY KEY,
5 name VARCHAR(255) NOT NULL,
6 parent_id BIGINT NULL,
7 -- SELF-REFERENTIAL: NULL means this is a root category (top-level)
8 -- Same pattern as manager_id in Employee Management System
9 slug VARCHAR(255) UNIQUE,
10 -- URL-friendly identifier: 'budget-android-phones'
11 FOREIGN KEY (parent_id) REFERENCES categories(id)
12);
13
14CREATE TABLE sellers (
15 id BIGINT AUTO_INCREMENT PRIMARY KEY,
16 name VARCHAR(255) NOT NULL,
17 email VARCHAR(255) NOT NULL UNIQUE,
18 gstin CHAR(15) UNIQUE,
19 -- GST Identification Number: mandatory for Indian sellers
20 rating DECIMAL(3,2) DEFAULT 0.00,
21 -- Denormalized average: updated by background job
22 is_verified BOOLEAN NOT NULL DEFAULT FALSE,
23 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
24 deleted_at DATETIME NULL
25);
26
27CREATE TABLE customers (
28 id BIGINT AUTO_INCREMENT PRIMARY KEY,
29 name VARCHAR(255) NOT NULL,
30 email VARCHAR(255) NOT NULL UNIQUE,
31 phone CHAR(10) UNIQUE,
32 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
33 deleted_at DATETIME NULL
34);
35
36CREATE TABLE addresses (
37 id BIGINT AUTO_INCREMENT PRIMARY KEY,
38 customer_id BIGINT NOT NULL,
39 full_name VARCHAR(255) NOT NULL,
40 -- Recipient name may differ from account holder
41 address_line VARCHAR(500) NOT NULL,
42 city VARCHAR(100) NOT NULL,
43 state VARCHAR(100) NOT NULL,
44 pincode CHAR(6) NOT NULL,
45 is_default BOOLEAN NOT NULL DEFAULT FALSE,
46 deleted_at DATETIME NULL,
47 FOREIGN KEY (customer_id) REFERENCES customers(id)
48);
49
50-- A product is the base definition: "Apple iPhone 15"
51-- NOT a specific sellable item. That is a variant.
52CREATE TABLE products (
53 id BIGINT AUTO_INCREMENT PRIMARY KEY,
54 seller_id BIGINT NOT NULL,
55 name VARCHAR(500) NOT NULL,
56 description TEXT,
57 brand VARCHAR(255),
58 base_price DECIMAL(12,2) NOT NULL,
59 -- The starting reference price. Actual selling price is on the variant.
60 is_active BOOLEAN NOT NULL DEFAULT TRUE,
61 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
62 updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
63 deleted_at DATETIME NULL,
64 FOREIGN KEY (seller_id) REFERENCES sellers(id)
65);
66
67-- M:M: a product can belong to multiple categories
68-- An iPhone is in both 'Phones' and 'Apple Products'
69CREATE TABLE product_categories (
70 product_id BIGINT NOT NULL,
71 category_id BIGINT NOT NULL,
72 PRIMARY KEY (product_id, category_id),
73 FOREIGN KEY (product_id) REFERENCES products(id),
74 FOREIGN KEY (category_id) REFERENCES categories(id)
75);
76
77-- Flexible product specifications without altering the schema for each category
78-- An iPhone has RAM, Storage, Display Size. A T-shirt has Color, Size, Material.
79-- Each combination of attribute_name + value is one row.
80CREATE TABLE product_attributes (
81 id BIGINT AUTO_INCREMENT PRIMARY KEY,
82 product_id BIGINT NOT NULL,
83 attribute_name VARCHAR(100) NOT NULL,
84 -- 'Color', 'RAM', 'Storage', 'Material'
85 attribute_value VARCHAR(255) NOT NULL,
86 -- 'Midnight Black', '8GB', '256GB', 'Cotton'
87 FOREIGN KEY (product_id) REFERENCES products(id)
88);
89
90-- A variant is the specific sellable unit: iPhone 15 + 256GB + Midnight Black
91-- This is what the customer actually adds to cart and buys
92CREATE TABLE product_variants (
93 id BIGINT AUTO_INCREMENT PRIMARY KEY,
94 product_id BIGINT NOT NULL,
95 sku VARCHAR(100) NOT NULL UNIQUE,
96 -- SKU: Stock Keeping Unit. Unique identifier for this specific variant.
97 -- Used in warehouse, shipping, and invoicing systems.
98 variant_name VARCHAR(255) NOT NULL,
99 -- Human-readable: '256GB / Midnight Black'
100 selling_price DECIMAL(12,2) NOT NULL,
101 -- The actual price customers pay. May differ from base_price due to discounts.
102 is_active BOOLEAN NOT NULL DEFAULT TRUE,
103 FOREIGN KEY (product_id) REFERENCES products(id)
104);
105
106-- Stock count is per variant, not per product
107-- iPhone 15 in 256GB/Black might have 50 units while 512GB/Gold has 3 units
108CREATE TABLE inventory (
109 variant_id BIGINT PRIMARY KEY,
110 -- One inventory row per variant. PK = FK.
111 quantity_in_stock INT NOT NULL DEFAULT 0,
112 reorder_level INT NOT NULL DEFAULT 10,
113 -- Alert when stock drops below this threshold
114 updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
115 FOREIGN KEY (variant_id) REFERENCES product_variants(id)
116);
117
118CREATE TABLE orders (
119 id BIGINT AUTO_INCREMENT PRIMARY KEY,
120 customer_id BIGINT NOT NULL,
121 shipping_address_id BIGINT NOT NULL,
122 order_status ENUM('PENDING','CONFIRMED','PROCESSING','PARTIALLY_SHIPPED','SHIPPED','DELIVERED','CANCELLED','RETURNED') NOT NULL DEFAULT 'PENDING',
123 subtotal DECIMAL(12,2) NOT NULL,
124 shipping_charge DECIMAL(8,2) NOT NULL DEFAULT 0.00,
125 discount_amount DECIMAL(8,2) NOT NULL DEFAULT 0.00,
126 total_amount DECIMAL(12,2) NOT NULL,
127 placed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
128 updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
129 FOREIGN KEY (customer_id) REFERENCES customers(id),
130 FOREIGN KEY (shipping_address_id) REFERENCES addresses(id)
131);
132
133CREATE TABLE order_items (
134 id BIGINT AUTO_INCREMENT PRIMARY KEY,
135 order_id BIGINT NOT NULL,
136 variant_id BIGINT NOT NULL,
137 product_name VARCHAR(500) NOT NULL,
138 -- SNAPSHOT: product name at time of order
139 variant_name VARCHAR(255) NOT NULL,
140 -- SNAPSHOT: variant description at time of order
141 unit_price DECIMAL(12,2) NOT NULL,
142 -- SNAPSHOT: selling price at time of order. Prices change; receipts must not.
143 quantity INT NOT NULL DEFAULT 1,
144 item_status ENUM('PENDING','CONFIRMED','SHIPPED','DELIVERED','CANCELLED','RETURNED') NOT NULL DEFAULT 'PENDING',
145 -- Item-level status: different items in an order can have different statuses
146 FOREIGN KEY (order_id) REFERENCES orders(id),
147 FOREIGN KEY (variant_id) REFERENCES product_variants(id)
148);
149
150-- Each order_item can generate an independent shipment
151-- Items from different sellers ship separately
152CREATE TABLE shipments (
153 id BIGINT AUTO_INCREMENT PRIMARY KEY,
154 order_item_id BIGINT NOT NULL,
155 tracking_number VARCHAR(100) UNIQUE,
156 -- Provided by the logistics carrier (Delhivery, BlueDart, etc.)
157 carrier VARCHAR(100),
158 status ENUM('AWAITING_PICKUP','PICKED_UP','IN_TRANSIT','OUT_FOR_DELIVERY','DELIVERED','RETURNED') NOT NULL DEFAULT 'AWAITING_PICKUP',
159 shipped_at DATETIME NULL,
160 expected_delivery DATETIME NULL,
161 delivered_at DATETIME NULL,
162 FOREIGN KEY (order_item_id) REFERENCES order_items(id)
163);
164
165CREATE TABLE payments (
166 id BIGINT AUTO_INCREMENT PRIMARY KEY,
167 order_id BIGINT NOT NULL UNIQUE,
168 amount DECIMAL(12,2) NOT NULL,
169 method ENUM('UPI','CARD','NETBANKING','WALLET','COD','EMI') NOT NULL,
170 status ENUM('PENDING','SUCCESS','FAILED','REFUNDED') NOT NULL DEFAULT 'PENDING',
171 gateway_txn_id VARCHAR(100) UNIQUE NULL,
172 paid_at DATETIME NULL,
173 FOREIGN KEY (order_id) REFERENCES orders(id)
174);
175
176-- Only verified purchasers can review a product
177-- Verified = they have a DELIVERED order_item for this variant
178CREATE TABLE reviews (
179 id BIGINT AUTO_INCREMENT PRIMARY KEY,
180 product_id BIGINT NOT NULL,
181 customer_id BIGINT NOT NULL,
182 order_item_id BIGINT NOT NULL,
183 -- FK to order_item: proves this is a verified purchase
184 rating TINYINT NOT NULL CHECK (rating BETWEEN 1 AND 5),
185 title VARCHAR(255),
186 body TEXT,
187 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
188 UNIQUE (product_id, customer_id),
189 -- One review per customer per product (across all variants)
190 FOREIGN KEY (product_id) REFERENCES products(id),
191 FOREIGN KEY (customer_id) REFERENCES customers(id),
192 FOREIGN KEY (order_item_id) REFERENCES order_items(id)
193);
194
195-- Indexes for the hottest query paths
196CREATE INDEX idx_products_seller ON products(seller_id);
197CREATE INDEX idx_variants_product ON product_variants(product_id);
198CREATE INDEX idx_product_attributes_product ON product_attributes(product_id);
199CREATE INDEX idx_orders_customer ON orders(customer_id);
200CREATE INDEX idx_order_items_order ON order_items(order_id);
201CREATE INDEX idx_shipments_order_item ON shipments(order_item_id);
202CREATE INDEX idx_categories_parent ON categories(parent_id);
203-- Self-referential FK: needed to traverse the category tree
Key Design Decisions
The Variant Modelling Problem — Deep Explanation
This is the most complex part of any product catalogue schema and the question most likely to come up in interviews. Consider a T-shirt available in 3 sizes (S, M, L) and 3 colours (Red, Blue, Green). That is 9 distinct sellable units, each with its own price and stock level.
The naive approach is to create 9 separate product rows. This collapses all 9 items into one product page on the website, makes search nonsensical, and duplicates all shared product data (description, brand, category) 9 times.
The correct approach is a parent products row (the concept) and 9 product_variants rows (the sellable units). Shared data lives on products. Per-combination data (price, SKU, stock) lives on product_variants. The product detail page queries one products row and all its variants. The customer selects a combination and adds a product_variant to the cart.
Indexing Strategy
Common Traps to Avoid
- No product variants — Modelling every size/color combination as a separate product row duplicates all shared data and breaks the concept of a single product page.
- Inventory at product level — Inventory must be at variant level. A shirt with 100 units in stock across 3 sizes has a different meaning than 100 units of size Medium specifically.
- No price snapshot on order_items — Price changes corrupt historical receipts. Always snapshot at order time.
- Flat category table without self-referential FK — Category trees are hierarchical. A single category column on products cannot represent Electronics > Phones > Android.
- EAV for core structured attributes — EAV is appropriate for flexible, variable-schema attributes. Do not use EAV for attributes that are queried or filtered frequently (like price or stock) — those belong as proper columns.
Interview Q&A
Q: How do you model products that come in different sizes and colors?
With a two-level structure: products for the shared concept and product_variants for each specific sellable combination. Each variant has its own SKU, price, and inventory row. Shared data (name, description, brand) lives on the product. Per-combination data lives on the variant.
Q: How do you manage inventory for products with variants?
Inventory is tracked at the variant level in a separate inventory table with variant_id as both PK and FK. This enforces exactly one inventory record per variant. Stock decrements happen on the specific variant row when an order is confirmed.
Q: Why does a review link to order_item_id?
To enforce verified-purchase reviews at the database level. The FK to order_items proves the reviewer actually bought and received this product. Without this constraint, any customer could review any product regardless of purchase history.
Q: How does your schema handle an order where items from different sellers ship separately?
Each order_item has its own item_status. Each order_item can generate an independent shipments row with its own tracking number and carrier. This allows Item A (from Seller X) to be DELIVERED while Item B (from Seller Y) is still IN_TRANSIT within the same order.
Q: Why use EAV for product_attributes instead of columns?
Product attributes vary completely by category. A phone has RAM, Storage, and Display Size. A T-shirt has Size, Colour, and Material. A rigid column schema would require hundreds of nullable columns and an ALTER TABLE for every new attribute type a new product category needs. EAV stores each attribute as a row, making the schema extensible without DDL changes.
ANKI Cards
Q: In an e-commerce schema, what is the difference between products and product_variants?
A: products represents the conceptual item (iPhone 15) with shared attributes. product_variants represents a specific sellable unit (iPhone 15 + 256GB + Black) with its own SKU, price, and inventory. Customers buy variants; product pages display the product with all its variants.
Q: Why is inventory tracked at the variant level, not the product level?
A: A product with variants (S/M/L in Red/Blue) has independent stock for each combination. Size Medium in Red might be sold out while Size Large in Blue has 50 units. Product-level inventory collapses these distinct stock counts into a meaningless aggregate.
Q: What is the EAV pattern and when is it appropriate?
A: Entity-Attribute-Value stores variable-schema attributes as rows (entity_id, attribute_name, attribute_value) instead of columns. Appropriate for flexible, category-specific product specifications where the attribute set varies too widely for a fixed schema. Not appropriate for attributes that are frequently filtered or sorted, as EAV rows are hard to index for arbitrary queries.