Domain Understanding
IRCTC is India's railway booking system — one of the most complex transactional schemas you can design. The core challenge is seat inventory management under high concurrency. Millions of users try to book the same limited seats simultaneously during Tatkal windows. The schema must handle this without double-booking, without phantom reads, and without losing a booking under race conditions.
The secondary challenge is that a train is not a simple A-to-B journey. A train travels through multiple stations in sequence. A passenger can board at any intermediate station and deboard at another. Two passengers can occupy the same seat on the same train for different route segments. This is the partial seat occupancy problem and it fundamentally shapes the schema.
Entity Map
Full Schema with Comments
1CREATE TABLE stations (
2 id BIGINT AUTO_INCREMENT PRIMARY KEY,
3 code CHAR(5) NOT NULL UNIQUE,
4 -- Indian station codes are 2-5 characters: NDLS, CSTM, MAS, BCT
5 -- CHAR(5) not VARCHAR: fixed-length codes perform better for lookups and joins
6 name VARCHAR(255) NOT NULL,
7 city VARCHAR(100) NOT NULL,
8 state VARCHAR(100) NOT NULL,
9 zone VARCHAR(10)
10 -- Indian Railways zone: CR, NR, SR, WR, etc.
11);
12
13CREATE TABLE trains (
14 id BIGINT AUTO_INCREMENT PRIMARY KEY,
15 number CHAR(5) NOT NULL UNIQUE,
16 -- Train number is always 5 digits in India: '12301' (Rajdhani)
17 name VARCHAR(255) NOT NULL,
18 train_type ENUM('RAJDHANI','SHATABDI','EXPRESS','PASSENGER','DURONTO') NOT NULL,
19 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
20);
21
22-- The ordered list of stations a train passes through
23-- stop_sequence defines the order: 1 = source, N = destination
24CREATE TABLE train_stations (
25 id BIGINT AUTO_INCREMENT PRIMARY KEY,
26 train_id BIGINT NOT NULL,
27 station_id BIGINT NOT NULL,
28 stop_sequence INT NOT NULL,
29 -- 1 = first stop (origin), 2 = second stop, etc.
30 -- This is how we know which direction the train is going
31 arrival_time TIME NULL,
32 -- NULL for the first station (no arrival, only departure)
33 departure_time TIME NULL,
34 -- NULL for the last station (no departure, only arrival)
35 distance_from_origin_km INT NOT NULL DEFAULT 0,
36 -- Used to calculate fare for partial journeys
37
38 UNIQUE (train_id, stop_sequence),
39 -- A train can only have one stop at each sequence position
40 UNIQUE (train_id, station_id),
41 -- A train passes through each station exactly once
42 FOREIGN KEY (train_id) REFERENCES trains(id),
43 FOREIGN KEY (station_id) REFERENCES stations(id)
44);
45
46-- A coach belongs to a specific train: S1, S2, A1, B1
47CREATE TABLE coaches (
48 id BIGINT AUTO_INCREMENT PRIMARY KEY,
49 train_id BIGINT NOT NULL,
50 coach_name VARCHAR(10) NOT NULL,
51 -- 'S1', 'S2', 'A1', 'B2', 'GEN'
52 coach_class ENUM('SL','3A','2A','1A','CC','GEN') NOT NULL,
53 -- SL = Sleeper, 3A = 3 Tier AC, 2A = 2 Tier AC, 1A = First AC
54 -- CC = Chair Car, GEN = General
55 total_seats INT NOT NULL,
56 UNIQUE (train_id, coach_name),
57 FOREIGN KEY (train_id) REFERENCES trains(id)
58);
59
60-- Individual seats within a coach
61CREATE TABLE seats (
62 id BIGINT AUTO_INCREMENT PRIMARY KEY,
63 coach_id BIGINT NOT NULL,
64 seat_number VARCHAR(10) NOT NULL,
65 -- '1', '2', '63' etc.
66 berth_type ENUM('LOWER','MIDDLE','UPPER','SIDE_LOWER','SIDE_UPPER','WINDOW','AISLE') NOT NULL,
67 UNIQUE (coach_id, seat_number),
68 FOREIGN KEY (coach_id) REFERENCES coaches(id)
69);
70
71-- One schedule row = one physical run of the train on a specific date
72-- This is the entity that holds real-time seat availability
73CREATE TABLE train_schedules (
74 id BIGINT AUTO_INCREMENT PRIMARY KEY,
75 train_id BIGINT NOT NULL,
76 journey_date DATE NOT NULL,
77 -- The date on which this train departs from its origin station
78 status ENUM('ON_TIME','DELAYED','CANCELLED') NOT NULL DEFAULT 'ON_TIME',
79 UNIQUE (train_id, journey_date),
80 -- A train runs at most once per day
81 FOREIGN KEY (train_id) REFERENCES trains(id)
82);
83
84-- Users of the booking platform
85CREATE TABLE users (
86 id BIGINT AUTO_INCREMENT PRIMARY KEY,
87 name VARCHAR(255) NOT NULL,
88 email VARCHAR(255) NOT NULL UNIQUE,
89 phone CHAR(10) NOT NULL UNIQUE,
90 -- Indian mobile numbers are exactly 10 digits
91 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
92 deleted_at DATETIME NULL
93);
94
95-- One booking = one transaction for one user on one train on one date
96-- A booking can cover multiple passengers (family booking)
97CREATE TABLE bookings (
98 id BIGINT AUTO_INCREMENT PRIMARY KEY,
99 user_id BIGINT NOT NULL,
100 schedule_id BIGINT NOT NULL,
101 -- Which run of the train this booking is on
102 boarding_station_id BIGINT NOT NULL,
103 deboarding_station_id BIGINT NOT NULL,
104 -- The actual segment the passengers are travelling
105 -- Both are FKs to train_stations (not stations directly)
106 -- This is how partial journey bookings work
107 coach_class ENUM('SL','3A','2A','1A','CC','GEN') NOT NULL,
108 booking_status ENUM('CONFIRMED','WAITLISTED','RAC','CANCELLED') NOT NULL DEFAULT 'WAITLISTED',
109 -- WAITLISTED: no seat yet, waiting for cancellations
110 -- RAC: Reservation Against Cancellation, shares a berth
111 pnr CHAR(10) NOT NULL UNIQUE,
112 -- PNR: 10-digit unique booking reference, like a surrogate key visible to users
113 total_fare DECIMAL(10,2) NOT NULL,
114 booked_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
115 updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
116 FOREIGN KEY (user_id) REFERENCES users(id),
117 FOREIGN KEY (schedule_id) REFERENCES train_schedules(id),
118 FOREIGN KEY (boarding_station_id) REFERENCES train_stations(id),
119 FOREIGN KEY (deboarding_station_id) REFERENCES train_stations(id)
120);
121
122-- One row per passenger on the booking
123-- A family of 4 = 1 booking row + 4 booking_passengers rows
124CREATE TABLE booking_passengers (
125 id BIGINT AUTO_INCREMENT PRIMARY KEY,
126 booking_id BIGINT NOT NULL,
127 name VARCHAR(255) NOT NULL,
128 age TINYINT NOT NULL,
129 gender ENUM('MALE','FEMALE','OTHER') NOT NULL,
130 seat_id BIGINT NULL,
131 -- NULL when status is WAITLISTED (no seat assigned yet)
132 -- Set to a specific seat_id when CONFIRMED
133 berth_preference ENUM('LOWER','MIDDLE','UPPER','SIDE_LOWER','SIDE_UPPER','NO_PREFERENCE') NOT NULL DEFAULT 'NO_PREFERENCE',
134 FOREIGN KEY (booking_id) REFERENCES bookings(id),
135 FOREIGN KEY (seat_id) REFERENCES seats(id)
136);
137
138-- Payment record for each booking
139CREATE TABLE payments (
140 id BIGINT AUTO_INCREMENT PRIMARY KEY,
141 booking_id BIGINT NOT NULL UNIQUE,
142 -- UNIQUE: one payment record per booking
143 amount DECIMAL(10,2) NOT NULL,
144 currency CHAR(3) NOT NULL DEFAULT 'INR',
145 payment_method ENUM('UPI','CARD','NETBANKING','WALLET') NOT NULL,
146 payment_status ENUM('PENDING','SUCCESS','FAILED','REFUNDED') NOT NULL DEFAULT 'PENDING',
147 gateway_txn_id VARCHAR(100) UNIQUE,
148 -- Transaction ID from the payment gateway (Razorpay, PayU etc.)
149 -- UNIQUE: no two bookings can map to the same gateway transaction
150 paid_at DATETIME NULL,
151 -- NULL until payment is confirmed
152 FOREIGN KEY (booking_id) REFERENCES bookings(id)
153);
154
155-- Indexes for the most critical queries
156CREATE INDEX idx_bookings_user ON bookings(user_id);
157CREATE INDEX idx_bookings_schedule ON bookings(schedule_id);
158CREATE INDEX idx_bookings_pnr ON bookings(pnr);
159-- PNR lookup is the most frequent user query — must be instant
160CREATE INDEX idx_booking_passengers_booking ON booking_passengers(booking_id);
161CREATE INDEX idx_train_stations_train ON train_stations(train_id);
162CREATE INDEX idx_schedules_train_date ON train_schedules(train_id, journey_date);
Key Design Decisions
The Seat Availability Problem — Deep Explanation
This is the hardest part of the IRCTC schema. Unlike a cinema seat (one seat = one booking), a train seat can be used by different passengers on different segments of the same journey on the same day.
For example, seat S1/23 on train 12301 running on April 15:
- Passenger A travels from Delhi (stop 1) to Kanpur (stop 3) — confirmed on seat S1/23
- Passenger B travels from Kanpur (stop 3) to Kolkata (stop 8) — can also be confirmed on S1/23
To check if seat S1/23 is available for a new booking from stop X to stop Y, you query:
In production IRCTC uses a separate seat_availability denormalized table updated by a background process, because running this query for thousands of seats during a Tatkal rush would be catastrophically slow.
Concurrency and Booking Integrity
At 10am during Tatkal opening, 500,000 users simultaneously try to book the last 10 seats. Without concurrency control, multiple users will read "1 seat available", all attempt to book it, and you get 50,000 confirmed bookings for 1 seat.
The production solution uses a combination of optimistic locking (check a version number before confirming) and SELECT FOR UPDATE (pessimistic lock on the specific seat row during the transaction window). The schema supports this through the booking_status state machine — a booking starts as WAITLISTED, and a separate confirmation step moves it to CONFIRMED only if the seat is still available at that moment, inside a transaction.
Indexing Strategy
Common Traps to Avoid
- Storing seats as a count column on train_schedules — available_seats INT is tempting but wrong. It creates a race condition: two concurrent transactions both read 1, both decrement to 0, both commit. You now have -1 available seats. Model individual seats instead.
- Linking boarding_station FK directly to stations — This loses the train context. Boarding station must be a stop on the specific train. Link to train_stations which carries both the station and the train.
- One passenger per booking — Real bookings cover families. Model bookings (transaction) separately from booking_passengers (travellers).
- No PNR separate from internal ID — Exposing internal BIGINT IDs to users is a security and business risk. PNR is the external contract.
- Using FLOAT for fare — Always DECIMAL for money.
Interview Q&A
Q: How do you prevent double-booking of a seat?
Two-step approach. First, individual seats are modeled as rows in the seats table, so availability is checked at the row level. Second, the confirmation step uses SELECT FOR UPDATE inside a transaction to lock the specific seat row, preventing any concurrent transaction from reading it as available. The UNIQUE constraint on (seat_id, schedule_segment_overlap) at the application level serves as a last-resort guard.
Q: How do you check if a seat is available for a specific journey segment?
By querying booking_passengers joined to bookings for the schedule, and checking if any existing confirmed booking has an overlapping segment. Two bookings overlap if one boards before the other deboard and deboard after the other boards. If no overlapping booking exists for that seat, it is available.
Q: Why is train_schedules a separate table?
Because a train has a static definition (number, name, route) and a dynamic runtime state (is it running today? is it delayed?). Separating them allows cancelling a specific day's run without touching the train definition. It also allows generating schedules for future dates independently.
Q: Why does a booking link to train_stations for boarding/deboarding instead of stations?
Because boarding at Mumbai Central only makes sense in the context of a specific train route. train_stations represents Mumbai Central as a stop on train 12301, which carries the train context, the stop sequence, and the distance from origin needed for fare calculation. stations alone just knows Mumbai Central exists.
Q: How would you handle the Tatkal booking rush (high concurrency)?
In production: a Redis-based seat locking layer holds a distributed lock on each seat for 5-10 minutes during the booking flow. The DB transaction only runs at the final confirmation step. This prevents 500,000 simultaneous DB transactions from overwhelming the database and ensures only one user can confirm a given seat at a time.
ANKI Cards
Q: In IRCTC schema, why does a booking link to train_stations for boarding/deboarding instead of directly to stations?
A: Because the boarding point only makes sense relative to a specific train route. train_stations carries the train context + stop sequence + distance from origin. stations alone only knows the station exists, not where it sits in any route.
Q: What is the partial seat occupancy problem in railway booking?
A: The same physical seat can be legitimately occupied by different passengers for different journey segments on the same day. Seat availability must be checked as a segment overlap query, not a simple "is this seat taken" boolean.
Q: How do you prevent race conditions in high-concurrency seat booking?
A: Use SELECT FOR UPDATE to acquire a pessimistic lock on the seat row inside a transaction. Combined with a Redis distributed lock upstream to prevent 500K simultaneous DB transactions, only one user can confirm a seat at a time.