Domain Understanding
An Employee Management System tracks the organizational structure of a company — who works where, who reports to whom, what departments exist, what roles people hold, their compensation, leave, and attendance. The critical insight here is that the org chart is a tree structure, and trees in relational databases require a self-referential relationship. This is the most important design decision in this schema and the one most candidates get wrong.
The read/write ratio is moderate. HR queries the org chart frequently. Payroll runs in bulk batches. Attendance and leave are write-heavy during business hours.
Entity Map
Full Schema with Comments
1CREATE TABLE departments (
2 id BIGINT AUTO_INCREMENT PRIMARY KEY,
3 name VARCHAR(100) NOT NULL UNIQUE,
4 -- UNIQUE: no two departments can have the same name
5 manager_id BIGINT NULL,
6 -- FK to employees — set AFTER employees table is created
7 -- NULL initially: a new department may not have a manager assigned yet
8 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
9 updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
10);
11
12CREATE TABLE employees (
13 id BIGINT AUTO_INCREMENT PRIMARY KEY,
14 full_name VARCHAR(255) NOT NULL,
15 email VARCHAR(255) NOT NULL UNIQUE,
16 -- UNIQUE: one email per employee, enforced at DB level
17 phone VARCHAR(20),
18 date_of_birth DATE,
19 joined_on DATE NOT NULL,
20 -- DATE not DATETIME: we care about the joining day, not the hour
21 department_id BIGINT NULL,
22 -- NULL: employee may not yet be assigned to a department (e.g. on onboarding)
23 manager_id BIGINT NULL,
24 -- SELF-REFERENTIAL FK: points back to employees.id
25 -- NULL means this employee has no manager (i.e. they are the CEO / root node)
26 -- This single column models the entire org chart as a tree
27 employment_status ENUM('ACTIVE','RESIGNED','TERMINATED','ON_LEAVE') NOT NULL DEFAULT 'ACTIVE',
28 created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
29 updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
30 deleted_at DATETIME NULL,
31 -- Soft delete: terminated employees must remain for payroll history and audits
32
33 FOREIGN KEY (department_id) REFERENCES departments(id),
34 FOREIGN KEY (manager_id) REFERENCES employees(id)
35 -- manager_id FK references the SAME TABLE — this is a self-referential relationship
36 -- It models: every employee optionally reports to another employee
37);
38
39-- Add manager_id FK to departments now that employees table exists
40ALTER TABLE departments
41 ADD CONSTRAINT fk_dept_manager
42 FOREIGN KEY (manager_id) REFERENCES employees(id);
43
44CREATE TABLE roles (
45 id BIGINT AUTO_INCREMENT PRIMARY KEY,
46 title VARCHAR(100) NOT NULL UNIQUE,
47 -- e.g. 'Software Engineer', 'Senior Engineer', 'Engineering Manager'
48 description TEXT
49);
50
51-- M:M junction: an employee can hold multiple roles over time
52-- This table is the role HISTORY — not just current role
53CREATE TABLE employee_roles (
54 id BIGINT AUTO_INCREMENT PRIMARY KEY,
55 -- Surrogate PK: same employee can hold same role multiple times (re-promotion etc.)
56 employee_id BIGINT NOT NULL,
57 role_id BIGINT NOT NULL,
58 assigned_on DATE NOT NULL,
59 -- When this role assignment took effect
60 relieved_on DATE NULL,
61 -- NULL means this is the CURRENT role. NOT NULL means historical.
62 -- To get current role: WHERE relieved_on IS NULL
63 FOREIGN KEY (employee_id) REFERENCES employees(id),
64 FOREIGN KEY (role_id) REFERENCES roles(id)
65);
66
67-- Salary history: never update — always INSERT a new row on every change
68-- This gives you a full audit trail of every compensation change
69CREATE TABLE salaries (
70 id BIGINT AUTO_INCREMENT PRIMARY KEY,
71 employee_id BIGINT NOT NULL,
72 amount DECIMAL(12, 2) NOT NULL,
73 -- DECIMAL not FLOAT: financial amounts must never use floating-point arithmetic
74 -- DECIMAL(12,2) supports up to 9,999,999,999.99
75 currency CHAR(3) NOT NULL DEFAULT 'INR',
76 -- ISO 4217 currency code: INR, USD, EUR
77 effective_from DATE NOT NULL,
78 effective_to DATE NULL,
79 -- NULL means this is the current salary
80 -- To get current salary: WHERE effective_to IS NULL
81 reason VARCHAR(255),
82 -- 'Annual appraisal', 'Promotion', 'Market correction'
83 FOREIGN KEY (employee_id) REFERENCES employees(id)
84);
85
86CREATE TABLE leaves (
87 id BIGINT AUTO_INCREMENT PRIMARY KEY,
88 employee_id BIGINT NOT NULL,
89 leave_type ENUM('SICK','CASUAL','EARNED','MATERNITY','PATERNITY','UNPAID') NOT NULL,
90 start_date DATE NOT NULL,
91 end_date DATE NOT NULL,
92 status ENUM('PENDING','APPROVED','REJECTED','CANCELLED') NOT NULL DEFAULT 'PENDING',
93 reason TEXT,
94 approved_by BIGINT NULL,
95 -- FK to employees: the manager who approved this leave
96 -- NULL if status is PENDING or REJECTED
97 applied_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
98 updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
99 FOREIGN KEY (employee_id) REFERENCES employees(id),
100 FOREIGN KEY (approved_by) REFERENCES employees(id)
101);
102
103CREATE TABLE attendance (
104 id BIGINT AUTO_INCREMENT PRIMARY KEY,
105 employee_id BIGINT NOT NULL,
106 date DATE NOT NULL,
107 check_in DATETIME,
108 -- DATETIME: we need the exact time of check-in, not just the day
109 check_out DATETIME NULL,
110 -- NULL: employee has checked in but not yet checked out
111 source ENUM('BIOMETRIC','MANUAL','REMOTE') NOT NULL DEFAULT 'BIOMETRIC',
112 UNIQUE (employee_id, date),
113 -- One attendance record per employee per day — DB-enforced
114 FOREIGN KEY (employee_id) REFERENCES employees(id)
115);
116
117-- Indexes for the most common queries
118CREATE INDEX idx_employees_department ON employees(department_id);
119-- "All employees in department X" — used on every org chart query
120
121CREATE INDEX idx_employees_manager ON employees(manager_id);
122-- "All direct reports of manager X" — critical for org tree traversal
123
124CREATE INDEX idx_employee_roles_employee ON employee_roles(employee_id);
125CREATE INDEX idx_salaries_employee ON salaries(employee_id);
126CREATE INDEX idx_leaves_employee_status ON leaves(employee_id, status);
127-- Composite: "All pending leaves for employee X" is the most common query
128
129CREATE INDEX idx_attendance_employee_date ON attendance(employee_id, date);
130-- Composite: "Attendance for employee X on date Y" — already covered by UNIQUE constraint
131-- The UNIQUE constraint on (employee_id, date) implicitly creates this index in MySQL
Key Design Decisions
The Self-Referential Relationship — Deep Explanation
This is the concept most candidates miss entirely. When an interviewer asks you to design an org chart, the naive answer is to create a separate reporting_structure table with (employee_id, reports_to_id). This works but is unnecessary complexity. The cleaner answer is a self-referential FK.
To get all direct reports of Deepak (id=2):
To get the full hierarchy (recursive, using a CTE):
This recursive CTE walks the tree level by level — exactly how databases handle hierarchical data. Mention this in interviews and you will stand out immediately.
Indexing Strategy
Common Traps to Avoid
- Storing current role as a column on employees — Role changes over time. A single column can only hold the current role and loses history. Use employee_roles with a date range pattern instead.
- Using FLOAT for salary — Floating-point rounding errors corrupt financial data silently. Always use DECIMAL for money.
- Hard-deleting terminated employees — Payroll history, tax records, and audit trails depend on this data. Always soft delete.
- Forgetting the circular FK problem — departments and employees reference each other. Naive CREATE TABLE order will fail. Create both with NULL FKs first, then add constraints.
- No self-referential index on manager_id — Without idx_employees_manager, every "who reports to X" query full-scans the employees table.
- Storing org chart in a separate table unnecessarily — A self-referential manager_id column on employees is the standard, clean solution. A separate reporting_structure table adds complexity with no benefit for simple trees.
Interview Q&A
Q: How do you model the org chart in a relational database?
With a self-referential foreign key. The employees table has a manager_id column that references employees.id. The root node (CEO) has manager_id = NULL. This models a tree structure within a single table. For hierarchical queries, use a recursive CTE.
Q: How do you get all employees under a given manager recursively?
With a recursive CTE using WITH RECURSIVE. Start from the manager's row, then iteratively join employees whose manager_id matches any row already in the result set. Each iteration goes one level deeper in the tree.
Q: Why is DECIMAL used for salary instead of FLOAT?
FLOAT is a binary floating-point type. It cannot represent all decimal fractions exactly. 0.1 + 0.2 in FLOAT gives approximately 0.30000000000000004. DECIMAL stores exact decimal values. Any financial amount that will be used in arithmetic or compared for equality must use DECIMAL.
Q: How do you track salary history?
Salaries are stored as an append-only history table. Every change creates a new row with a new effective_from date. The current salary has effective_to = NULL. To query the salary on any historical date: WHERE effective_from <= target_date AND (effective_to IS NULL OR effective_to >= target_date).
Q: How do you handle the circular FK between departments and employees?
Create both tables with their mutual FK columns initially set as NULLable without constraints. Then use ALTER TABLE to add the FK constraints after both tables exist. This is a standard pattern for circular references.
Q: What does employment_status ENUM give you over a boolean is_active?
A boolean loses information. An employee could be inactive for different reasons — resigned voluntarily, terminated involuntarily, or currently on approved leave. The ENUM preserves the reason for the status, which matters for HR reporting, severance calculations, and rehire eligibility.
ANKI Cards
Q: What is a self-referential foreign key and when do you use it?
A: A FK where a column in a table references the primary key of the same table. Used to model tree or hierarchical structures within a single entity — org charts, category trees, comment threads with replies, file system directories.
Q: What is the effective_to IS NULL pattern?
A: A convention for storing temporal/historical data. Active records have effective_to = NULL. Closed/historical records have a real date in effective_to. Querying current state: WHERE effective_to IS NULL. Querying history on a specific date: WHERE effective_from <= :date AND (effective_to IS NULL OR effective_to >= :date).
Q: Why is DECIMAL used for financial amounts instead of FLOAT?
A: FLOAT is binary floating-point and cannot represent all decimal fractions exactly, causing silent rounding errors in arithmetic. DECIMAL stores exact decimal values. Any column used in financial calculations must use DECIMAL.