Cosmic Module

D

Qubits of DPK

April 11, 2026

Core DBMS

Domain Understanding

An Email Campaign system allows marketing teams to create campaigns, build subscriber lists, send bulk emails to segments of their audience, and track engagement — opens, clicks, unsubscribes, and bounces. The critical challenge here is that the same email can be sent to thousands of subscribers, and every interaction with every sent email must be tracked individually. This creates an enormous events table that dominates the schema design.
The second challenge is subscriber segmentation — a subscriber can belong to multiple lists, a campaign can target multiple lists, and a subscriber can have different statuses across different lists (subscribed to the Newsletter list but unsubscribed from the Promotions list). These nuances fundamentally shape the junction tables.
The system is write-heavy for events (every open, every click writes a row) and read-heavy for reporting (marketers query open rates, click rates, and campaign performance constantly).

Entity Map

javascript
QUBITS OF DPK
1organisations ──< users                  ← team members who manage campaigns
2organisations ──< subscribers            ← the audience
3organisations ──< lists                  ← segmented mailing lists
4lists ──< list_subscribers >── subscribers  ← M:M with status per list
5organisations ──< campaigns
6campaigns ──< campaign_lists >── lists     ← M:M: campaign targets multiple lists
7campaigns ──< sent_emails               ← one row per email sent to one subscriber
8sent_emails ──< email_events             ← open, click, bounce, unsubscribe events

Full Schema with Comments

sql
QUBITS OF DPK
1CREATE TABLE organisations (
2  id         BIGINT AUTO_INCREMENT PRIMARY KEY,
3  name       VARCHAR(255) NOT NULL,
4  domain     VARCHAR(255) UNIQUE,
5  -- The sending domain (e.g. zomato.com)
6  created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
7);
8
9CREATE TABLE users (
10  id              BIGINT AUTO_INCREMENT PRIMARY KEY,
11  organisation_id BIGINT NOT NULL,
12  email           VARCHAR(255) NOT NULL UNIQUE,
13  name            VARCHAR(255) NOT NULL,
14  role            ENUM('ADMIN','EDITOR','VIEWER') NOT NULL DEFAULT 'EDITOR',
15  created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
16  deleted_at      DATETIME NULL,
17  FOREIGN KEY (organisation_id) REFERENCES organisations(id)
18);
19
20-- Subscribers are the people who receive emails
21-- email is the natural unique identifier here
22CREATE TABLE subscribers (
23  id              BIGINT AUTO_INCREMENT PRIMARY KEY,
24  organisation_id BIGINT NOT NULL,
25  email           VARCHAR(255) NOT NULL,
26  first_name      VARCHAR(100),
27  last_name       VARCHAR(100),
28  global_status   ENUM('ACTIVE','UNSUBSCRIBED','BOUNCED','COMPLAINED') NOT NULL DEFAULT 'ACTIVE',
29  -- Global status = across ALL lists for this organisation
30  -- If BOUNCED or COMPLAINED, never send to this address again regardless of list status
31  custom_fields   JSON NULL,
32  -- Flexible key-value attributes per subscriber: {"city": "Mumbai", "tier": "premium"}
33  -- JSON for variable attributes avoids an EAV (Entity-Attribute-Value) anti-pattern
34  created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
35  updated_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
36  UNIQUE (organisation_id, email),
37  -- Same email can be a subscriber in different organisations, but unique within one
38  FOREIGN KEY (organisation_id) REFERENCES organisations(id)
39);
40
41-- A list is a named segment of subscribers (e.g. "Newsletter", "Promotions", "VIP")
42CREATE TABLE lists (
43  id              BIGINT AUTO_INCREMENT PRIMARY KEY,
44  organisation_id BIGINT NOT NULL,
45  name            VARCHAR(255) NOT NULL,
46  description     TEXT,
47  created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
48  FOREIGN KEY (organisation_id) REFERENCES organisations(id)
49);
50
51-- M:M junction between lists and subscribers
52-- Carries per-list subscription status — a subscriber can be active in one list
53-- but unsubscribed from another
54CREATE TABLE list_subscribers (
55  id            BIGINT AUTO_INCREMENT PRIMARY KEY,
56  list_id       BIGINT NOT NULL,
57  subscriber_id BIGINT NOT NULL,
58  status        ENUM('SUBSCRIBED','UNSUBSCRIBED','PENDING') NOT NULL DEFAULT 'SUBSCRIBED',
59  -- PENDING: double opt-in confirmation not yet completed
60  subscribed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
61  unsubscribed_at DATETIME NULL,
62  -- NULL while SUBSCRIBED. Set when status changes to UNSUBSCRIBED.
63  UNIQUE (list_id, subscriber_id),
64  -- A subscriber can only be in a list once
65  FOREIGN KEY (list_id)       REFERENCES lists(id),
66  FOREIGN KEY (subscriber_id) REFERENCES subscribers(id)
67);
68
69-- A campaign is a bulk email send to one or more lists
70CREATE TABLE campaigns (
71  id              BIGINT AUTO_INCREMENT PRIMARY KEY,
72  organisation_id BIGINT NOT NULL,
73  name            VARCHAR(255) NOT NULL,
74  subject         VARCHAR(500) NOT NULL,
75  -- Email subject line
76  from_name       VARCHAR(255) NOT NULL,
77  from_email      VARCHAR(255) NOT NULL,
78  -- Sending identity (can differ from organisation domain for sub-brands)
79  html_body       LONGTEXT NOT NULL,
80  -- Full HTML email body. LONGTEXT supports up to 4GB.
81  plain_body      TEXT,
82  -- Plain-text fallback for email clients that don't support HTML
83  status          ENUM('DRAFT','SCHEDULED','SENDING','SENT','PAUSED','CANCELLED') NOT NULL DEFAULT 'DRAFT',
84  scheduled_at    DATETIME NULL,
85  -- NULL for immediately sent campaigns
86  sent_at         DATETIME NULL,
87  -- Set when sending completes
88  created_by      BIGINT NOT NULL,
89  created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
90  updated_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
91  FOREIGN KEY (organisation_id) REFERENCES organisations(id),
92  FOREIGN KEY (created_by)      REFERENCES users(id)
93);
94
95-- M:M: a campaign targets multiple lists; a list can be targeted by multiple campaigns
96CREATE TABLE campaign_lists (
97  campaign_id BIGINT NOT NULL,
98  list_id     BIGINT NOT NULL,
99  PRIMARY KEY (campaign_id, list_id),
100  FOREIGN KEY (campaign_id) REFERENCES campaigns(id),
101  FOREIGN KEY (list_id)     REFERENCES lists(id)
102);
103
104-- One row per email sent to one subscriber
105-- This is the core delivery record: did this specific person receive this campaign?
106CREATE TABLE sent_emails (
107  id            BIGINT AUTO_INCREMENT PRIMARY KEY,
108  campaign_id   BIGINT NOT NULL,
109  subscriber_id BIGINT NOT NULL,
110  status        ENUM('QUEUED','SENT','DELIVERED','BOUNCED','FAILED') NOT NULL DEFAULT 'QUEUED',
111  message_id    VARCHAR(255) UNIQUE,
112  -- Unique ID from the email delivery provider (SendGrid, SES, Mailgun)
113  -- Used to correlate incoming webhooks (delivery receipts, bounces) back to this row
114  sent_at       DATETIME NULL,
115  delivered_at  DATETIME NULL,
116  UNIQUE (campaign_id, subscriber_id),
117  -- A subscriber receives a campaign email exactly once
118  FOREIGN KEY (campaign_id)   REFERENCES campaigns(id),
119  FOREIGN KEY (subscriber_id) REFERENCES subscribers(id)
120);
121
122-- The most write-heavy table: every open, click, bounce, unsubscribe = one row
123-- For a campaign sent to 1 million subscribers, this table could get
124-- 5-10 million rows from that single campaign if open rate is 50% and click rate is 10%
125CREATE TABLE email_events (
126  id           BIGINT AUTO_INCREMENT PRIMARY KEY,
127  sent_email_id BIGINT NOT NULL,
128  event_type   ENUM('OPEN','CLICK','BOUNCE','UNSUBSCRIBE','SPAM_COMPLAINT') NOT NULL,
129  occurred_at  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
130  ip_address   VARCHAR(45) NULL,
131  -- IPv4 (15 chars) or IPv6 (39 chars) — VARCHAR(45) covers both
132  user_agent   VARCHAR(500) NULL,
133  -- Browser/email client identifier for device analytics
134  click_url    VARCHAR(2000) NULL,
135  -- NULL for non-CLICK events. The specific URL clicked for CLICK events.
136  FOREIGN KEY (sent_email_id) REFERENCES sent_emails(id)
137  -- No UNIQUE constraint here: the same subscriber can open an email multiple times
138  -- Every open is a separate event row
139);
140
141-- Indexes for the most critical query patterns
142CREATE INDEX idx_sent_emails_campaign    ON sent_emails(campaign_id);
143-- "All sent emails for campaign X" — used to compute delivery stats
144CREATE INDEX idx_sent_emails_subscriber  ON sent_emails(subscriber_id);
145-- "All campaigns sent to subscriber X" — used for subscriber activity view
146CREATE INDEX idx_email_events_sent_email ON email_events(sent_email_id);
147-- "All events for this specific sent email" — opens and clicks per delivery
148CREATE INDEX idx_email_events_type_time  ON email_events(event_type, occurred_at);
149-- Used for time-series analytics: "how many opens per day for this campaign?"
150CREATE INDEX idx_subscribers_org_status  ON subscribers(organisation_id, global_status);
151-- "All active subscribers for this organisation" — used before every campaign send
152CREATE INDEX idx_list_subscribers_list   ON list_subscribers(list_id, status);
153-- "All subscribed members of list X" — used to build the recipient list

Key Design Decisions

Computing Campaign Analytics

Marketers need open rate, click rate, and bounce rate for every campaign. At a naive level, these require aggregating email_events on every dashboard load. For large campaigns, this is too slow. The production approach is a precomputed campaign_stats denormalized table, updated by the same background job that processes incoming webhook events.
sql
QUBITS OF DPK
1-- Example: compute open rate for campaign 101
2SELECT
3  COUNT(DISTINCT se.id) AS total_sent,
4  COUNT(DISTINCT CASE WHEN ee.event_type = 'OPEN' THEN se.id END) AS unique_opens,
5  ROUND(
6    COUNT(DISTINCT CASE WHEN ee.event_type = 'OPEN' THEN se.id END) * 100.0
7    / COUNT(DISTINCT se.id), 2
8  ) AS open_rate_pct
9FROM sent_emails se
10LEFT JOIN email_events ee ON se.id = ee.sent_email_id
11WHERE se.campaign_id = 101;
Running this live on a 1M-row email_events table for every dashboard load is impractical. A precomputed stats table is the production solution, updated as each webhook event arrives.

Indexing Strategy

Common Traps to Avoid

  • One status field on subscriber only — Subscribers need per-list status for selective unsubscribe, plus a global status for hard bounces/complaints that override all lists.
  • Storing open count as a counter columnopen_count INT on sent_emails creates race conditions under concurrent webhook events. Append-only events with COUNT queries (or precomputed stats) is safer and more flexible.
  • No sent_emails table — Without it, you cannot tell which subscribers received a campaign, and you cannot attribute engagement events to specific deliveries.
  • Forgetting message_id — Email providers identify deliveries by their own IDs in webhook payloads. Without message_id stored on sent_emails, you cannot correlate incoming webhooks to your database rows.
  • TEXT instead of LONGTEXT for HTML body — Formatted email HTML regularly exceeds 65KB. Always use LONGTEXT.

Interview Q&A

Q: How do you model the fact that a subscriber can unsubscribe from one list but stay on another?
Two-level status. list_subscribers.status tracks per-list subscription state. subscribers.global_status tracks account-level hard blocks like bounces and spam complaints. Before sending any email, check global_status first, then list status. A BOUNCED or COMPLAINED global status overrides any list-level subscription.
Q: How do you compute open rate for a campaign?
Count DISTINCT sent_email_id in email_events WHERE event_type = 'OPEN' divided by total sent_emails for the campaign. Unique opens (per subscriber) not total opens (which would count multiple opens). In production, precompute this into a campaign_stats table updated by the webhook processing job.
Q: Why does email_events have no UNIQUE constraint?
Because a subscriber can genuinely open an email multiple times on different devices or days. Each open is a distinct engagement event worth recording for analytics and behavioural scoring. A UNIQUE constraint would silently drop real events.
Q: What is message_id and why is it on sent_emails?
Email delivery providers (SendGrid, SES) assign their own internal ID to every email they send. They include this ID in all subsequent webhook notifications ("this email was opened", "this email bounced"). message_id on sent_emails is the foreign key that maps incoming provider webhooks back to the correct row in your database.
Q: How does your schema scale for a campaign sent to 10 million subscribers?
sent_emails gets 10M rows. email_events gets 30-50M rows from that one campaign. Both tables need partitioning by campaign_id or sent_at date range. Analytics queries should run against a precomputed campaign_stats table, not scan email_events live. For very high scale, email_events moves to a columnar store like ClickHouse or Redshift.

ANKI Cards

Q: In an email campaign schema, why are there two status fields: global_status on subscribers and status on list_subscribers?
A: list_subscribers.status handles selective unsubscription ("remove me from Promotions but not Newsletter"). global_status handles hard blocks that override all lists — if an email address has hard-bounced or been marked as spam, you must never send to it again regardless of list membership.
Q: What is the role of sent_emails in an email campaign schema?
A: It is the delivery record — one row per (campaign, subscriber) pair, proving that a specific subscriber received a specific campaign. It anchors all engagement events in email_events and stores the message_id needed to correlate incoming provider webhooks.
Q: Why is email_events an append-only table with no UNIQUE constraint on open events?
A: A subscriber can open an email multiple times. Each open is a real, trackable engagement event. UNIQUE would silently drop re-opens. Append-only is correct for event streams and is also the most write-efficient table design.